Published May 31, 2026 | Version v4
Software Open

Anonymous Supplementary Artifact — EMNLP 2026 Industry Track Submission (Multilingual Indian SMS Scam Detection)

Authors/Creators

Description

Note: v4 supersedes v3 (DOI: 10.5281/zenodo.20478597). IIITD-sourced rows are now hash-only. Do not use v3 for review.

See CHANGELOG.md inside the zip for the full diff.

 

Anonymous Supplementary Artifact

This artifact accompanies the paper “Default LLM APIs Over Flag Indian Financial SMS: A Cleaned Label Benchmark and Template Level Deployment Audit,” currently under anonymous double-blind review for the EMNLP 2026 Industry Track.

Artifact contents: 76 files, 2.26 GB uncompressed.

The artifact includes:

• A cleaned multilingual Indian SMS test set of 1,393 rows, covering Hindi (Devanagari), Hinglish, and English (India) messages, with per-row prob_scam scores.

• Two trained XLM-RoBERTa-base checkpoints, each with approximately 270M parameters: one cleaned multilingual detector and one English-only control model.

• Per-sample predictions from eight evaluated systems: our cleaned XLM-R model, the English-only XLM-R control, GPT-4o, GPT-4o mini, Claude Haiku 4.5, Gemini 2.5 Flash, Llama 3.3 70B (via Together AI), and Qwen 2.5 7B (via Together AI).

• The dual-LLM-judge label cleanup pipeline, along with the full 7,234-row audit CSV.

• The frozen regex-based template taxonomy, using a 9-bucket institutional partition for per-template false positive rate (FPR) analysis.

• A three-annotator inter-annotator agreement (IAA) study on a 276-row stratified subset, with Fleiss' κ = 0.82 and Krippendorff's α = 0.82, both of which fall in the “almost perfect” agreement band (0.81 to 1.00) of Landis and Koch.

• An operating-point sweep over decision thresholds τ from 0.30 to 0.90, reporting both global and per-template false positive rates.

• The official IIITD SMSAssassin zip file from Yadav et al., HotMobile 2011, redistributed verbatim with MD5, SHA-1, and SHA-256 hash provenance.

• Training scripts, evaluation scripts, figure regeneration scripts, Dockerfile, Makefile, and pinned requirements.
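The operating-point sweep listed above can be sketched as follows. This is a minimal illustration, not the artifact's own evaluation script: only the prob_scam score is named in this description, so the is_scam label, the template field, the two template names, the toy rows, and the threshold grid spacing are all assumptions made for the example.

```python
from collections import defaultdict


def fpr_sweep(rows, thresholds):
    """Return {tau: {"global": fpr, "per_template": {template: fpr}}}.

    rows: iterable of (prob_scam, is_scam, template) tuples.
    A false positive is a legitimate message (is_scam == 0) whose
    prob_scam is at or above the threshold tau.
    """
    rows = list(rows)
    results = {}
    for tau in thresholds:
        false_pos = defaultdict(int)  # flagged legitimate messages per template
        negatives = defaultdict(int)  # all legitimate messages per template
        for prob, is_scam, template in rows:
            if is_scam:
                continue  # scam rows do not enter the FPR denominator
            negatives[template] += 1
            if prob >= tau:
                false_pos[template] += 1
        total_neg = sum(negatives.values())
        results[tau] = {
            "global": sum(false_pos.values()) / total_neg if total_neg else float("nan"),
            "per_template": {t: false_pos[t] / n for t, n in negatives.items()},
        }
    return results


# Toy rows invented for illustration; they are not taken from the artifact.
# Template names are hypothetical stand-ins for taxonomy buckets.
rows = [
    (0.95, 1, "bank_otp"),
    (0.40, 0, "bank_otp"),
    (0.20, 0, "bank_otp"),
    (0.85, 0, "telecom_recharge"),
    (0.10, 0, "telecom_recharge"),
]
thresholds = [0.30, 0.50, 0.70, 0.90]  # assumed grid; the real step size is not stated here
sweep = fpr_sweep(rows, thresholds)
for tau in thresholds:
    print(tau, sweep[tau])
```

Reporting the per-template rates next to the global rate makes it visible when a single template accounts for most false positives at a given τ, which a global figure alone would hide.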

Reproducibility

The artifact was tested end-to-end in a clean Python 3.11 virtual environment on both macOS arm64 and Linux x86-64. The Makefile targets make smoke, make eval, make iaa, and make figures reproduce all reported paper numbers exactly.
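For readers who want to sanity-check agreement figures such as the Fleiss' κ reported for the annotation study, the statistic can be computed from a table of per-item rating counts. The sketch below uses the standard Fleiss' κ formula. It is not the code behind make iaa, and the function name and toy counts are assumptions for illustration only.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every item must have the same number
    of ratings, and at least two raters are required."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall category proportions.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    # Undefined (division by zero) if all ratings fall in a single category.
    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: 4 messages, 3 annotators, 2 categories (legitimate, scam).
# These counts are invented and do not come from the 276-row study.
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy_counts)
print(round(kappa, 4))
```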

License

Code, scripts, and redistributable datasets are released under Apache 2.0. 

Author identity will be disclosed in the final non-anonymous version of the work.

Files

scamshield_emnlp_artifact_v4_2026-05-31.zip

Files (1.7 GB)

Name: scamshield_emnlp_artifact_v4_2026-05-31.zip
Size: 1.7 GB
md5:71c570c9a26f4be8e06d2da1247edda9
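After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard hashlib module. The helper name file_digest is introduced here for illustration, and the script assumes the zip sits in the current working directory under its published file name.

```python
import hashlib
import os


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so a multi-gigabyte archive is never held in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED_MD5 = "71c570c9a26f4be8e06d2da1247edda9"  # value shown on this record
ARCHIVE = "scamshield_emnlp_artifact_v4_2026-05-31.zip"  # assumed local path

if os.path.exists(ARCHIVE):
    actual = file_digest(ARCHIVE)
    print("MD5 matches" if actual == EXPECTED_MD5 else f"MD5 mismatch: {actual}")
else:
    print(f"{ARCHIVE} not found; download it first")
```

Passing algorithm="sha1" or algorithm="sha256" to the same helper yields the other digest types mentioned for the redistributed SMSAssassin file.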

Additional details

Related works

Is new version of
Software: 10.5281/zenodo.20478597 (DOI)