SMS Spam research report
A reproducible data report with schema notes, generated chart evidence, suggested follow-up questions, and export-ready Helix queries.
Executive Summary
The SMS Spam Collection v.1 is a public set of labeled SMS messages collected for mobile phone spam research. It consists of a single collection of 5,574 real, non-encoded English messages, each tagged as either legitimate (ham) or spam. See the full description on the dataset page: https://huggingface.co/datasets/ucirvine/sms_spam.
Research Context
SMS Spam: 500 rows by 2 columns in the profiled file (the full collection has 5,574 messages). These exploratory charts are generated automatically from the data; open the dataset in Helix to ask your own questions.
Data Profile
Chart Evidence
These views are generated from the dataset profile. Each chart is paired with a Helix query so it can be opened, adjusted, and exported.
Follow-Up Queries
Preview Rows
| # | sms (text) | label (integer) |
|---|---|---|
| 1 | Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat... | 0 |
| 2 | Ok lar... Joking wif u oni... | 0 |
| 3 | Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 084… | 1 |
| 4 | U dun say so early hor... U c already then say... | 0 |
| 5 | Nah I don't think he goes to usf, he lives around here though | 0 |
| 6 | FreeMsg Hey there darling it's been 3 week's now and no word back! I'd like some fun you up for it still? Tb ok! XxX std chgs to send, £1.5… | 1 |
Data Dictionary
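- sms (text): the body of the SMS message
- label (numeric): the class label; in the preview rows, 0 marks legitimate (ham) messages and 1 marks spam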
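As a rough illustration of how the numeric label column might be read, the sketch below maps the label values from the six preview rows to class names and counts them. The mapping 0 = ham, 1 = spam is an assumption inferred from the preview rows, and `LABEL_NAMES` and `label_counts` are hypothetical names, not part of Helix or the dataset.

```python
from collections import Counter

# Assumed encoding, inferred from the preview rows: 0 = ham, 1 = spam.
LABEL_NAMES = {0: "ham", 1: "spam"}


def label_counts(labels):
    """Count how many rows fall under each class name."""
    return dict(Counter(LABEL_NAMES[value] for value in labels))


# Label values of the six preview rows, in order.
preview_labels = [0, 0, 1, 0, 0, 1]
print(label_counts(preview_labels))
```

Six rows are far too few to estimate the class balance; the same function could be run over the full label column for that.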
Method And Limits
- Load the catalog entry and preview rows from the processed dataset file.
- Infer numeric, categorical, time, and location fields from real columns.
- Generate a small set of defensive Plotly chart specifications from that profile.
- Expose each chart idea as a query link so the report can be rerun or exported in Helix.
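The profiling and chart steps above might look roughly like the following minimal sketch. The inference rules, the function names `infer_kind` and `bar_spec`, and the sample rows are all assumptions made for illustration; the report does not describe the rules Helix actually applies. The chart is built as a plain dictionary in Plotly's figure JSON layout (`data` and `layout` keys), so no Plotly import is needed to produce it.

```python
from collections import Counter


def infer_kind(values):
    """Classify a column as 'numeric', 'categorical', or 'text' (hypothetical rules)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "empty"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "numeric"
    # Few distinct values relative to the row count suggests a category.
    if len(set(non_null)) <= len(non_null) // 2:
        return "categorical"
    return "text"


def bar_spec(column, values):
    """Build a bar-chart spec of value counts, or None if there is nothing to plot."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None  # defensive: skip the chart instead of emitting an empty one
    counts = Counter(non_null)
    keys = sorted(counts)
    return {
        "data": [{"type": "bar",
                  "x": [str(k) for k in keys],
                  "y": [counts[k] for k in keys]}],
        "layout": {"title": {"text": f"Count by {column}"}},
    }


# Made-up sample rows shaped like the two columns in this report.
columns = {"sms": ["hi", "free prize", "ok", "call me"], "label": [0, 1, 0, 0]}
kinds = {name: infer_kind(vals) for name, vals in columns.items()}
spec = bar_spec("label", columns["label"])
```

Returning `None` for an empty column is one way to keep chart generation defensive: the caller can drop that chart rather than render an empty figure.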
This report is intentionally reproducible. It uses the local catalog metadata and generated chart specifications rather than claiming external conclusions beyond the dataset.