Dataset research report

Spambase research report

A reproducible data report with schema notes, generated chart evidence, suggested follow-up questions, and export-ready Helix queries.

Source: Hf | File: mstz--spambase.parquet | Rows: 500

Executive Summary

Spambase: the Spambase dataset from the UCI ML repository. Is the given mail spam?

Configurations and tasks:

Configuration   Task                    Description
spambase        Binary classification   Is the mail spam?

Usage:

    from datasets import load_dataset

    dataset = load_dataset("mstz/spambase")["train"]

Finding 1: The dataset has 500 rows available in the catalog.
Finding 2: The catalog exposes 58 documented or inferred columns.
Finding 3: Helix has 5 ready query prompts for this dataset.
Finding 4: This report includes 3 generated chart views.

Research Context

Spambase: 500 rows by 58 columns. These exploratory charts are generated automatically from the data - open the dataset in Helix to ask your own questions.

Data Profile

Rows: 500
Columns: 58
Numeric cols: 58
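
The three profile figures above can be reproduced with a few lines of code. The following is a minimal sketch, not Helix's actual implementation: the `profile` function and its output keys are hypothetical names, and the two sample rows use three columns copied from the preview rows in this report.

```python
from numbers import Real


def profile(rows):
    """Summarise a list of row dicts as row, column, and numeric-column counts."""
    columns = list(rows[0]) if rows else []
    numeric = [
        c for c in columns
        # A column counts as numeric only if every value is a real number.
        # bool is excluded explicitly because it is a subclass of int.
        if all(isinstance(r.get(c), Real) and not isinstance(r.get(c), bool)
               for r in rows)
    ]
    return {"rows": len(rows), "columns": len(columns), "numeric_cols": len(numeric)}


# Two rows and three columns taken from the preview table in this report.
# The report lists 500 rows by 58 columns for the full catalog entry.
sample = [
    {"word_freq_make": 0, "word_freq_address": 0.64, "word_freq_all": 0.64},
    {"word_freq_make": 0.21, "word_freq_address": 0.28, "word_freq_all": 0.5},
]
summary = profile(sample)
print(summary)
```

Run against the full file, the same function would be expected to return the row, column, and numeric-column counts listed in this profile.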

Chart Evidence

These views are generated from the dataset profile. Each chart is paired with a Helix query so it can be opened, adjusted, and exported.

Follow-Up Queries

Preview Rows

#  word_freq_make  word_freq_address  word_freq_all  word_freq_3d  word_freq_our  word_freq_over  word_freq_remove  word_freq_internet
   float           float              float          float         float          float           float             float
1  0               0.64               0.64           0             0.32           0               0                 0
2  0.21            0.28               0.5            0             0.14           0.28            0.21              0.07
3  0.06            0                  0.71           0             1.23           0.19            0.19              0.12
4  0               0                  0              0             0.63           0               0.31              0.63
5  0               0                  0              0             0.63           0               0.31              0.63
6  0               0                  0              0             1.85           0               0                 1.85

Data Dictionary

  • word_freq_make numeric
  • word_freq_address numeric
  • word_freq_all numeric
  • word_freq_3d numeric
  • word_freq_our numeric
  • word_freq_over numeric
  • word_freq_remove numeric
  • word_freq_internet numeric
  • word_freq_order numeric
  • word_freq_mail numeric
  • word_freq_receive numeric
  • word_freq_will numeric
  • word_freq_people numeric
  • word_freq_report numeric
  • word_freq_addresses numeric
  • word_freq_free numeric
  • word_freq_business numeric
  • word_freq_email numeric
  • word_freq_you numeric
  • word_freq_credit numeric
  • word_freq_your numeric
  • word_freq_font numeric
  • word_freq_000 numeric
  • word_freq_money numeric
  • word_freq_hp numeric
  • word_freq_hpl numeric
  • word_freq_george numeric
  • word_freq_650 numeric
  • word_freq_lab numeric
  • word_freq_labs numeric
  • word_freq_telnet numeric
  • word_freq_857 numeric
  • word_freq_data numeric
  • word_freq_415 numeric
  • word_freq_85 numeric
  • word_freq_technology numeric
  • word_freq_1999 numeric
  • word_freq_parts numeric
  • word_freq_pm numeric
  • word_freq_direct numeric
  • word_freq_cs numeric
  • word_freq_meeting numeric
  • word_freq_original numeric
  • word_freq_project numeric
  • word_freq_re numeric
  • word_freq_edu numeric
  • word_freq_table numeric
  • word_freq_conference numeric
  • char_freq_; numeric
  • char_freq_( numeric
  • char_freq_[ numeric
  • char_freq_! numeric
  • char_freq_$ numeric
  • char_freq_# numeric
  • capital_run_length_average numeric
  • capital_run_length_longest numeric
  • capital_run_length_total numeric
  • is_spam numeric
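
The 58 columns in the dictionary above fall into a few families by name prefix. The sketch below groups them and counts each family. The `family` helper and the "target" label for `is_spam` are illustrative names, not part of the dataset or of Helix.

```python
from collections import Counter

PREFIXES = ("word_freq_", "char_freq_", "capital_run_length_")


def family(column):
    """Return the name prefix a column belongs to, or 'target' for is_spam."""
    for prefix in PREFIXES:
        if column.startswith(prefix):
            return prefix.rstrip("_")
    return "target" if column == "is_spam" else "other"


# Column suffixes in the order the data dictionary lists them.
WORDS = (
    "make address all 3d our over remove internet order mail receive will "
    "people report addresses free business email you credit your font 000 "
    "money hp hpl george 650 lab labs telnet 857 data 415 85 technology "
    "1999 parts pm direct cs meeting original project re edu table conference"
).split()
CHARS = [";", "(", "[", "!", "$", "#"]
RUNS = ["average", "longest", "total"]

columns = (
    [f"word_freq_{w}" for w in WORDS]
    + [f"char_freq_{c}" for c in CHARS]
    + [f"capital_run_length_{r}" for r in RUNS]
    + ["is_spam"]
)

counts = Counter(family(c) for c in columns)
print(len(columns), dict(counts))
```

Grouping by prefix gives 48 word-frequency columns, 6 character-frequency columns, 3 capital-run-length columns, and the single `is_spam` column, which accounts for all 58.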

Method And Limits

  • Load the catalog entry and preview rows from the processed dataset file.
  • Infer numeric, categorical, time, and location fields from real columns.
  • Generate a small set of defensive Plotly chart specifications from that profile.
  • Expose each chart idea as a query link so the report can be rerun or exported in Helix.

This report is intentionally reproducible. It uses the local catalog metadata and generated chart specifications rather than claiming external conclusions beyond the dataset.
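
The inference and chart-generation steps listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code Helix runs: `infer_kind`, `chart_spec`, the `LOCATION_NAMES` heuristic, and the `max_categories` cap are all hypothetical. The chart is returned as a plain dictionary in the `data`/`layout` shape that Plotly figures use, so the sketch does not need Plotly installed.

```python
from datetime import datetime
from numbers import Real

# Assumed heuristic: treat columns with these names as location fields.
LOCATION_NAMES = {"lat", "latitude", "lon", "lng", "longitude"}


def _is_iso_datetime(text):
    """Return True if the string parses as an ISO 8601 date or datetime."""
    try:
        datetime.fromisoformat(text)
        return True
    except ValueError:
        return False


def infer_kind(name, values):
    """Classify one column as numeric, time, location, categorical, or empty."""
    present = [v for v in values if v is not None]
    if not present:
        return "empty"
    if name.lower() in LOCATION_NAMES:
        return "location"
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in present):
        return "numeric"
    if all(isinstance(v, str) and _is_iso_datetime(v) for v in present):
        return "time"
    return "categorical"


def chart_spec(name, values, max_categories=20):
    """Return a Plotly-style figure dict, or None when no safe chart exists."""
    kind = infer_kind(name, values)
    present = [v for v in values if v is not None]
    if kind == "numeric":
        trace = {"type": "histogram", "x": present}
    elif kind == "categorical":
        counts = {}
        for v in present:
            counts[str(v)] = counts.get(str(v), 0) + 1
        # Cap the number of bars so a high-cardinality column stays readable.
        top = sorted(counts.items(), key=lambda kv: -kv[1])[:max_categories]
        trace = {"type": "bar", "x": [k for k, _ in top], "y": [c for _, c in top]}
    else:
        # Defensive default: this sketch skips empty, time, and location columns.
        return None
    return {"data": [trace], "layout": {"title": {"text": f"Distribution of {name}"}}}


# word_freq_our values from the six preview rows in this report.
spec = chart_spec("word_freq_our", [0.32, 0.14, 1.23, 0.63, 0.63, 1.85])
print(spec["data"][0]["type"])
```

Returning `None` instead of raising keeps a single problematic column from breaking the whole report, which is one reading of "defensive" in the step above.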
