Dataset research report
NBFI research report
A reproducible data report with schema notes, generated chart evidence, suggested follow-up questions, and export-ready Helix queries.
Executive Summary
NBFI: the NBFI dataset from Kaggle, for client default prediction.

| Configuration | Task | Description |
|---|---|---|
| default | Binary classification | Has the client defaulted? |

Usage: `from datasets import load_dataset`, then `dataset = load_dataset("mstz/nbfi")["train"]`.

Features (list truncated in the source):

| Feature | Type |
|---|---|
| income | float32 |
| owns_a_car | bool |
| owns_a_bike | bool |
| has_an_active_loan | bool |
| owns_a_house | bool |
| nr_children | int8 |
| credit | float32 |
| loan_annuity | float32 |
| … | … |

See the full description on the dataset page: https://huggingface.co/datasets/mstz/nbfi.
Research Context
NBFI: 500 rows by 36 columns. The exploratory charts below are generated automatically from the data; open the dataset in Helix to ask your own questions.
Data Profile
Chart Evidence
These views are generated from the dataset profile. Each chart is paired with a Helix query so it can be opened, adjusted, and exported.
Total income by owns_a_car
Top owns_a_car values ranked by summed income.
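As a minimal sketch of the aggregation this chart describes, the snippet below sums income per owns_a_car value over the six preview rows listed in this report and ranks the groups by total. It uses only the Python standard library. The helper name `total_by_group` is hypothetical, and this is an illustration of the idea rather than the query Helix actually runs.

```python
from collections import defaultdict

# (income, owns_a_car) pairs copied from the six preview rows in this report.
PREVIEW = [
    (15750, False),
    (27000, True),
    (11250, False),
    (27000, False),
    (15750, False),
    (27000, True),
]


def total_by_group(pairs):
    """Sum the value of each (value, group) pair per group, largest total first."""
    totals = defaultdict(float)
    for value, group in pairs:
        totals[group] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranked = total_by_group(PREVIEW)
```

On the preview rows alone, the non-owners come first because four of the six rows fall in that group. The full 500-row dataset may rank differently.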
income vs nr_children
income vs nr_children, coloured by owns_a_car.
Correlation of numeric columns
Pearson correlation between numeric columns.
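For reference, a small standard-library sketch of the Pearson coefficient that such a correlation view is built from. The function name `pearson` and the choice of columns are illustrative assumptions. The inputs are the income and nr_children values from the six preview rows, so the result says nothing reliable about the full dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is constant
    return cov / math.sqrt(var_x * var_y)


# income and nr_children from the six preview rows in this report.
income = [15750, 27000, 11250, 27000, 15750, 27000]
nr_children = [1, 0, 0, 0, 3, 3]
r = pearson(income, nr_children)
```

Returning NaN for a constant column, instead of raising, keeps a full correlation matrix computable when one column has no variance.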
Follow-Up Queries
Preview Rows
| # | income (float) | owns_a_car (boolean) | owns_a_bike (boolean) | has_an_active_loan (boolean) | owns_a_house (boolean) | nr_children (integer) | credit (float) | loan_annuity (float) |
|---|---|---|---|---|---|---|---|---|
| 1 | 15750 | False | False | False | False | 1 | 1.5e+05 | 4398 |
| 2 | 27000 | True | False | False | True | 0 | 28440 | 1913 |
| 3 | 11250 | False | True | True | False | 0 | 36000 | 2066 |
| 4 | 27000 | False | False | True | True | 0 | 7.602e+04 | 3234 |
| 5 | 15750 | False | False | True | True | 3 | 128835 | 3780 |
| 6 | 27000 | True | False | False | True | 3 | 5.337e+04 | 4003 |
Data Dictionary
- income numeric
- owns_a_car bool
- owns_a_bike bool
- has_an_active_loan bool
- owns_a_house bool
- nr_children numeric
- credit numeric
- loan_annuity numeric
- accompanied_by categorical
- income_type categorical
- education_level numeric
- marital_status categorical
- is_male bool
- type_of_contract categorical
- type_of_housing categorical
- residence_density numeric
- age_in_days datetime
- consecutive_days_of_employment datetime
- nr_days_since_last_registration_change datetime
- nr_days_since_last_document_change datetime
- has_provided_a_mobile_number bool
- has_provided_a_home_number bool
- was_reachable_at_work bool
- job text
- nr_family_members numeric
- city_rating numeric
- weekday_of_application datetime
- hour_of_application numeric
- same_residence_and_home bool
- same_work_and_home bool
- score_1 numeric
- score_2 numeric
- score_3 numeric
- nr_defaults_in_social_circle bool
- inquiries_in_last_year datetime
- has_defaulted numeric
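Several entries in the data dictionary above carry an inferred type that sits oddly with the column name, for example age_in_days listed as datetime. The sketch below flags such entries with a simple naming heuristic. The heuristic (an `nr_` prefix or a `days` token suggests a count) and the helper names are assumptions made for illustration. A flag is only a prompt to check the column in the source data, not proof that the inferred type is wrong.

```python
# Inferred types copied from the data dictionary (only the columns relevant here).
INFERRED = {
    "nr_children": "numeric",
    "age_in_days": "datetime",
    "consecutive_days_of_employment": "datetime",
    "nr_days_since_last_registration_change": "datetime",
    "nr_days_since_last_document_change": "datetime",
    "nr_family_members": "numeric",
    "nr_defaults_in_social_circle": "bool",
    "has_defaulted": "numeric",
}


def looks_like_count(name):
    """Hypothetical heuristic: an 'nr_' prefix or a 'days' token suggests a count."""
    return name.startswith("nr_") or "days" in name.split("_")


def suspicious(inferred):
    """Names that look like counts but were not inferred as numeric."""
    return sorted(
        name
        for name, kind in inferred.items()
        if looks_like_count(name) and kind != "numeric"
    )


flagged = suspicious(INFERRED)
```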
Method And Limits
- Load the catalog entry and preview rows from the processed dataset file.
- Infer numeric, categorical, time, and location fields from real columns.
- Generate a small set of defensive Plotly chart specifications from that profile.
- Expose each chart idea as a query link so the report can be rerun or exported in Helix.
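The steps above can be sketched roughly as follows. This is a simplified, standard-library illustration, not the pipeline that produced this report. The function names `infer_kind` and `bar_spec` are hypothetical, the type inference covers only three kinds, and "defensive" is read here as "return no chart when the columns do not fit". The returned dictionary follows the general shape of a Plotly figure (`data` traces plus `layout`).

```python
def infer_kind(values):
    """Classify a column from its values: 'bool', 'numeric', or 'categorical'."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    if all(isinstance(v, bool) for v in present):
        return "bool"
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "numeric"
    return "categorical"


def bar_spec(rows, category, value):
    """Return a Plotly-style figure dict, or None when the columns do not fit."""
    if not rows or any(category not in r or value not in r for r in rows):
        return None
    if infer_kind([r[value] for r in rows]) != "numeric":
        return None
    totals = {}
    for r in rows:
        key = str(r[category])
        totals[key] = totals.get(key, 0) + r[value]
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "data": [
            {
                "type": "bar",
                "x": [k for k, _ in ordered],
                "y": [v for _, v in ordered],
            }
        ],
        "layout": {"title": {"text": f"Total {value} by {category}"}},
    }


# First four preview rows, reduced to the two columns the chart needs.
rows = [
    {"income": 15750, "owns_a_car": False},
    {"income": 27000, "owns_a_car": True},
    {"income": 11250, "owns_a_car": False},
    {"income": 27000, "owns_a_car": False},
]
spec = bar_spec(rows, "owns_a_car", "income")
```

Returning `None` instead of raising lets a report generator skip a chart idea that does not apply to a given dataset and carry on with the rest.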
This report is intentionally reproducible. It uses the local catalog metadata and generated chart specifications rather than claiming external conclusions beyond the dataset.