Dataset research report
Reddit Finance 43 250K research report
A reproducible data report with schema notes, generated chart evidence, suggested follow-up questions, and export-ready Helix queries.
Executive Summary
reddit_finance_43_250k is a collection of 250k post/comment pairs from 43 financial, investing, and crypto subreddits. Every post must be a text post with a length of 250 characters and a positive score. Each subreddit is narrowed down to the 70th quantile before being merged with its top 3 comments and then with the other subreddits. Further score-based methods are used to select the top 250k post/comment pairs. The code to recreate the dataset is here:… See the full description on the dataset page: https://huggingface.co/datasets/winddude/reddit_finance_43_250k.
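The selection steps described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's code: the field names (`selftext`, `score`, `subreddit`, `body`), the reading of "250 characters" as a minimum length, the use of the raw post score for the quantile, and the integer nearest-rank cutoff are all assumptions made for the example.

```python
from collections import defaultdict


def select_posts(posts, min_chars=250, quantile_pct=70):
    """Keep text posts that are long enough and positively scored, then keep
    each subreddit's posts at or above its own score cutoff.

    Assumptions (not confirmed by the source): "250 chars" is a minimum,
    and the quantile is taken over the post score within each subreddit.
    """
    eligible = [
        p for p in posts
        if len(p["selftext"]) >= min_chars and p["score"] > 0
    ]

    by_subreddit = defaultdict(list)
    for post in eligible:
        by_subreddit[post["subreddit"]].append(post)

    kept = []
    for items in by_subreddit.values():
        scores = sorted(p["score"] for p in items)
        # Simple rank-based cutoff using integer arithmetic.
        index = min(len(scores) - 1, (quantile_pct * len(scores)) // 100)
        cutoff = scores[index]
        kept.extend(p for p in items if p["score"] >= cutoff)
    return kept


def attach_top_comments(post, comments, k=3):
    """Pair one post with its k highest-scored comments (hypothetical helper)."""
    top = sorted(comments, key=lambda c: c["score"], reverse=True)[:k]
    return [
        {**post, "body": c["body"], "comment_score": c["score"]}
        for c in top
    ]
```

Each kept post yields up to three post/comment pairs. The later score-based selection of the final 250k pairs is not specified in the description, so it is left out of the sketch.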
Research Context
Reddit Finance 43 250K: 500 rows by 9 columns. These exploratory charts are generated automatically from the data. Open the dataset in Helix to ask your own questions.
Data Profile
Chart Evidence
These views are generated from the dataset profile. Each chart is paired with a Helix query so it can be opened, adjusted, and exported.
Total z_score by subreddit
Top subreddit values ranked by summed z_score.
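As a rough sketch of what this ranking computes, the snippet below sums `z_score` per subreddit and sorts the totals in descending order. It uses only the six preview rows from this report, so its ranking is illustrative and will not match the full chart. The helper name `total_by_group` is hypothetical, and this is not the Helix query itself.

```python
from collections import defaultdict


def total_by_group(rows, group_key="subreddit", value_key="z_score"):
    """Sum value_key per group_key and return (group, total) pairs, largest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# z_score values copied from the preview rows of this report.
preview_rows = [
    {"subreddit": "finance", "z_score": 34.13},
    {"subreddit": "AskEconomics", "z_score": 14.81},
    {"subreddit": "Economics", "z_score": 15.87},
    {"subreddit": "wallstreetbets", "z_score": 91.56},
    {"subreddit": "Wallstreetbetsnew", "z_score": 78.81},
    {"subreddit": "UKInvesting", "z_score": 27.02},
]

ranking = total_by_group(preview_rows)
for subreddit, total in ranking:
    print(f"{subreddit}: {total:.2f}")
```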
z_score vs normalized_score
Relationship between z_score and normalized_score.
Correlation of numeric columns
Pearson correlation between numeric columns.
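For reference, a minimal stdlib sketch of the Pearson coefficient between two numeric columns is shown below. It is an illustration rather than the report's own implementation. One caveat is visible in the preview rows: `normalized_score` equals 1 in all six of them, and a constant column has zero variance, so the coefficient is undefined. Returning NaN in that case is a choice made for this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns NaN when either sequence is constant (zero variance),
    because the coefficient is undefined in that case.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)
```

Calling `pearson` on the preview `z_score` values against the preview `normalized_score` values returns NaN for the reason given above, which is worth keeping in mind when reading a correlation matrix built from a small sample.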
Follow-Up Queries
Preview Rows
| # | id (text) | title (text) | selftext (text) | z_score (float) | normalized_score (float) | subreddit (text) | body (text) | comment_normalized_score (float) |
|---|---|---|---|---|---|---|---|---|
| 1 | utf5u | Where has all the money in the world gone? | Honest question. Where is all the money? I hear nothing but bad news about financial crisis all over the world, and it seems that there i… | 34.13 | 1 | finance | (relix already hit on some of this) It's hard to explain this to a five-year-old, because there are some fairly abstract concepts involved… | 1 |
| 2 | m3g13g | Is there a better sub where comments aren’t hidden 99.9% of the time? | So often someone will ask an amazing question, something I’m really interested in getting a good answer to, or even someone’s opinion, but … | 14.81 | 1 | AskEconomics | I someone were to make a good alternative then I'd be very happy about it. It can take ages moderating this sub. I'm sure lots of the other… | 1 |
| 3 | c84bp | How real-world corruption works. | This is a throwaway account (I'm a longtime redditor under another login). /r/economics might not be the correct place to put this, but it … | 15.87 | 1 | Economics | So I said I would talk about the US Military if this got any interest. Here goes: The US Department of Defense (hereafter DOD) has put in … | 1 |
| 4 | l6x130 | CLASS ACTION AGAINST ROBINHOOD. Allowing people to only sell is the definition of market manipulation. A class action must be started, Robi… | LEAVE ROBINHOOD. They dont deserve to make money off us after the millions they caused in losses. It might take a couple of days, but send … | 91.56 | 1 | wallstreetbets | Chapman Albin is an investors rights firm that my buddy works at. Just got off the phone w him. He is going to post a press release regardi… | 1 |
| 5 | l6i4t3 | Wallstreet Bets Set to Private Megathread | The moderators there have made that sub private before. That’s why this sub was created. It’ll probably open back up soon. Calm down. Edit… | 78.81 | 1 | Wallstreetbetsnew | You there. Yeah you. The person reading this comment. Calm the fuck down. Seriously. You already know what you have to do. Hold that dang l… | 1 |
| 6 | l6u6d5 | Trading212 restricts the purchase of certain stocks under guise of “mitigating risk” | EDIT: T212 ALLEGEDLY SELLING STOCKS WITHOUT USER PERMISSION [tweet](https://twitter.com/able_adam/status/1355174529665028100?s=21) [tweet]… | 27.02 | 1 | UKInvesting | - email them with a complaint - they have 8 weeks to respond - if they dont contact the Finanical Ombudsman or the FCA with evidence (scree… | 0.8563 |
Data Dictionary
- id text
- title text
- selftext text
- z_score numeric
- normalized_score numeric
- subreddit text
- body text
- comment_normalized_score numeric
- combined_score numeric
Method And Limits
- Load the catalog entry and preview rows from the processed dataset file.
- Infer numeric, categorical, time, and location fields from real columns.
- Generate a small set of defensive Plotly chart specifications from that profile.
- Expose each chart idea as a query link so the report can be rerun or exported in Helix.
This report is intentionally reproducible. It uses the local catalog metadata and generated chart specifications rather than claiming external conclusions beyond the dataset.
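The field-inference step in the method list above could look roughly like the sketch below. It is a deliberately simplified assumption, not the report generator's actual logic: it only separates numeric from text columns, while the categorical, time, and location detection mentioned in the list is omitted. The function name `infer_kind` is hypothetical.

```python
def infer_kind(values):
    """Classify a column as 'numeric', 'text', or 'unknown' from its values.

    Empty strings and None are treated as missing. A column is numeric
    only if every remaining value can be parsed as a float.
    """
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return "unknown"
    try:
        for value in present:
            float(value)
    except (TypeError, ValueError):
        return "text"
    return "numeric"


# Small columns built from values shown in the preview rows.
columns = {
    "z_score": ["34.13", "14.81", "15.87"],
    "subreddit": ["finance", "AskEconomics", "Economics"],
}
profile = {name: infer_kind(values) for name, values in columns.items()}
```

Requiring every present value to parse keeps the check conservative: a single non-numeric cell makes the column text, so no chart is built on a partly numeric field.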