Reddit Finance 43 250K

Hugging Face

reddit_finance_43_250k is a collection of 250k post/comment pairs from 43 financial, investing, and crypto subreddits. All posts are text posts with a length of 250 characters and a positive score. Each subreddit is narrowed down to the 70th quantile before being merged with its top 3 comments and then with the other subreddits. Further score-based methods are used to select the top 250k post/comment pairs. The code to recreate the dataset is here:… See the full description on the dataset page: https://huggingface.co/datasets/winddude/reddit_finance_43_250k.
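The selection steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dataset author's code: it assumes "a length of 250 characters" means a minimum length, that "narrowed down to the 70th quantile" means keeping posts at or above the 70th percentile of score within each subreddit, and that posts and comments arrive as plain dictionaries with hypothetical `score` and `post_id` fields. The final score-based selection of the top 250k pairs is not covered.

```python
from collections import defaultdict

MIN_CHARS = 250    # assumption: treated as a minimum post length
QUANTILE = 0.70    # assumption: keep posts at or above this per-subreddit quantile
TOP_COMMENTS = 3   # top comments kept per post, as stated in the description


def quantile_cutoff(scores, q):
    """Return the score at index floor(q * (n - 1)) of the sorted scores."""
    ordered = sorted(scores)
    return ordered[int(q * (len(ordered) - 1))]


def build_pairs(posts, comments):
    """Filter posts, apply the per-subreddit cutoff, and pair with top comments."""
    # 1. Keep posts that are long enough and positively scored.
    kept = [p for p in posts if len(p["selftext"]) >= MIN_CHARS and p["score"] > 0]

    # 2. Within each subreddit, keep posts at or above the quantile cutoff.
    by_sub = defaultdict(list)
    for p in kept:
        by_sub[p["subreddit"]].append(p)
    selected = []
    for sub_posts in by_sub.values():
        cutoff = quantile_cutoff([p["score"] for p in sub_posts], QUANTILE)
        selected.extend(p for p in sub_posts if p["score"] >= cutoff)

    # 3. Pair each selected post with its highest-scoring comments.
    by_post = defaultdict(list)
    for c in comments:
        by_post[c["post_id"]].append(c)
    pairs = []
    for p in selected:
        ranked = sorted(by_post[p["id"]], key=lambda c: c["score"], reverse=True)
        pairs.extend((p, c) for c in ranked[:TOP_COMMENTS])
    return pairs
```

Each selected post yields up to three (post, comment) pairs, and posts with no comments yield none.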

winddude--reddit-finance-43-250k.parquet · 500 rows · winddude/reddit_finance_43_250k

Columns

  • id text
  • title text
  • selftext text
  • z_score numeric
  • normalized_score numeric
  • subreddit text
  • body text
  • comment_normalized_score numeric
  • combined_score numeric
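As one example of a query over these columns, the sketch below picks the highest-ranked pairs for a single subreddit by `combined_score`. It assumes each row is available as a dictionary keyed by the column names listed above; the `top_pairs` helper and the sample rows are hypothetical and made up for illustration, not taken from the dataset.

```python
def top_pairs(rows, subreddit, n=3):
    """Return the n rows from one subreddit with the highest combined_score."""
    in_sub = [r for r in rows if r["subreddit"] == subreddit]
    return sorted(in_sub, key=lambda r: r["combined_score"], reverse=True)[:n]


# Made-up rows for illustration only; values are not from the dataset.
sample_rows = [
    {"id": "a1", "subreddit": "investing", "title": "Post A", "combined_score": 0.4},
    {"id": "a2", "subreddit": "investing", "title": "Post B", "combined_score": 0.9},
    {"id": "b1", "subreddit": "stocks", "title": "Post C", "combined_score": 0.7},
]

best = top_pairs(sample_rows, "investing", n=1)
```

The same pattern applies to the other numeric columns, for example ranking by `comment_normalized_score` instead.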
