
Reddit

Hugging Face

This corpus contains preprocessed posts from the Reddit dataset. It consists of 3,848,330 posts, with an average length of 270 words for the content and 28 words for the summary. Each post has seven string features: author, body, normalizedBody, content, summary, subreddit, and subreddit_id. The content field is used as the document and the summary field as its summary.
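As a rough illustration of the schema described above, the sketch below pairs each post's content (document) with its summary and computes average word counts by splitting on whitespace. The helper names `to_pair` and `average_word_counts` and the two toy records are hypothetical and are not taken from the corpus. The tokenization behind the published 270 and 28 word averages is not stated, so whitespace splitting is only an assumption.

```python
from typing import Dict, Iterable, Tuple

# The seven string features named in the dataset description.
FEATURES = (
    "author", "body", "normalizedBody", "content",
    "summary", "subreddit", "subreddit_id",
)


def to_pair(post: Dict[str, str]) -> Tuple[str, str]:
    """Return (document, summary): content is the document, summary the target."""
    return post["content"], post["summary"]


def average_word_counts(posts: Iterable[Dict[str, str]]) -> Tuple[float, float]:
    """Mean whitespace-delimited word count of the content and summary fields."""
    pairs = [to_pair(p) for p in posts]
    if not pairs:
        raise ValueError("no posts given")
    doc_words = sum(len(doc.split()) for doc, _ in pairs)
    sum_words = sum(len(summ.split()) for _, summ in pairs)
    return doc_words / len(pairs), sum_words / len(pairs)


# Hypothetical toy records for illustration only; all values are made up.
sample_posts = [
    {
        "author": "user_a",
        "body": "I adopted a cat last week and it keeps hiding TL;DR new cat hides",
        "normalizedBody": "I adopted a cat last week and it keeps hiding TL;DR new cat hides",
        "content": "I adopted a cat last week and it keeps hiding",
        "summary": "new cat hides",
        "subreddit": "example_sub",
        "subreddit_id": "t5_example1",
    },
    {
        "author": "user_b",
        "body": "My laptop fan is loud TL;DR loud fan",
        "normalizedBody": "My laptop fan is loud TL;DR loud fan",
        "content": "My laptop fan is loud",
        "summary": "loud fan",
        "subreddit": "example_sub",
        "subreddit_id": "t5_example1",
    },
]

if __name__ == "__main__":
    avg_content, avg_summary = average_word_counts(sample_posts)
    print(f"average content words: {avg_content}")
    print(f"average summary words: {avg_summary}")
```

The same two functions should work on any iterable of dictionaries that carry the content and summary fields, for example rows of the corpus once it has been downloaded from Hugging Face.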

