
ML ArXiv Papers

Hugging Face

This dataset contains the subset of ArXiv papers tagged "cs.LG", the category for Machine Learning. It is filtered from the full ArXiv dataset hosted on Kaggle: https://www.kaggle.com/datasets/Cornell-University/arxiv. The original dataset contains roughly 2 million papers; after category filtering, this dataset contains roughly 100,000. The dataset is maintained with requests to the ArXiv API. The current iteration of the dataset only contains… See the full description on the dataset page: https://huggingface.co/datasets/CShorten/ML-ArXiv-Papers.
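The category filtering described above can be sketched roughly as follows. This is a minimal illustration, not the maintainer's actual pipeline: it assumes the Kaggle metadata is a JSON Lines file in which each record carries a space-separated `categories` string along with `title` and `abstract` fields. The function name `filter_cs_lg` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_cs_lg(lines: Iterable[str], tag: str = "cs.LG") -> Iterator[dict]:
    """Yield title/abstract pairs for records whose categories include `tag`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: "categories" is a space-separated string, e.g. "cs.LG stat.ML".
        if tag in record.get("categories", "").split():
            yield {
                "title": record.get("title", ""),
                "abstract": record.get("abstract", ""),
            }


# Made-up sample records standing in for lines of the metadata file.
sample = [
    json.dumps({"title": "Paper A", "abstract": "About learning.", "categories": "cs.LG stat.ML"}),
    json.dumps({"title": "Paper B", "abstract": "About physics.", "categories": "hep-ph"}),
]

papers = list(filter_cs_lg(sample))
print(len(papers), papers[0]["title"])
```

With a real metadata file, the same function could be fed an open file handle instead of the in-memory list, since it only iterates over lines.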

File: cshorten--ml-arxiv-papers.parquet (500 rows), from CShorten/ML-ArXiv-Papers


Columns

  • Unnamed: 0.1 numeric
  • Unnamed: 0 numeric
  • title text
  • abstract text
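The two "Unnamed" columns look like leftover row indices rather than paper metadata; pandas, for instance, assigns names of this form to header-less columns when reading a CSV. Assuming that is the case here, a minimal sketch of dropping them before working with the title and abstract fields might look like this. The helper name `drop_index_columns` and the sample row are hypothetical.

```python
def drop_index_columns(rows):
    """Return copies of `rows` without columns whose names start with 'Unnamed:'."""
    return [
        {name: value for name, value in row.items() if not name.startswith("Unnamed:")}
        for row in rows
    ]


# Made-up row using the column names listed above.
rows = [
    {
        "Unnamed: 0.1": 0,
        "Unnamed: 0": 0,
        "title": "Example title",
        "abstract": "Example abstract.",
    }
]

cleaned = drop_index_columns(rows)
print(sorted(cleaned[0].keys()))
```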
