arXiv Papers by Subject
Hugging Face

A reorganised version of the nick007x/arxiv-papers dataset, partitioned by subject code, year, and month for efficient selective access.

Dataset Description

This dataset contains metadata for over 2.5 million arXiv papers, organised into a hierarchical directory structure that lets users download only the subjects and time periods they need, rather than the entire dataset.

Motivation

The original nick007x/arxiv-papers… See the full description on the dataset page: https://huggingface.co/datasets/permutans/arxiv-papers-by-subject.
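The partitioning by subject code, year, and month suggests that a subset can be fetched with glob patterns. The sketch below is only an illustration: the `<subject>/<year>/<month>/` path layout, the zero-padded month, and the helper names `build_patterns` and `download_subset` are assumptions, not taken from the dataset page, so check the actual directory names there before relying on it. The download step uses `snapshot_download` from the `huggingface_hub` package and is defined but not called here.

```python
from itertools import product

# Repository id taken from the dataset page URL.
REPO_ID = "permutans/arxiv-papers-by-subject"


def build_patterns(subjects, years, months):
    """Build glob patterns for an ASSUMED <subject>/<year>/<month>/ layout."""
    return [
        f"{subject}/{year}/{month:02d}/*"
        for subject, year, month in product(subjects, years, months)
    ]


def download_subset(patterns, local_dir="arxiv_subset"):
    """Download only the files matching the patterns (needs huggingface_hub)."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=patterns,
        local_dir=local_dir,
    )


# Hypothetical selection: one subject code, two months of one year.
patterns = build_patterns(["cs.AI"], [2023], [1, 2])
print(patterns)
```

Calling `download_subset(patterns)` would then fetch only the matching files, provided the assumed layout matches the real one.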
Columns
- arxiv_id text
- title text
- authors mixed
- submission_date datetime
- comments text
- primary_subject categorical
- subjects categorical
- doi text
- abstract text
- file_path text
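As a rough illustration of how the columns above might be queried once loaded, here is a minimal sketch using only the Python standard library. The rows are synthetic placeholders, not real records; the representation of `authors` as a list of strings is an assumption (the column is listed as mixed), and the remaining columns are omitted for brevity. The helper names `filter_papers` and `count_by_subject` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Synthetic placeholder rows using a subset of the listed columns.
papers = [
    {
        "arxiv_id": "0000.00001",
        "title": "Example paper A",
        "authors": ["A. Author"],  # assumed shape; the column type is "mixed"
        "submission_date": datetime(2023, 1, 15),
        "primary_subject": "cs.AI",
    },
    {
        "arxiv_id": "0000.00002",
        "title": "Example paper B",
        "authors": ["B. Author", "C. Author"],
        "submission_date": datetime(2022, 6, 3),
        "primary_subject": "cs.AI",
    },
    {
        "arxiv_id": "0000.00003",
        "title": "Example paper C",
        "authors": ["D. Author"],
        "submission_date": datetime(2023, 3, 9),
        "primary_subject": "math.CO",
    },
]


def filter_papers(rows, subject, year):
    """Keep rows whose primary_subject and submission year both match."""
    return [
        row
        for row in rows
        if row["primary_subject"] == subject
        and row["submission_date"].year == year
    ]


def count_by_subject(rows):
    """Count rows per primary_subject."""
    return Counter(row["primary_subject"] for row in rows)


matches = filter_papers(papers, "cs.AI", 2023)
print([row["arxiv_id"] for row in matches])
print(count_by_subject(papers))
```

The same two operations map directly onto a dataframe or SQL query if the metadata is loaded with another tool.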