Ai Arxiv Chunked

Name: Ai Arxiv Chunked
Creator: Helix
Keywords: dataset, format:json, hugging face, library:datasets, library:pandas, modality:text, research, size_categories:10K<n<100K

Hugging Face

Hugging Face dataset: jamescalam/ai-arxiv-chunked

File: jamescalam--ai-arxiv-chunked.parquet (500 rows), from jamescalam/ai-arxiv-chunked

Data preview

500 rows · 15 columns · showing the first 12 rows (the journal_ref column is not shown in this preview)

#	doi text	chunk-id text	chunk text	id text	title text	summary text	source text	authors text	categories text	comment text	primary_category text	published text	updated text	references text
1	1910.01108	0	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter Victor SANH, Lysandre DEBUT, Julien CHAUMOND, Thomas WOLF Hug…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
2	1910.01108	1	loss combining language modeling, distillation and cosine-distance losses. Our smaller, faster and lighter model is cheaper to pre-train an…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
3	1910.01108	2	in real-time has the potential to enable novel and interesting language processing applications, the growing computational and memory requi…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
4	1910.01108	3	through distillation via the supervision of a bigger Transformer language model can achieve similar performance on a variety of downstream …	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
5	1910.01108	4	generalization capabilities of the model and how well it will perform on the test set3. Training loss The student is trained with a distill…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
6	1910.01108	5	and teacher hidden states vectors. 3 DistilBERT: a distilled version of BERT Student architecture In the present work, the student - Distil…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
7	1910.01108	6	3E.g. BERT-base’s predictions for a masked token in " I think this is the beginning of a beautiful [MASK] " comprise two high probability t…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
8	1910.01108	7	performance on downstream tasks. Comparison on downstream tasks: IMDb (test accuracy) and SQuAD 1.1 (EM/F1 on dev set). D: with a second st…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
9	1910.01108	8	examples per batch) using dynamic masking and without the next sentence prediction objective. Data and compute power We train DistilBERT on…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
10	1910.01108	9	et al. [2018]) encoder followed by two BiLSTMs.4 The results on each of the 9 tasks are showed on Table 1 along with the macro-score (avera…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
11	1910.01108	10	We also studied whether we could add another step of distillation during the adaptation phase by ﬁne-tuning DistilBERT on SQuAD using a BER…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
12	1910.01108	11	Size and inference speed To further investigate the speed-up/size trade-off of DistilBERT, we compare (in Table 3) the number of parameters…	1910.01108	DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter	As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large…	http://arxiv.org/pdf/1910.01108	['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']	['cs.CL']	February 2020 - Revision: fix bug in evaluation metrics, updated metrics, argumentation unchanged. 5 pages, 1 figure, 4 tables. Accepted …	cs.CL	20191002	20200301	[{'id': '1910.01108'}]
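The preview above renders `authors` as a bracketed run of single-quoted names with no commas, and `published` / `updated` as eight-digit `YYYYMMDD` strings. A minimal sketch of turning those cells into a list and into dates follows. It assumes the values arrive as plain strings exactly as displayed here (in the Parquet file itself they may already be typed lists), and that no name contains an apostrophe. The helper names `parse_authors` and `parse_yyyymmdd` are illustrative, not part of any library.

```python
import re
from datetime import date, datetime


def parse_authors(cell: str) -> list[str]:
    """Pull each single-quoted name out of a cell like "['A B' 'C D']".

    Assumption: names never contain an apostrophe.
    """
    return re.findall(r"'([^']*)'", cell)


def parse_yyyymmdd(cell: str) -> date:
    """Convert an eight-digit string such as '20191002' to a date."""
    return datetime.strptime(cell, "%Y%m%d").date()


# Values copied from the first preview row.
row = {
    "authors": "['Victor Sanh' 'Lysandre Debut' 'Julien Chaumond' 'Thomas Wolf']",
    "published": "20191002",
    "updated": "20200301",
}

authors = parse_authors(row["authors"])
published = parse_yyyymmdd(row["published"])
updated = parse_yyyymmdd(row["updated"])
days_between = (updated - published).days  # gap between first version and last update

print(authors)
print(published.isoformat(), updated.isoformat(), days_between)
```

Parsing the dates up front makes it possible to sort or filter records by `published` or `updated` as dates rather than as text.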

Auto-generated charts

Ai Arxiv Chunked: 500 rows by 15 columns. These exploratory charts are generated automatically from the data.

Rows: 500
Columns: 15
Categorical columns: 9
Date range: 2015-06-22 → 2022-05-19

Chart: doi by record count (most common doi values across records).

Columns

doi categorical
chunk-id text
chunk text
id categorical
title categorical
summary categorical
source categorical
authors mixed
categories mixed
comment categorical
journal_ref categorical
primary_category categorical
published categorical
updated datetime
references mixed
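Because `chunk-id` is listed as text, sorting it directly puts "10" before "2". The sketch below groups rows by `doi` and orders each paper's chunks by the integer value of `chunk-id`. The function name `group_chunks`, the sample rows, and the `example-doi` key are hypothetical placeholders, not data from the dataset. Whether adjacent chunks overlap is not stated on this page, so the sketch keeps them as separate list items rather than joining them.

```python
from collections import defaultdict


def group_chunks(rows):
    """Return {doi: [chunk, ...]} with chunks ordered by numeric chunk-id."""
    by_doi = defaultdict(list)
    for row in rows:
        # chunk-id is stored as text, so cast before sorting.
        by_doi[row["doi"]].append((int(row["chunk-id"]), row["chunk"]))
    return {
        doi: [text for _, text in sorted(pairs, key=lambda pair: pair[0])]
        for doi, pairs in by_doi.items()
    }


# Hypothetical rows in shuffled order; the chunk texts are placeholders.
rows = [
    {"doi": "1910.01108", "chunk-id": "10", "chunk": "text 10"},
    {"doi": "example-doi", "chunk-id": "0", "chunk": "other 0"},
    {"doi": "1910.01108", "chunk-id": "2", "chunk": "text 2"},
    {"doi": "1910.01108", "chunk-id": "0", "chunk": "text 0"},
]

grouped = group_chunks(rows)
print(grouped["1910.01108"])
```

Sorting on the integer rather than the string is the only design choice that matters here; everything else is ordinary grouping.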