Wikitext

Name: Wikitext
Creator: Helix
Keywords: annotations_creators:no-annotation, dataset, hugging face, nlp, task_categories:fill-mask, task_categories:text-generation, task_ids:language-modeling, task_ids:masked-language-modeling

Hugging Face

Dataset Card for "wikitext"

Dataset Summary

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License. Compared to the preprocessed version of Penn Treebank (PTB), WikiText-2 is over 2 times larger and WikiText-103 is over 110 times larger. The WikiText dataset also features a far larger…

See the full description on the dataset page: https://huggingface.co/datasets/Salesforce/wikitext.

salesforce--wikitext--wikitext-103-raw-v1.parquet · 500 rows · Salesforce/wikitext

Data preview

500 rows · 1 column · showing first 12

#	text
1
2	= Valkyria Chronicles III =
3
4	Senjō no Valkyria 3 : Unrecorded Chronicles ( Japanese : 戦場のヴァルキュリア3 , lit . Valkyria of the Battlefield 3 ) , commonly referred to as Val…
5	The game began development in 2010 , carrying over a large portion of the work done on Valkyria Chronicles II . While it retained the stan…
6	It met with positive sales in Japan , and was praised by both Japanese and western critics . After release , it received downloadable cont…
7
8	= = Gameplay = =
9
10	As with previous Valkyira Chronicles games , Valkyria Chronicles III is a tactical role @-@ playing game where players take control of a m…
11	The game 's battle system , the BliTZ system , is carried over directly from Valkyira Chronicles . During missions , players select each u…
12	Troops are divided into five classes : Scouts , Shocktroopers , Engineers , Lancers and Armored Soldier . Troopers can switch classes by c…
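
The preview rows show the raw WikiText layout: section headings are wrapped in space-separated equals signs ("= Valkyria Chronicles III =", "= = Gameplay = ="), and hyphenated words are stored as "role @-@ playing". The following is a minimal sketch, assuming these two conventions hold across the rest of the file; the helper names `parse_heading` and `detokenize_hyphens` are hypothetical and are not part of the dataset or of any library.

```python
import re

# Heading lines look like "= Title =" (level 1) or "= = Gameplay = =" (level 2):
# the same number of space-separated "=" signs on each side of the title.
HEADING_RE = re.compile(r"^((?:= )+)(.+?)((?: =)+)$")


def parse_heading(line):
    """Return (level, title) for a heading line, or None for any other line."""
    match = HEADING_RE.match(line.strip())
    if match is None:
        return None
    opening, title, closing = match.groups()
    level = opening.count("=")
    if closing.count("=") != level:
        return None  # unbalanced markers: treat the line as ordinary text
    return level, title.strip()


def detokenize_hyphens(text):
    """Rejoin hyphenated words stored as 'role @-@ playing'."""
    return text.replace(" @-@ ", "-")


# Rows taken from the preview above (empty and truncated rows left out).
rows = [
    "= Valkyria Chronicles III =",
    "= = Gameplay = =",
    "It met with positive sales in Japan ,",
]
for row in rows:
    print(parse_heading(row))
```

Body rows return `None`, so the same function can be used to split the text column into sections before any further processing.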

Columns

text (type: text)
