Openthoughts 114K
Hugging Face[!NOTE] We have released a paper for OpenThoughts! See our paper here. Open-Thoughts-114k Open synthetic reasoning dataset with 114k high-quality examples covering math, science, code, and puzzles! Inspect the content with rich formatting with Curator Viewer. Available Subsets default subset containing ready-to-train data used to finetune the OpenThinker-7B and OpenThinker-32B models: ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")… See the full description on the dataset page: https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k.
Interesting queries to try
Columns
- system categorical
- conversations mixed