ML ArXiv Papers

Name: ML ArXiv Papers
Creator: Helix
Keywords: dataset, format:csv, hugging face, license:afl-3.0, modality:tabular, modality:text, research, size_categories:100K<n<1M

Hugging Face

This dataset contains the subset of ArXiv papers tagged "cs.LG", the category that marks a paper as being about Machine Learning. It is filtered from the full ArXiv dataset hosted on Kaggle: https://www.kaggle.com/datasets/Cornell-University/arxiv. The original dataset contains roughly 2 million papers; after the category filtering, this dataset contains roughly 100,000 papers. The dataset is maintained with requests to the ArXiv API. The current iteration of the dataset only contains… See the full description on the dataset page: https://huggingface.co/datasets/CShorten/ML-ArXiv-Papers.
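The category filtering described above can be sketched in a few lines of Python. This is only an illustration, not the maintainer's actual pipeline: it assumes the source metadata is a JSON Lines file in which each record has `title`, `abstract`, and a space-separated `categories` string, and the function name `filter_by_category` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_by_category(lines: Iterable[str], tag: str = "cs.LG") -> Iterator[dict]:
    """Yield title/abstract records whose category list includes `tag`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Assumption: categories are stored as one space-separated string.
        categories = record.get("categories", "").split()
        if tag in categories:
            yield {
                "title": record.get("title", ""),
                "abstract": record.get("abstract", ""),
            }


# Hypothetical sample records standing in for lines of the metadata file.
sample = [
    json.dumps({"title": "A", "abstract": "a", "categories": "cs.LG stat.ML"}),
    json.dumps({"title": "B", "abstract": "b", "categories": "math.ST"}),
    json.dumps({"title": "C", "abstract": "c", "categories": "cs.CV cs.LG"}),
]
papers = list(filter_by_category(sample))
print([p["title"] for p in papers])  # ['A', 'C']
```

Matching on the split category list rather than on a substring avoids accidentally keeping a paper whose category merely contains the tag as a prefix. To run this against a real file, an open file handle can be passed in place of `sample`, since the function only needs an iterable of lines.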

cshorten--ml-arxiv-papers.parquet 500 rows CShorten/ML-ArXiv-Papers

Data preview

500 rows · 3 columns · showing first 12

#	column_1 (float)	title (text)	abstract (text)
1	0	Learning from compressed observations	The problem of statistical learning is to construct a predictor of a random variable $Y$ as a function of a related random variable $X$ o…
2	1	Sensor Networks with Random Links: Topology Design for Distributed Consensus	In a sensor network, in practice, the communication among sensors is subject to:(1) errors or failures at random times; (3) costs; and(2)…
3	2	The on-line shortest path problem under partial monitoring	The on-line shortest path problem is considered under various models of partial monitoring. Given a weighted directed acyclic graph whose…
4	3	A neural network approach to ordinal regression	Ordinal regression is an important type of learning, which has properties of both classification and regression. Here we describe a simpl…
5	4	Parametric Learning and Monte Carlo Optimization	This paper uncovers and explores the close relationship between Monte Carlo Optimization of a parametrized integral (MCO), Parametric mac…
6	5	Preconditioned Temporal Difference Learning	This paper has been withdrawn by the author. This draft is withdrawn for its poor quality in english, unfortunately produced by the autho…
7	6	A Note on the Inapproximability of Correlation Clustering	We consider inapproximability of the correlation clustering problem defined as follows: Given a graph $G = (V,E)$ where each edge is labe…
8	7	Joint universal lossy coding and identification of stationary mixing sources	The problem of joint universal source coding and modeling, treated in the context of lossless codes by Rissanen, was recently generalized…
9	8	Supervised Feature Selection via Dependence Estimation	We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence…
10	9	Equivalence of LP Relaxation and Max-Product for Weighted Matching in General Graphs	Max-product belief propagation is a local, iterative algorithm to find the mode/MAP estimate of a probability distribution. While it has …
11	10	HMM Speaker Identification Using Linear and Non-linear Merging Techniques	Speaker identification is a powerful, non-invasive and in-expensive biometric technique. The recognition accuracy, however, deteriorates …
12	11	Statistical Mechanics of Nonlinear On-line Learning for Ensemble Teachers	We analyze the generalization performance of a student in a model composed of nonlinear perceptrons: a true teacher, ensemble teachers, a…

Auto-generated charts

ML ArXiv Papers: 500 rows by 3 columns. These exploratory charts are generated automatically from the data; open the dataset in Helix to ask your own questions.

Rows: 500
Columns: 3
Numeric cols: 1

Charts

Distribution of column_1

Histogram of column_1 values.

Columns

Unnamed: 0.1 (numeric)
Unnamed: 0 (numeric)
title (text)
abstract (text)