ML ArXiv Papers

Hugging Face

This dataset contains the subset of ArXiv papers with the "cs.LG" tag, which indicates that the paper is about Machine Learning. The core dataset is filtered from the full ArXiv dataset hosted on Kaggle: https://www.kaggle.com/datasets/Cornell-University/arxiv. The original dataset contains roughly 2 million papers; after category filtering, this dataset contains roughly 100,000 papers. The dataset is maintained with requests to the ArXiv API. The current iteration of the dataset only contains… See the full description on the dataset page: https://huggingface.co/datasets/CShorten/ML-ArXiv-Papers.
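The category filtering described above can be sketched roughly as follows. This is a minimal illustration, not the maintainer's actual pipeline: it assumes the Kaggle metadata is available as JSON lines and that each record carries a space-separated `categories` string alongside `title` and `abstract` fields. The function name `filter_by_category` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_by_category(lines: Iterable[str], tag: str = "cs.LG") -> Iterator[dict]:
    """Yield title/abstract pairs for records whose categories include `tag`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: `categories` is a space-separated string, e.g. "cs.LG stat.ML".
        # Splitting avoids false matches on longer tags that merely contain `tag`.
        if tag in record.get("categories", "").split():
            yield {"title": record["title"], "abstract": record["abstract"]}


# Hypothetical sample records standing in for lines of the metadata file.
sample = [
    json.dumps({"title": "A", "abstract": "a", "categories": "cs.LG stat.ML"}),
    json.dumps({"title": "B", "abstract": "b", "categories": "math.CO"}),
    json.dumps({"title": "C", "abstract": "c", "categories": "cs.LGX"}),
]
filtered = list(filter_by_category(sample))
```

Matching on whole tokens rather than substrings keeps a made-up tag such as "cs.LGX" from slipping through; with a real metadata file, the same function could be passed an open file handle instead of the sample list.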

cshorten--ml-arxiv-papers.parquet · 500 rows · CShorten/ML-ArXiv-Papers

Data preview

500 rows · 3 columns · showing first 12
# | column_1 (float) | title (text) | abstract (text)
1 | 0 | Learning from compressed observations | The problem of statistical learning is to construct a predictor of a random variable $Y$ as a function of a related random variable $X$ o…
2 | 1 | Sensor Networks with Random Links: Topology Design for Distributed Consensus | In a sensor network, in practice, the communication among sensors is subject to:(1) errors or failures at random times; (3) costs; and(2)…
3 | 2 | The on-line shortest path problem under partial monitoring | The on-line shortest path problem is considered under various models of partial monitoring. Given a weighted directed acyclic graph whose…
4 | 3 | A neural network approach to ordinal regression | Ordinal regression is an important type of learning, which has properties of both classification and regression. Here we describe a simpl…
5 | 4 | Parametric Learning and Monte Carlo Optimization | This paper uncovers and explores the close relationship between Monte Carlo Optimization of a parametrized integral (MCO), Parametric mac…
6 | 5 | Preconditioned Temporal Difference Learning | This paper has been withdrawn by the author. This draft is withdrawn for its poor quality in english, unfortunately produced by the autho…
7 | 6 | A Note on the Inapproximability of Correlation Clustering | We consider inapproximability of the correlation clustering problem defined as follows: Given a graph $G = (V,E)$ where each edge is labe…
8 | 7 | Joint universal lossy coding and identification of stationary mixing sources | The problem of joint universal source coding and modeling, treated in the context of lossless codes by Rissanen, was recently generalized…
9 | 8 | Supervised Feature Selection via Dependence Estimation | We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence…
10 | 9 | Equivalence of LP Relaxation and Max-Product for Weighted Matching in General Graphs | Max-product belief propagation is a local, iterative algorithm to find the mode/MAP estimate of a probability distribution. While it has …
11 | 10 | HMM Speaker Identification Using Linear and Non-linear Merging Techniques | Speaker identification is a powerful, non-invasive and in-expensive biometric technique. The recognition accuracy, however, deteriorates …
12 | 11 | Statistical Mechanics of Nonlinear On-line Learning for Ensemble Teachers | We analyze the generalization performance of a student in a model composed of nonlinear perceptrons: a true teacher, ensemble teachers, a…

Auto-generated charts

ML ArXiv Papers: 500 rows by 3 columns. These exploratory charts are generated automatically from the data; open the dataset in Helix to ask your own questions.

Rows: 500
Columns: 3
Numeric cols: 1

Charts

Distribution of column_1

Histogram of column_1 values.

Columns

  • Unnamed: 0.1 numeric
  • Unnamed: 0 numeric
  • title text
  • abstract text
