SWE-bench Verified

Hugging Face

Dataset Summary

SWE-bench Verified is a subset of 500 samples from the SWE-bench test set that have been human-validated for quality. SWE-bench is a dataset that tests systems' ability to solve GitHub issues automatically. See this post for more details on the human-validation process. The dataset collects 500 test Issue-Pull Request pairs from popular Python repositories. Evaluation is performed by unit test verification, using post-PR behavior as the reference solution. The original… See the full description on the dataset page: https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified.
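The unit-test verification described above can be illustrated with a minimal sketch. This is not the official evaluation harness. It assumes each sample lists two groups of tests (called `fail_to_pass` and `pass_to_pass` here, modeled on the dataset's FAIL_TO_PASS and PASS_TO_PASS fields, which the summary above does not spell out) and that a sample counts as solved only when every listed test passes after the candidate patch is applied. The function names and test names are hypothetical.

```python
from typing import Dict, Iterable


def is_resolved(
    fail_to_pass: Iterable[str],
    pass_to_pass: Iterable[str],
    outcomes: Dict[str, bool],
) -> bool:
    """Return True when every required test passed after the candidate patch.

    fail_to_pass: tests expected to go from failing to passing (assumed field).
    pass_to_pass: tests expected to keep passing (assumed field).
    outcomes: test name -> True if the test passed with the patch applied.
    """
    required = list(fail_to_pass) + list(pass_to_pass)
    # A test with no recorded outcome is treated as a failure.
    return all(outcomes.get(name, False) for name in required)


def resolved_rate(flags: Iterable[bool]) -> float:
    """Fraction of samples marked as resolved (0.0 for an empty input)."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


# Hypothetical test names, for illustration only.
fixed = is_resolved(
    ["test_issue_fix"],
    ["test_existing_behavior"],
    {"test_issue_fix": True, "test_existing_behavior": True},
)
regressed = is_resolved(
    ["test_issue_fix"],
    ["test_existing_behavior"],
    {"test_issue_fix": True, "test_existing_behavior": False},
)
print(fixed, regressed, resolved_rate([fixed, regressed]))
```

Treating a missing outcome as a failure is a conservative choice made for this sketch, so that a test which never ran cannot count toward a solved sample.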

princeton-nlp/SWE-bench_Verified
