awesome-human-label-variation

github.com/mainlp/awesome-human-label-variation

A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, accompanying the paper "The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation" (EMNLP 2022).

102 GitHub Stars

Curated Resources

What's inside

Human Label Variation - Related Initiatives and further reading

Almanea and Poesio, 2022 (Datasets)
ArMIS; New Le-Wi-Di SemEval23 dataset on Arabic tweets annotated for misogyny detection
Bassignana and Plank, 2022 (Datasets)
CrossRE, relation extraction, small doubly-annotated subset
Berzak et al., 2016 (Datasets)
Dependency Parsing, WSJ-23, 4 annotators
Bryant and Ng, 2015 (Datasets)
Grammatical error correction
Cercas Curry et al., 2021 (Datasets)
ConvAbuse, abusive language towards three conversational AI systems; also part of Le-Wi-Di SemEval23
Cheplygina et al., 2018 (Datasets)
Medical lesion classification challenge, 6 annotators each

Showing a sample of 60 resources. View the full list on GitHub →