awesome-human-label-variation
github.com/mainlp/awesome-human-label-variation
A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, accompanying the paper "The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation" (EMNLP 2022).
What's inside
Human Label Variation - Related Initiatives and further reading
- Almanea and Poesio, 2022 (📊 Datasets)
ArMIS; new Le-Wi-Di SemEval23 dataset on Arabic tweets annotated for misogyny detection
- Bassignana and Plank, 2022 (📊 Datasets)
CrossRE, relation extraction, small doubly-annotated subset
- Berzak et al., 2016 (📊 Datasets)
Dependency Parsing, WSJ-23, 4 annotators
- Bryant and Ng, 2015 (📊 Datasets)
Grammatical error correction
- Cercas Curry et al., 2021 (📊 Datasets)
ConvAbuse, abusive language towards three conversational AI systems; also part of Le-Wi-Di SemEval23
- Cheplygina et al., 2018 (📊 Datasets)
Medical lesion classification challenge, 6 annotators each
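
The datasets listed here keep each annotator's label rather than a single aggregated one. As a rough illustration of what that makes possible, the sketch below contrasts a majority-vote label with a soft label (the distribution of annotator votes) for one item. The function names, the class names, and the six votes are hypothetical and are not taken from any dataset in this list.

```python
from collections import Counter


def soft_label(annotations, classes):
    """Return the share of annotators that chose each class for one item."""
    if not annotations:
        raise ValueError("need at least one annotation")
    counts = Counter(annotations)
    total = len(annotations)
    # Classes nobody chose get probability 0.0 (Counter returns 0 for missing keys).
    return {c: counts[c] / total for c in classes}


def majority_label(annotations):
    """Collapse the annotations to the single most frequent label."""
    if not annotations:
        raise ValueError("need at least one annotation")
    return Counter(annotations).most_common(1)[0][0]


# Hypothetical item labelled by six annotators (illustrative values only).
votes = ["benign", "benign", "malignant", "benign", "malignant", "benign"]
classes = ["benign", "malignant"]

dist = soft_label(votes, classes)
top = majority_label(votes)
print(top, dist)
```

The majority label discards the fact that a third of the annotators disagreed, while the soft label keeps it; un-aggregated releases let you choose either view.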
Showing a sample of 60 resources. View the full list on GitHub.