awesome-malware-benign-datasets
github.com/0xh3xa/awesome-malware-benign-datasets
🪲 A list of malware and benign datasets for malware research
What's inside
Android Datasets
- Android-Malware-2023 (AIM-2023)
Recent Android malware and benign apps with detailed metadata
- Android Malware Genome
Historic dataset of early Android malware
- CIC-AAGM2017
Real-device collected network traffic from Android adware and general malware apps
- CICAndMal2017
Real-device collected malware samples with network and behavior data
- CICMalDroid-2020
Comprehensive Android malware dataset with dynamic and static features
- Drebin
One of the most widely used Android malware datasets
Windows Datasets
- BODMAS
Blue Hexagon dataset with malware samples and family info
- ContagioDump
Collection of malware samples for research
- Dumpware 10
Malware images
- EMBER2017-2018
Large public benchmark for malware classifiers
- EMBER2024 (New Benchmark)
Large public benchmark for malware classifiers
- Malimg
Grayscale images for malware classification
Document Datasets
- CIC-Evasive-PDFMal2022
A dataset with 5,557 malicious and 4,468 benign PDF records that attempt to evade common detection techniques.
- Dike
A dataset containing various document formats (doc, xls, ppt) for malware detection.
- Malicious PDF Generator
Generates 10 different malicious PDFs with phone-home functionality for penetration testing.
Showing a sample of 26 resources. View the full list on GitHub.