awesome-cybersecurity-datasets
github.com/shramos/awesome-cybersecurity-datasets
A curated list of amazingly awesome Cybersecurity datasets
What's inside
Datasets
- 2007 TREC Public Spam Corpus (Email)
The corpus trec07p contains 75,419 messages: 25,220 ham and 50,199 spam.
- 2017-SUEE-data-set (Network traffic)
The data sets contain traffic to and from the web server of the Student Union for Electrical Engineering (Fachbereichsvertretung Elektrotechnik) at Ulm University. Internal hosts are hosts from within the university network; some are wired, while others connect through one of two Wi-Fi services on campus (eduroam and welcome). The data was mixed with attack traffic.
- 500K HTTP Headers (WebApps)
Recently we crawled the Top 500K sites (as ranked by Alexa). Following requests from readers, we are making the HTTP headers available for research purposes.
- Aktaion2 Data (Host)
The project is meant to be a learning/teaching tool on how to blend multiple security signals and behaviors into an expressive framework for intrusion detection.
- Alexa Top 1 Million (URLs & Domain Names)
CSV dataset of the most popular sites as ranked by Alexa.
- AZSecure-data (WebApps)
The AZSecure-data PORTAL currently provides access to Web forums, Internet phishing websites, Twitter data, and other data.
Showing a sample of 54 resources. View the full list on GitHub →
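The message counts quoted for the trec07p corpus imply a noticeable class imbalance, which is worth knowing before training a spam classifier on it. A minimal sketch of that arithmetic, using only the three figures from the listing (the variable names are illustrative and not part of the dataset):

```python
# Counts as quoted in the trec07p entry of the listing.
total_messages = 75_419
ham_messages = 25_220
spam_messages = 50_199

# Sanity check: the two classes should account for every message.
assert ham_messages + spam_messages == total_messages

# Share of each class in the corpus.
spam_fraction = spam_messages / total_messages
ham_fraction = ham_messages / total_messages

print(f"spam: {spam_fraction:.1%}, ham: {ham_fraction:.1%}")
# prints "spam: 66.6%, ham: 33.4%"
```

With roughly two spam messages for every ham message, plain accuracy can be misleading here; per-class precision and recall are likely a more informative way to report results.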