
A curated list of amazingly awesome Cybersecurity datasets

2k GitHub stars, 54 curated resources, 1 category, last refreshed 1 hour ago
Datasets

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me email resources from awesome-cybersecurity-datasets"


What's inside

Datasets

  • 2007 TREC Public Spam Corpus (Email)

    The corpus trec07p contains 75,419 messages: 25,220 ham and 50,199 spam.

  • 2017-SUEE-data-set (Network traffic)

    The data sets contain traffic in and out of the web server of the Student Union for Electrical Engineering (Fachbereichsvertretung Elektrotechnik) at Ulm University. Internal hosts are hosts from within the university network; some are wired, while others connect through one of two campus Wi-Fi services (eduroam and welcome). The data was mixed with attack traffic.

  • 500K HTTP Headers (WebApps)

    Recently we crawled the top 500K sites (as ranked by Alexa). Following requests from readers, we are making the HTTP headers available for research purposes.

  • Aktaion2 Data (Host)

    The project is meant to be a learning/teaching tool on how to blend multiple security signals and behaviors into an expressive framework for intrusion detection.

  • Alexa Top 1 Million (URLs & Domain Names)

    CSV dataset with the most popular sites by Alexa.

  • AZSecure-data (WebApps)

    The AZSecure-data PORTAL currently provides access to Web forums, Internet phishing websites, Twitter data, and other data.

Showing a sample of 54 resources. View the full list on GitHub →