awesome-crawler
github.com/brucedone/awesome-crawler
A collection of awesome web crawlers and spiders in different languages.
What's inside
C#
- Abot
C# web crawler built for speed and flexibility.
- ccrawler
Built with C# 3.5. It includes a simple web content categorizer extension that can distinguish web pages by their content.
- DotnetSpider
A cross-platform, lightweight spider developed in C#.
- Hawk
Advanced Crawler and ETL tool written in C#/WPF.
- Infinity Crawler
A simple but powerful web crawler library in C#.
- SimpleCrawler
Simple spider based on multithreading and regular expressions.
Java
- ACHE Crawler
An easy to use web crawler for domain-specific search.
- anthelion
A plugin for Apache Nutch to crawl semantic annotations within HTML pages.
- Apache Nutch
Highly extensible, highly scalable web crawler for production environments.
- Crawler4j
Simple and lightweight web crawler.
- Gecco
An easy-to-use, lightweight web crawler.
- Heritrix3
Extensible, web-scale, archival-quality web crawler project.
Go
- ants-go
An open-source, distributed, RESTful crawler engine in Go.
- colly
Fast and Elegant Scraping Framework for Gophers.
- creeper
The Next Generation Crawler Framework (Go).
- Dataflow kit
Extract structured data from web pages. Website scraping.
- dht
BitTorrent DHT protocol and DHT spider.
- ferret
Declarative web scraping.
Python
- aspider
An async web scraping micro-framework based on asyncio.
- brownant
A lightweight web data extracting framework.
- CoCrawler
A versatile web crawler built using modern tools and concurrency.
- cola
A distributed crawling framework.
- crawley
Pythonic crawling / scraping framework based on non-blocking I/O operations.
- Demiurge
PyQuery-based scraping micro-framework.
Ruby
- Cobweb
Web crawler with very flexible crawling options, standalone or using sidekiq.
- mechanize
Automated web interaction & crawling.
- Nokogiri
A Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.
- RubyRetriever
RubyRetriever is a Web Crawler, Scraper & File Harvester.
- Spidr
Spider a site, multiple domains, certain links or infinitely.
- upton
A batteries-included framework for easy web scraping. Just add CSS (or do more).
JavaScript
- crawlee
A web scraping and browser automation library for Node.js that helps you build reliable crawlers. Fast.
- headless-chrome-crawler
Headless Chrome crawls with jQuery support.
- js-crawler
Web crawler for Node.js; both HTTP and HTTPS are supported.
- node-crawler
Node-crawler has a clean, simple API.
- node-osmosis
HTML/XML parser and web scraper for Node.js.
- scrape-it
A Node.js scraper for humans.
Scala
Showing a sample of 101 resources. View the full list on GitHub.