A list of interesting repositories and tools.
What's inside
Crawler
- amemv-crawler
Downloads videos from specified Douyin accounts; a Douyin crawler.
- Anti-Anti-Spider
Handling anti-crawler measures.
- awesome-spider
A collection of crawlers.
- course-crawler
Downloads MOOC courses from China University MOOC, XuetangX, NetEase Cloud Classroom, CNMOOC, and iCourse.
- Douyin-Bot
A Python Douyin bot: on how to find pretty girls on Douyin.
- ECommerceCrawlers
Hands-on crawlers for a variety of websites and e-commerce data.
Command line
- cleo
Cleo allows you to create beautiful and testable command-line interfaces.
Scrapy Distributed
- crawlab
A distributed crawler management platform based on Golang.
- Gerapy
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
- scrapydweb
ScrapydWeb: Web app for Scrapyd cluster management
Scrapy Middleware
- crawlera
The World's Smartest Proxy Network
- scrapy-crawlera
Crawlera middleware for Scrapy
- scrapy-crawl-once
Scrapy middleware that allows crawling only new content
- scrapy-deltafetch
Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls
- scrapy-fake-useragent
Random User-Agent middleware based on fake-useragent
- scrapy-magicfields
Scrapy middleware to add extra fields to items, like timestamp, response fields, spider attributes etc.
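
Several of the middlewares above (scrapy-crawl-once, scrapy-deltafetch) rest on the same idea: remember what earlier crawls already handled and skip it next time. The following is a minimal standard-library sketch of that idea only. It does not use Scrapy and is not the implementation of either project; the names `SeenStore`, `fingerprint`, and `filter_new` are hypothetical and chosen for illustration.

```python
import hashlib
import sqlite3


def fingerprint(url: str) -> str:
    """Return a stable hex digest used as the key for a seen URL."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


class SeenStore:
    """Hypothetical helper: remembers which URLs earlier crawls handled."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to persist between runs.
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS seen (fp TEXT PRIMARY KEY)")
        self.db.commit()

    def is_new(self, url: str) -> bool:
        row = self.db.execute(
            "SELECT 1 FROM seen WHERE fp = ?", (fingerprint(url),)
        ).fetchone()
        return row is None

    def mark(self, url: str) -> None:
        self.db.execute(
            "INSERT OR IGNORE INTO seen (fp) VALUES (?)", (fingerprint(url),)
        )
        self.db.commit()


def filter_new(urls, store):
    """Yield only URLs not recorded before, recording each as it is yielded."""
    for url in urls:
        if store.is_new(url):
            store.mark(url)
            yield url
```

In a real Scrapy project this check would live inside a middleware and would typically key on a request fingerprint rather than the raw URL; the sketch keeps only the bookkeeping.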
Tools
- qrcode
An artistic QR code generator in Python.
Utils
- queuelib
Collection of persistent (disk-based) queues
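
To illustrate what a persistent (disk-based) queue means in practice, here is a small sketch built on the standard-library `sqlite3` module. It is not queuelib's API or implementation; the class name `DiskFifo` and its methods are assumptions made for this example.

```python
import sqlite3


class DiskFifo:
    """Hypothetical disk-backed FIFO queue; not one of queuelib's classes."""

    def __init__(self, path: str) -> None:
        # The queue lives in a SQLite file, so items survive a restart.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS q ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, item BLOB)"
        )
        self.db.commit()

    def push(self, item: bytes) -> None:
        """Append an item to the tail of the queue."""
        self.db.execute("INSERT INTO q (item) VALUES (?)", (item,))
        self.db.commit()

    def pop(self):
        """Remove and return the oldest item, or None if the queue is empty."""
        row = self.db.execute(
            "SELECT id, item FROM q ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self.db.execute("DELETE FROM q WHERE id = ?", (row[0],))
        self.db.commit()
        return row[1]

    def close(self) -> None:
        self.db.close()
```

Opening `DiskFifo("queue.db")`, pushing items, closing, and reopening the same file later would return the items in insertion order, which is the property a crawler needs to resume an interrupted run.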
HTML parser
- scrapely
A pure-python HTML screen-scraping library
Showing a sample of 33 resources. View the full list on GitHub.