awesome-web-scraping
github.com/luminati-io/awesome-web-scraping
A list of libraries, tools, and APIs for web scraping and data processing. Find everything you need for extracting, managing, and processing data from the web, from HTTP libraries to browser automation tools and proxy services.
What's inside
Recommended CAPTCHA Solving Services
Popular Web Scraping Videos (Bright Data's Collaborations)
- 3 million dollar project ideas for developers
- Build a fullstack SEO rank tracker app with Next.js and Bright Data
- Easiest way to web scraping using Playwright
- eCommerce web scraping tutorial (Puppeteer, Cheerio, and Node.js)
- How to create custom datasets to train LLMs using Bright Data
- How to scrape any website (ft. scraping browser)
Free Dataset Samples
Recommended Proxy Types
- Datacenter Proxies
A cost-effective, high-speed solution, suitable for large-scale scraping on less strict websites.
- Residential Proxies
A solution for large-scale, complex projects that require real user IPs.
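As a rough illustration of how the two proxy types above might be chosen between in a scraper, here is a minimal Python sketch using only the standard library. The proxy URLs, the `PROXIES` table, and the helper names `choose_proxy_type` and `build_proxy_opener` are all hypothetical; real endpoints and credentials come from your proxy provider.

```python
import urllib.request

# Hypothetical proxy endpoints (assumptions for illustration only).
# Replace with the host, port, and credentials issued by your provider.
PROXIES = {
    "datacenter": "http://user:pass@dc.proxy.example:8000",
    "residential": "http://user:pass@res.proxy.example:9000",
}


def choose_proxy_type(strict_site: bool) -> str:
    """Pick datacenter proxies for less strict sites, residential otherwise."""
    return "residential" if strict_site else "datacenter"


def build_proxy_opener(proxy_type: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS requests through the proxy."""
    proxy_url = PROXIES[proxy_type]
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    proxy_type = choose_proxy_type(strict_site=False)
    opener = build_proxy_opener(proxy_type)
    print(f"Using {proxy_type} proxy")
    # opener.open("https://example.com") would send the request via the proxy;
    # it is left commented out because the endpoints above are placeholders.
```

Keeping the choice of proxy type separate from opener construction means a scraper can start on the cheaper datacenter pool and switch to residential IPs only for sites that block it.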
Topics
- Go
A collection of Go tools and libraries for web scraping, parsing, and data automation, including HTTP clients, proxy integration, CAPTCHA solving, serialization, and task scheduling.
- Java
A collection of Java tools and libraries for web scraping, parsing, and automation, including HTTP clients, proxy integration, CAPTCHA solving, data processing, and scheduling.
- JavaScript
A collection of JavaScript resources for web scraping, data parsing, and automation, featuring libraries for HTTP clients, parsers, proxy integration, CAPTCHA solving, user-agent spoofing, and task scheduling.
- .NET
- Perl
A collection of Perl tools and libraries for web scraping, data parsing, and automation, including HTTP clients, proxy integration, CAPTCHA solving, and data export.
- PHP
A collection of PHP libraries, frameworks, and tools for web scraping, data parsing, export, and automation, featuring solutions for proxy integration, CAPTCHA solving, and task scheduling.
Showing a sample of 45 resources. View the full list on GitHub →