
A list of libraries, tools, and APIs for web scraping and data processing. Find everything you need for extracting, managing, and processing data from the web, from HTTP libraries to browser automation tools and proxy services.

GitHub Stars: 18
Curated Resources: 45
Categories: 5
Last Refreshed: 22 hours ago
  • Topics
  • Recommended CAPTCHA Solving Services
  • Recommended Proxy Types
  • Free Dataset Samples
  • Popular Web Scraping Videos (Bright Data's Collaborations)

Use this list with your AI agent

Add the Context Awesome MCP server to Claude, Cursor, or any MCP client, then ask:

"Show me recommended captcha solving services resources from awesome-web-scraping"

Installation instructions →

What's inside

Recommended CAPTCHA Solving Services

Recommended Proxy Types

  • Datacenter Proxies

    A cost-effective, high-speed option, suited to large-scale scraping of websites with less strict anti-bot protection.

  • Residential Proxies

    Suited to large-scale, complex projects that require real user IPs.
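
Whichever proxy type you choose, the client-side setup is usually the same: the HTTP client is pointed at the provider's proxy endpoint. The block below is a minimal sketch using only Python's standard library. The endpoint `proxy.example.com:8080`, the credentials, and the helper name `build_proxy_opener` are hypothetical placeholders, not values from this list or from any particular provider.

```python
import urllib.request

# Hypothetical proxy endpoint: replace host, port, and credentials
# with the values supplied by your proxy provider.
PROXY_URL = "http://user:pass@proxy.example.com:8080"


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_proxy_opener(PROXY_URL)
```

With a real endpoint in place, a request would be sent as `opener.open("https://example.com", timeout=10)`. Switching between a datacenter and a residential pool then typically means changing only the proxy URL, though the exact endpoint format depends on the provider.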

Topics

  • Go

    A collection of Go tools and libraries for web scraping, parsing, and data automation, including HTTP clients, proxy integration, CAPTCHA solving, serialization, and task scheduling.

  • Java

    A collection of Java tools and libraries for web scraping, parsing, and automation, including HTTP clients, proxy integration, CAPTCHA solving, data processing, and scheduling.

  • JavaScript

    A collection of JavaScript resources for web scraping, data parsing, and automation, featuring libraries for HTTP clients, parsers, proxy integration, CAPTCHA solving, user-agent spoofing, and task scheduling.

  • .NET

  • Perl

    A collection of Perl tools and libraries for web scraping, data parsing, and automation, covering HTTP clients, proxy integration, CAPTCHA solving, and data export.

  • PHP

    A collection of PHP libraries, frameworks, and tools for web scraping, data parsing, export, and automation, featuring solutions for proxy integration, CAPTCHA solving, and task scheduling.

Showing a sample of 45 resources. View the full list on GitHub →