v0.1 · Backend Ready · Frontend Coming Soon

Web Scraping,
Self-Hosted.
No Limits.

A high-performance Go package and self-hosted interface for scraping, crawling, and structured data extraction. Built for developers who want full control.

100% Written in Go
MIT Open Source
0 External Dependencies
Self-Hosted Your Data, Your Rules

Simple API, Powerful Results

Drop into any Go project. Start scraping in minutes with a clean, expressive API.

example_single_url.go
package main

import (
    "fmt"
    "time"
    "github.com/harshitbansal184507/CrawlScraper/pkg/scraper"
)

func main() {
    s := scraper.New(scraper.DefaultConfig())
    start := time.Now()

    result, err := s.ScrapeURL("https://example.com")
    if err != nil {
        fmt.Println("scrape failed:", err)
        return
    }
    fmt.Println("Time taken for scraping:", time.Since(start))

    if result.Status == "success" {
        fmt.Printf("Title: %s\n", result.Data.Title)
        fmt.Printf("Paragraphs: %d\n", len(result.Data.Paragraphs))
        fmt.Printf("Images: %d\n", result.Data.Images)
    }
}

Everything You Need to Scrape the Web

Built on Go's concurrency model for fast, reliable data extraction at scale.

Concurrent Scraping

Leverage Go's goroutines for blazing-fast parallel scraping. Scrape hundreds of pages simultaneously.
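As a rough illustration of the pattern (not CrawlScraper's actual internals), here is how goroutines and a `sync.WaitGroup` fan out over a list of URLs and collect results in input order. `fetchTitle` is a hypothetical stand-in for a real scrape call such as `s.ScrapeURL`, so the sketch runs offline:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchTitle is a hypothetical stand-in for a real scrape call;
// it fabricates a result so this sketch runs without a network.
func fetchTitle(url string) string {
	return "title of " + url
}

// scrapeAll fetches every URL in its own goroutine and collects
// the results in input order.
func scrapeAll(urls []string) []string {
	results := make([]string, len(urls))
	var wg sync.WaitGroup
	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			// Each goroutine writes to a distinct index, so no lock is needed.
			results[i] = fetchTitle(u)
		}(i, u)
	}
	wg.Wait()
	return results
}

func main() {
	pages := scrapeAll([]string{"https://example.com/a", "https://example.com/b"})
	for _, p := range pages {
		fmt.Println(p)
	}
}
```

For hundreds of pages you would typically bound concurrency, e.g. with a buffered-channel semaphore, rather than spawning one goroutine per URL unchecked.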


Custom Headers

Set custom HTTP headers, user agents, cookies, and authentication for any target site or API.
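CrawlScraper's own config API may expose this differently; the sketch below shows the underlying `net/http` pattern for attaching a custom user agent, auth header, and cookie to a request. The token and cookie values are placeholders:

```go
package main

import (
	"fmt"
	"net/http"
)

// buildRequest attaches custom headers, a user agent, an auth header,
// and a cookie to a GET request using the standard net/http API.
func buildRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", "CrawlScraper/0.1")
	req.Header.Set("Authorization", "Bearer YOUR_TOKEN")              // placeholder token
	req.AddCookie(&http.Cookie{Name: "session", Value: "YOUR_VALUE"}) // placeholder cookie
	return req, nil
}

func main() {
	req, err := buildRequest("https://example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("User-Agent"))
}
```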


Self-Hosted Interface (Coming Soon)

A full visual UI for managing scraping jobs, viewing results, and scheduling tasks — running on your own server.

// installation

Up & Running in Minutes

A single Go module, dead simple to integrate.

STEP 01

Fork & Clone

Fork the repo and clone your fork locally.

git clone https://github.com/YOUR_USERNAME/CrawlScraper.git
cd CrawlScraper
STEP 02

Install Dependencies

Download all Go module dependencies.

go mod download
STEP 03

Run the Example

Test with the example file. Change the URL as needed.

go run examples/example_single_url.go

Help Build CrawlScraper 🕷️

Thanks for your interest in contributing! Every bug fix, feature, and doc improvement makes a difference.

🐛 Found a Bug?

Check existing issues first. If it's new, open one with what happened, what you expected, and a code snippet to reproduce it.

✨ Feature Ideas?

Open an issue to discuss it first. Get feedback from the community before starting, then submit a PR when ready.

📝 Docs?

Fix typos, add examples, clarify confusing parts. Documentation PRs are always welcome and highly appreciated.

Commit Message Convention

fix(client): handle timeouts in HTTP client
feat(parser): add support for custom headers
docs(readme): update README with installation steps
test(scraper): add unit tests for URL validation

Ready to Crawl?

Star the project, try it out, and become a part of the community.

★ Star on GitHub
Open an Issue