
headzoo/surf

Stateful programmatic web browsing in Go.


Top Related Projects

  • Rod (5,549): A Chrome DevTools Protocol driver for web automation and scraping.
  • chromedp (11,229): A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.
  • Colly (23,473): Elegant Scraper and Crawler Framework for Golang.
  • goquery (14,154): A little like that j-thing, only in Go.
  • Playwright-Go: Playwright for Go, a browser automation library to control Chromium, Firefox and WebKit with a single API.
  • Selenium: Selenium/WebDriver client for Go.

Quick Overview

The headzoo/surf project is a web browser automation library for the Go programming language. It provides a high-level API for interacting with web pages, allowing developers to automate tasks such as web scraping, form filling, and navigation.

Pros

  • Powerful Automation: The library offers a comprehensive set of features for automating web interactions, making it a valuable tool for tasks like web scraping, testing, and data extraction.
  • Lightweight Virtual Browser: Surf implements the browser in pure Go, so scraping, navigation, and form submission work without installing or driving an external browser such as Chrome or Firefox.
  • Ease of Use: The library's API is designed to be intuitive and easy to use, with a focus on simplifying common web automation tasks.
  • Established Project: Surf has been available for years and is a well-known lightweight alternative to full browser drivers in the Go ecosystem.

Cons

  • No JavaScript Execution: Surf does not embed a real rendering engine, so pages that build their content with client-side JavaScript cannot be fully automated.
  • Performance Overhead: Automating web interactions can be resource-intensive, and the library may introduce some performance overhead, especially for large-scale or complex tasks.
  • Dependency on External Libraries: headzoo/surf relies on several external libraries, which can increase the complexity of the project setup and maintenance.
  • Lack of Detailed Documentation: The project's documentation, while generally helpful, could be more comprehensive, especially for advanced use cases or edge cases.

Code Examples

Here are a few examples of how to use the headzoo/surf library:

  1. Navigating to a Web Page and Extracting Text:
package main

import (
    "fmt"
    "github.com/headzoo/surf"
)

func main() {
    // Create a new web browser instance
    bow := surf.NewBrowser()

    // Navigate to a web page
    err := bow.Open("https://www.example.com")
    if err != nil {
        panic(err)
    }

    // Extract the page title
    title := bow.Title()
    fmt.Println("Page Title:", title)

    // Extract the page body text
    body := bow.Body()
    fmt.Println("Page Body:", body)
}
  2. Filling a Form and Submitting:
package main

import (
    "github.com/headzoo/surf"
)

func main() {
    // Create a new web browser instance
    bow := surf.NewBrowser()

    // Navigate to a web page with a form
    err := bow.Open("https://www.example.com/form")
    if err != nil {
        panic(err)
    }

    // Select the form and fill in its fields
    fm, err := bow.Form("form")
    if err != nil {
        panic(err)
    }
    fm.Input("name", "John Doe")
    fm.Input("email", "john.doe@example.com")

    // Submit the form
    err = fm.Submit()
    if err != nil {
        panic(err)
    }
}
  3. Scraping Data from a Web Page:
package main

import (
    "fmt"
    "github.com/headzoo/surf"
    "github.com/PuerkitoBio/goquery"
)

func main() {
    // Create a new web browser instance
    bow := surf.NewBrowser()

    // Navigate to a web page
    err := bow.Open("https://www.example.com/products")
    if err != nil {
        panic(err)
    }

    // The browser already exposes the parsed page as a goquery
    // document, so elements can be selected directly.
    // Extract product names and prices
    bow.Find(".product").Each(func(i int, s *goquery.Selection) {
        name := s.Find(".name").Text()
        price := s.Find(".price").Text()
        fmt.Printf("Product: %s, Price: %s\n", name, price)
    })
}

Getting Started

To get started with the headzoo/surf library, download it with go get gopkg.in/headzoo/surf.v1, import gopkg.in/headzoo/surf.v1 in your project, and adapt the examples above.

Competitor Comparisons

Rod (5,549): A Chrome DevTools Protocol driver for web automation and scraping.

Pros of Rod

  • Rod is a high-level, user-friendly web automation library that provides a simple and intuitive API for interacting with web pages.
  • Rod offers a wide range of features, including support for headless and non-headless browsers, automatic retries, and built-in support for common web tasks like form filling, clicking, and scraping.
  • Rod is highly performant and efficient, with a focus on speed and reliability.

Cons of Rod

  • Rod may have a steeper learning curve compared to Surf, especially for developers who are new to web automation.
  • Rod's focus on web automation may make it less suitable for general-purpose web development tasks compared to Surf.
  • Rod's dependency on the Chromium browser may limit its compatibility with other browsers.

Code Comparison

Surf:

bow := surf.NewBrowser()
bow.SetUserAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3")

err := bow.Open("https://www.example.com")
if err != nil {
    // Handle error
}

fmt.Println("Page title:", bow.Title())

Rod:

browser := rod.New().MustConnect()
defer browser.MustClose()

page := browser.MustPage("https://www.example.com")

fmt.Println("Page title:", page.MustInfo().Title)

chromedp (11,229): A faster, simpler way to drive browsers supporting the Chrome DevTools Protocol.

Pros of chromedp/chromedp

  • Supports a wide range of browser actions, including navigation, input, and screenshot capture.
  • Provides a high-level API that abstracts away the complexity of interacting with the Chrome DevTools Protocol.
  • Offers a flexible and extensible architecture, allowing users to customize and extend the functionality as needed.

Cons of chromedp/chromedp

  • Requires the installation and configuration of a Chrome or Chromium browser, which can be a dependency for some users.
  • May have a steeper learning curve compared to simpler web automation tools, especially for users new to the Chrome DevTools Protocol.
  • Focuses primarily on Chrome/Chromium-based browsers, limiting its applicability to other browser environments.

Code Comparison

Here's a brief code comparison between chromedp/chromedp and headzoo/surf:

chromedp/chromedp (navigating to a website and capturing a screenshot):

ctx, cancel := chromedp.NewContext(context.Background())
defer cancel()

var buf []byte
err := chromedp.Run(ctx,
    chromedp.Navigate("https://www.example.com"),
    chromedp.CaptureScreenshot(&buf),
)
if err != nil {
    // handle error
}

headzoo/surf (navigating to a website and printing the page title):

browser := surf.NewBrowser()
err := browser.Open("https://www.example.com")
if err != nil {
    // handle error
}
fmt.Println(browser.Title())

Colly (23,473): Elegant Scraper and Crawler Framework for Golang.

Pros of Colly

  • Colly is a fast and efficient web scraping framework for Go, making it well-suited for large-scale web crawling projects.
  • Colly provides a modular and extensible design, allowing developers to easily customize and extend its functionality.
  • Colly has a strong focus on performance, with features like parallel request handling and automatic retries.

Cons of Colly

  • Colly may have a steeper learning curve compared to Surf, as it requires a deeper understanding of Go and web scraping concepts.
  • Colly's documentation, while comprehensive, may not be as beginner-friendly as Surf's.
  • Colly may have a more complex setup process, as it requires the installation of additional dependencies.

Code Comparison

Surf (Go):

browser := surf.NewBrowser()
err := browser.Open("https://example.com")
if err != nil {
    // Handle error
}

fmt.Println(browser.Title())

Colly (Go):

package main

import (
    "fmt"
    "github.com/gocolly/colly"
)

func main() {
    c := colly.NewCollector()
    c.OnHTML("title", func(e *colly.HTMLElement) {
        fmt.Println(e.Text)
    })
    c.Visit("https://example.com")
}

goquery (14,154): A little like that j-thing, only in Go.

Pros of goquery

  • goquery is a focused HTML parsing and selection library, making it more lightweight and efficient than a full virtual browser like Surf.
  • goquery provides a familiar jQuery-like syntax for querying and manipulating HTML documents, which can be more intuitive for developers already familiar with jQuery.
  • goquery is actively maintained and has a larger community, with more contributors and more extensive documentation.

Cons of goquery

  • Surf provides a more comprehensive set of features, including support for cookies, headers, and other advanced web browsing functionality.
  • Surf has a more flexible and extensible architecture, allowing for easier customization and integration with other libraries.
  • Surf may be a better choice for more complex web scraping tasks that require more advanced features and functionality.

Code Comparison

Surf (headzoo/surf):

browser := surf.NewBrowser()
err := browser.Open("https://example.com")
if err != nil {
    // Handle error
}

link, ok := browser.Find("a.my-link").Attr("href")
if !ok {
    // Handle missing attribute
}

fmt.Println(link)

goquery (PuerkitoBio/goquery):

res, err := http.Get("https://example.com")
if err != nil {
    // Handle error
}
defer res.Body.Close()

doc, err := goquery.NewDocumentFromReader(res.Body)
if err != nil {
    // Handle error
}

link, _ := doc.Find("a.my-link").Attr("href")
fmt.Println(link)

Playwright-Go: Playwright for Go, a browser automation library to control Chromium, Firefox and WebKit with a single API.

Pros of Playwright-Go

  • Playwright-Go provides a more comprehensive and feature-rich API for automating web browsers, including support for multiple browsers (Chromium, Firefox, and WebKit).
  • The Playwright-Go library is actively maintained and has a larger community of contributors, ensuring regular updates and bug fixes.
  • Playwright-Go offers better cross-browser compatibility and can handle more complex web interactions compared to Surf.

Cons of Playwright-Go

  • Playwright-Go has a larger footprint and may require more system dependencies, which can make it more challenging to set up and deploy in certain environments.
  • The Playwright-Go API may have a steeper learning curve for developers who are more familiar with simpler web automation libraries like Surf.

Code Comparison

Surf (headzoo/surf):

browser := surf.NewBrowser()
err := browser.Open("https://www.example.com")
if err != nil {
    // Handle error
}
fmt.Println(browser.Body())

Playwright-Go (playwright-community/playwright-go):

pw, err := playwright.Run()
if err != nil {
    // Handle error
}
defer pw.Stop()

browser, err := pw.Chromium.Launch()
if err != nil {
    // Handle error
}
defer browser.Close()

page, err := browser.NewPage()
if err != nil {
    // Handle error
}

if _, err = page.Goto("https://www.example.com"); err != nil {
    // Handle error
}

content, err := page.Content()
if err != nil {
    // Handle error
}
fmt.Println(content)

Selenium/Webdriver client for Go

Pros of Selenium

  • Selenium is a widely-used and well-established library for web automation, with a large community and extensive documentation.
  • Selenium supports multiple programming languages, including Python, Java, and C#, making it a versatile choice.
  • Selenium can interact with a wide range of web browsers, including Chrome, Firefox, and Safari, providing cross-browser testing capabilities.

Cons of Selenium

  • Selenium can be more complex to set up and configure compared to simpler web automation libraries like Surf.
  • Selenium may have a steeper learning curve, especially for developers new to web automation.
  • Selenium can be more resource-intensive than some alternatives, as it requires managing browser instances and handling network communication.

Code Comparison

Surf (Go):

browser := surf.NewBrowser()
err := browser.Open("https://www.example.com")
if err != nil {
    // Handle error
}
fmt.Println(browser.Title())

Selenium (Go, tebeka/selenium):

caps := selenium.Capabilities{"browserName": "chrome"}
wd, err := selenium.NewRemote(caps, "http://localhost:4444/wd/hub")
if err != nil {
    // Handle error
}
defer wd.Quit()

wd.Get("https://www.example.com")
title, _ := wd.Title()
fmt.Println(title)

Both snippets navigate to a website, retrieve the page title, and clean up. The key difference is that Surf does everything in-process with its virtual browser, while the Selenium client drives a real browser through a WebDriver server, which makes Surf's setup noticeably more concise.


README

Surf


Surf is a Go (golang) library that implements a virtual web browser that you control programmatically. Surf isn't just another Go solution for downloading content from the web. Surf is designed to behave like a web browser, and includes cookie management, history, bookmarking, user agent spoofing (with a nifty user agent builder), form submission, DOM selection and traversal via jQuery-style CSS selectors, scraping of assets like images and stylesheets, and other features.

Installation

Download the library using go. go get gopkg.in/headzoo/surf.v1

Import the library into your project. import "gopkg.in/headzoo/surf.v1"

Quick Start

package main

import (
	"gopkg.in/headzoo/surf.v1"
	"fmt"
)

func main() {
	bow := surf.NewBrowser()
	err := bow.Open("http://golang.org")
	if err != nil {
		panic(err)
	}

	// Outputs: "The Go Programming Language"
	fmt.Println(bow.Title())
}

Documentation

Complete documentation is available on Read the Docs.

Credits

Surf uses the awesome goquery by Martin Angers, and was written using Intellij and the golang plugin.

Contributions have been made to Surf by many awesome developers.

The idea to create Surf was born in this Reddit thread.

Contributing

Issues and pull requests are always welcome.

See CONTRIBUTING.md for more information.

License

Surf is open source software released under The MIT License (MIT). See LICENSE.md for more information.
