
Web Scraping for SEO: Extract SERPs and Keyword Volume Using Python


Want better SEO results without paying for overpriced tools?

You’re not alone.

From keyword ideas to competitor traffic stats, the most useful SEO data is right there on the web; you just need a smart way to extract it.

In this guide, we’ll show you how to scrape Google Search, Ubersuggest, and SimilarWeb using Python and Scrape.do for the best SEO insights.

You’ll learn how to get:

  • Organic results and search ads directly from Google
  • Keyword volume, CPC, and suggestions from Ubersuggest
  • Domain-level traffic and backlink info from SimilarWeb

All without getting blocked. 🔑

Let’s start with setting everything up.

Prerequisites & Setup

You’ll only need a few libraries to get started:

pip install requests beautifulsoup4

We’re going to use:

  • requests to send HTTP requests
  • BeautifulSoup to parse and extract content from HTML
  • urllib.parse (part of Python’s standard library, no install needed) to safely encode URLs

All examples in this guide use Scrape.do to avoid getting blocked. Scrape.do handles:

  • proxy rotation and header spoofing
  • TLS fingerprinting and JavaScript rendering
  • CAPTCHA solving and session management

To follow along, sign up for free and grab your API key here.

Once you have your token, a basic request looks like this:

import requests
import urllib.parse

token = "<your_token>"
target_url = "https://www.google.com/search?q=seo+tools"
encoded_url = urllib.parse.quote_plus(target_url)

api_url = f"https://api.scrape.do/?token={token}&url={encoded_url}&render=true&super=true"
response = requests.get(api_url)

print(response.status_code)

If the setup is working correctly, you should see:

200

That means you’re ready to start scraping.
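
If you get a non-200 status instead, the response body usually contains an error message explaining why (invalid token, unreachable target, and so on), so it’s worth printing while you debug:

if response.status_code != 200:
    # The body typically carries a hint about what went wrong
    print(response.text)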

Scraping SERPs

Search engine results pages (SERPs) are the pulse of SEO.

They tell you:

  • who’s ranking for what,
  • which ads are running,
  • and what questions or related searches are tied to a keyword.

If you’re doing keyword research, content strategy, or tracking competitors, this is your most important data source.

There are two ways to scrape SERPs:

  1. Directly from Google Search
  2. From a service like Ubersuggest that aggregates and enriches it

We’ll cover both, starting with Google.

Scraping Google Results

Let’s start with the core of Google Search scraping: the organic results.

Each organic search result is rendered inside a div with the class Ww4FFb, which acts as a reliable anchor when parsing the HTML (Google rotates these class names from time to time, so verify in DevTools if the selector stops matching). Within this block, you’ll find the title, URL, and description.

Here’s a simple script that does exactly that using Scrape.do and BeautifulSoup:

import requests
import urllib.parse
from bs4 import BeautifulSoup

# Your Scrape.do token and search query
scrape_token = "<your-token>"
query = "python web scraping"

# Encode the search query and build Google URL
encoded_query = urllib.parse.quote_plus(query)
google_url = f"https://www.google.com/search?q={encoded_query}&start=0"  # start=0 for first page

# Scrape.do wrapper URL - properly encode the Google URL
api_url = f"https://api.scrape.do/?token={scrape_token}&url={urllib.parse.quote(google_url, safe='')}"

# Send the request
response = requests.get(api_url)
response.raise_for_status()

# Parse the HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Find all search results with Ww4FFb class
search_results = soup.find_all('div', class_=lambda x: x and 'Ww4FFb' in x)

# Extract data from each result
for position, result in enumerate(search_results, 1):
    # Get title from h3 tag
    title = result.find('h3').get_text(strip=True)

    # Get URL from link
    url = result.find('a').get('href')

    # Get description/snippet
    desc_element = result.find(class_='VwiC3b')
    # Join nested spans with a space so dates and snippet text don't run together
    description = desc_element.get_text(" ", strip=True) if desc_element else "No description"

    print(f"{position}. {title}")
    print(f"   URL: {url}")
    print(f"   Description: {description}")
    print()

Example output:

1. Python Web Scraping Tutorial
   URL: https://www.geeksforgeeks.org/python-web-scraping-tutorial/
   Description: Jan 2, 2025 Python web scraping refers to the process of extracting data from websites using Python...

2. How to start Web scraping with python? : r/learnpython
   URL: https://www.reddit.com/r/learnpython/comments/qzr8ir/how_to_start_web_scraping_with_python/
   Description: Learn the basic html elements that build up a website. Inspect the element on the webpage that...

3. Python Web Scraping: Full Tutorial With Examples (2025)
   URL: https://www.scrapingbee.com/blog/web-scraping-101-with-python/
   Description: May 27, 2025 Learn about web scraping in Python with this step-by-step tutorial...

This gives you:

  • Position in the SERP
  • Title
  • Destination URL
  • Meta description
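
If you want to keep these results for later analysis, here’s a minimal sketch that collects the same fields into a CSV file (serp_results.csv is an arbitrary filename for this example; search_results comes from the script above):

import csv

# Re-walk the results, this time collecting rows instead of printing them
rows = []
for position, result in enumerate(search_results, 1):
    title_tag = result.find('h3')
    link_tag = result.find('a')
    desc_tag = result.find(class_='VwiC3b')
    rows.append({
        "position": position,
        "title": title_tag.get_text(strip=True) if title_tag else "",
        "url": link_tag.get('href') if link_tag else "",
        "description": desc_tag.get_text(" ", strip=True) if desc_tag else "",
    })

with open("serp_results.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["position", "title", "url", "description"])
    writer.writeheader()
    writer.writerows(rows)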

Scraping Ubersuggest for SERPs

While Google shows you raw results, platforms like Ubersuggest enrich those results with extra data like click estimates, domain authority, and SERP type.

Since Ubersuggest offers free searches that you can run without logging in, it’s a solid source of SEO data.

However, security measures ramp up drastically when we switch our target from Google to Ubersuggest.

The frontend is entirely JavaScript-rendered, and the real data comes from internal API endpoints that are locked behind authorization tokens and hidden headers.

Instead of scraping the visual UI, we’ll go straight to the underlying API that powers Ubersuggest’s SERP tool.

This is faster, cleaner, and gives you more structured information than trying to parse a webpage.

But to access that API, we need two things:

  • A Scrape.do token (to avoid IP blocks, session issues, and browser challenges)
  • A Bearer token from Ubersuggest (to pass authentication on their private API)

💡 You can get the Bearer token by inspecting requests in your browser’s dev tools after visiting Ubersuggest and checking the “get-token” API call. It usually starts with app#unlogged__....

Check out our guide to scraping Ubersuggest to automate token acquisition.

Once you have both tokens, here’s the exact request:

import requests
import urllib.parse

# Scrape.do token and Bearer token
scrape_token = "<your-token>"
bearer_token = "app#unlogged__XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

# Target API URL
target_url = "https://app.neilpatel.com/api/serp_analysis?keyword=ubersuggest&locId=2840&language=en&refresh=false"
encoded_url = urllib.parse.quote_plus(target_url)

# Scrape.do API call with Authorization header
api_url = f"https://api.scrape.do/?token={scrape_token}&url={encoded_url}&extraHeaders=true"
headers = {
    "sd-Authorization": f"Bearer {bearer_token}"
}

# Make request and parse response
response = requests.get(api_url, headers=headers)
data = response.json()

# Extract and print structured search results
serp_entries = data.get("serpEntries", [])

for entry in serp_entries:
    print("—" * 60)
    print(f"Position: {entry.get('position')}")
    print(f"Title   : {entry.get('title') or 'N/A'}")
    print(f"Domain  : {entry.get('domain')}")
    print(f"URL     : {entry.get('url')}")
    print(f"Type    : {entry.get('type')}")

    clicks = entry.get("clicks")
    if clicks is not None:
        print(f"Clicks  : {clicks}")

    domain_authority = entry.get("domainAuthority")
    if domain_authority is not None:
        print(f"DA      : {domain_authority}")

Output:

————————————————————————————————————————————————————————————
Position: 1
Title   : Ubersuggest: Free Keyword Research Tool
Domain  : neilpatel.com
URL     : http://neilpatel.com/ubersuggest/
Type    : organic
Clicks  : 7783
DA      : 90
————————————————————————————————————————————————————————————
Position: 2
Title   : Ubersuggest
Domain  : app.neilpatel.com
URL     : http://app.neilpatel.com/
Type    : organic
Clicks  : 1584
DA      : 90
...
Position: 100
Title   : Ubersuggest Alternative - Google Keyword Research Tool
Domain  : social-contests.com
URL     : http://www.social-contests.com/google-keyword-tool/
Type    : organic
————————————————————————————————————————————————————————————

Unlike traditional scraping, there’s no HTML parsing here.

You get clean JSON directly from the source.
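
Structured JSON also makes downstream tasks like rank tracking trivial. For example, here’s a quick sketch that checks where a given domain ranks (yourdomain.com is a placeholder; serp_entries comes from the script above):

# Find where a given domain ranks in the returned results
your_domain = "yourdomain.com"

positions = [
    entry.get("position")
    for entry in serp_entries
    if entry.get("domain") == your_domain
]

if positions:
    print(f"{your_domain} ranks at position(s): {positions}")
else:
    print(f"{your_domain} is not in the top {len(serp_entries)} results")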

Scraping Keyword Information

In SEO, you always want to know more about a keyword before deciding to act on it:

Volume, related keywords, questions, and so on.

Again, there are two ways to get this information: from Google or from Ubersuggest.

Google renders FAQ sections dynamically, and related terms appear as structured blocks near the bottom of the results page.

To get them both, we’ll hit the regular search results page and extract specific elements using BeautifulSoup.

Here’s the combined script:

import requests
import urllib.parse
from bs4 import BeautifulSoup

# Your Scrape.do token and search query
scrape_token = "<your-token>"
query = "python web scraping"

# Encode the search query and build Google URL
encoded_query = urllib.parse.quote_plus(query)
google_url = f"https://www.google.com/search?q={encoded_query}"

# Scrape.do wrapper URL - properly encode the Google URL
api_url = f"https://api.scrape.do/?token={scrape_token}&url={urllib.parse.quote(google_url, safe='')}"

# Send the request
response = requests.get(api_url)
response.raise_for_status()

# Parse the HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Find and extract FAQ questions
faq_results = soup.find_all('div', jsname='yEVEwb')

print("FAQ Questions:")
for position, faq in enumerate(faq_results, 1):
    question_element = faq.find('span')
    if question_element:
        question = question_element.get_text(strip=True)
        print(f"{position}. {question}")

print("\n")

# Find and extract related search terms
related_searches = soup.find_all('div', class_='b2Rnsc vIifob')

print(f"Found {len(related_searches)} related search terms\n")

for position, search_term in enumerate(related_searches, 1):
    # Join nested spans with a space so the bolded parts don't run together
    term_text = search_term.get_text(" ", strip=True)
    print(f"{position}. {term_text}")

Output:

FAQ Questions:
1. Is Python good for web scraping?
2. How to scrape the web with Python?
3. Why is data scraping illegal?
4. How long does it take to learn web scraping in Python?


Found 8 related search terms

1. Python web scraping book
2. Python web scraping library
3. Python web scraping Selenium
4. Web scraping using Python BeautifulSoup
5. Web scraping Python w3schools
6. Python web scraping github
7. Web scraping with Python PDF
8. Python web scraping examples

By combining these two elements—FAQs and related terms—you get a clearer picture of what people are searching for and how to structure your own content for better SEO alignment.
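
To put both lists to work, you can fold them into a single deduplicated set of content ideas. A quick sketch, assuming faq_results and related_searches from the script above:

# Merge FAQ questions and related terms into one idea list
ideas = []
for faq in faq_results:
    span = faq.find('span')
    if span:
        ideas.append(span.get_text(strip=True))
for term in related_searches:
    ideas.append(term.get_text(" ", strip=True))

for idea in sorted(set(ideas)):
    print(idea)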

Scraping Ubersuggest for Keyword Information

If you want keyword data like search volume, CPC, or trend breakdowns, Ubersuggest exposes all of that through an internal API you can access directly.

We’ll again skip the UI entirely and go straight to the endpoint that returns keyword insights.

As with previous examples, you’ll need a Scrape.do token and a Bearer token.

Here’s the exact script:

import requests
import urllib.parse

# Scrape.do token and Bearer token
scrape_token = "<your-token>"
bearer_token = "app#unlogged__XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # Replace with real token

# Target API URL
target_url = "https://app.neilpatel.com/api/keyword_info?keyword=ubersuggest&language=en&locId=2840&withGlobalSVBreakdown=true"
encoded_url = urllib.parse.quote_plus(target_url)

# Scrape.do API call with Authorization header
api_url = f"https://api.scrape.do/?token={scrape_token}&url={encoded_url}&extraHeaders=true"
headers = {
    "sd-Authorization": f"Bearer {bearer_token}"
}

# Make request and parse response
response = requests.get(api_url, headers=headers)
data = response.json()

# Extract and print volume and CPC
keyword_info = data.get("keywordInfo", {})
volume = keyword_info.get("volume")
cpc = keyword_info.get("cpc")

print("Search Volume:", volume)
print("CPC:", cpc)

Output:

Search Volume: 27100
CPC: 6.67

This gives you the essentials like search volume and CPC, but that’s not all.

This endpoint includes much more data in the response, like global volume by country, keyword difficulty, paid competition scores, and more.

If you’re running SEO at scale, it’s worth exploring the full JSON to extract what you need.
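
An easy way to explore is to dump the raw response once and skim the keys; note that the exact field names are whatever the API returns and may change over time:

import json

# Pretty-print the full response to discover available fields (truncated for readability)
print(json.dumps(data, indent=2)[:2000])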

Scraping Domain Information

Knowing what keywords your competitors rank for is only part of the SEO game.

You also want to know how much traffic they get, where that traffic comes from, and how much of it is paid vs. organic.

This kind of domain-level intelligence is what tools like SimilarWeb offer.

Scraping Domain Info with SimilarWeb

SimilarWeb uses aggressive bot detection and heavy JavaScript rendering, so you can’t scrape SimilarWeb using plain requests, even with basic proxies.

But with super=true on Scrape.do (which enables residential/mobile IPs and browser-like fingerprinting), you can bypass these defenses and extract meaningful traffic data from public pages.

Here’s how to do it step-by-step:

import requests
import urllib.parse
import re
from bs4 import BeautifulSoup

# Scrape.do API token
token = "<your-token>"

# Target Similarweb page (Google.com)
url = "https://www.similarweb.com/website/google.com/"
encoded_url = urllib.parse.quote_plus(url)

# Scrape.do API endpoint (super=true for residential and mobile IPs)
api_url = f"https://api.scrape.do/?token={token}&url={encoded_url}&super=true"

# Send the request and parse the HTML with BeautifulSoup
html = requests.get(api_url).text
soup = BeautifulSoup(html, "html.parser")

# Extract the domain name (first word in the title)
domain = soup.title.get_text(strip=True).split()[0]

Up until this point it’s pretty simple.

We build our request and start by grabbing the domain from the <title> element.

Then, we extract total monthly visits, which SimilarWeb displays using shorthand like 3.2B or 84.7M.

To convert those into actual numbers, we also define a small helper function:

# Function to convert shorthand numbers like 3.5M → 3,500,000
def convert_to_number(s):
    num, suffix = re.match(r"([\d\.]+)([BMK]?)", s).groups()
    return float(num) * {"B": 1e9, "M": 1e6, "K": 1e3}.get(suffix, 1)

# Extract total visits
total_visits_str = soup.find("p", class_="engagement-list__item-value").get_text(strip=True)
total_visits_num = convert_to_number(total_visits_str)
total_visits = int(total_visits_num) if total_visits_num.is_integer() else total_visits_num
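
A quick spot-check confirms the helper behaves as expected (floating-point rounding can leave tiny artifacts, which is fine for traffic estimates):

for sample in ("3.2B", "84.7M", "512K", "42"):
    print(sample, "->", convert_to_number(sample))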

Next, we look for the traffic change indicator:

  • It tells us whether traffic increased or decreased since last month
  • We also detect if it’s up or down based on the CSS class

# Extract traffic change
engagement_list = soup.find("div", class_="engagement-list")
change_elem = engagement_list.find(
    "span",
    class_=lambda c: c and c.startswith("app-parameter-change app-parameter-change--")
)
raw_change = change_elem.get_text(strip=True)
classes = " ".join(change_elem["class"])
prefix = ("-" * ("change--down" in classes)) + ("+" * ("change--up" in classes))
traffic_change = prefix + raw_change

Next, we extract organic vs. paid traffic split from the visual chart legends:

# Extract organic vs. paid traffic percentages
organic = soup.find("div", class_="wa-keywords__organic-paid-legend-item wa-keywords__organic-paid-legend-item--organic")\
             .find("span", class_="wa-keywords__organic-paid-legend-item-value").get_text(strip=True)

paid = soup.find("div", class_="wa-keywords__organic-paid-legend-item wa-keywords__organic-paid-legend-item--paid")\
          .find("span", class_="wa-keywords__organic-paid-legend-item-value").get_text(strip=True)

And finally, we grab the top traffic source, such as “Direct”, “Search”, or “Referral”, along with its percentage:

# Extract top traffic source and percentage
traffic_text = soup.find("p", class_="app-section__text wa-traffic-sources__section-text").get_text(strip=True)
pattern = r"is (.*?) traffic, driving ([\d\.]+%)"
match = re.search(pattern, traffic_text)
top_source_label = match.group(1) if match else "N/A"
top_source_pct = match.group(2) if match else "N/A"

Here’s the final printout:

# Print results
print("Domain:", domain)
print("Total Traffic (last month):", total_visits)
print("Traffic Change (vs. last month):", traffic_change)
print("Organic Traffic (compared to paid):", organic)
print("Paid Traffic (compared to organic):", paid)
print("Top Traffic Source:", top_source_label, "-", top_source_pct)

This is what you should get as output in the terminal:

Domain: google.com
Total Traffic (last month): 76300000000
Traffic Change (vs. last month): -9.72%
Organic Traffic (compared to paid): 94.11%
Paid Traffic (compared to organic): 5.89%
Top Traffic Source: Direct - 87.82%

This gives you a high-level overview of a domain’s performance:

  • Total monthly visits
  • Whether traffic is rising or falling
  • Split between paid and organic
  • Main acquisition channel

It’s the kind of intelligence SEOs use to benchmark competitors and guide content or ad strategies.
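
If you want to benchmark several competitors in one run, the same flow wraps naturally into a function. Here’s a minimal sketch that pulls just total visits, assuming the selector from the script above still holds and using a placeholder competitor list:

import requests
import urllib.parse
from bs4 import BeautifulSoup

token = "<your-token>"

def get_total_visits(domain):
    # Fetch the Similarweb page through Scrape.do and grab the first
    # engagement metric (total visits) with the same selector used earlier
    url = f"https://www.similarweb.com/website/{domain}/"
    api_url = f"https://api.scrape.do/?token={token}&url={urllib.parse.quote_plus(url)}&super=true"
    soup = BeautifulSoup(requests.get(api_url).text, "html.parser")
    elem = soup.find("p", class_="engagement-list__item-value")
    return elem.get_text(strip=True) if elem else "N/A"

# Placeholder competitor list for this sketch
for competitor in ["google.com", "bing.com"]:
    print(competitor, "->", get_total_visits(competitor))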

Scraping Ubersuggest API for Domain Overview

While SimilarWeb gives you an external view of domain traffic, Ubersuggest offers a more SEO-focused breakdown, showing organic keyword count, backlinks, and monthly search traffic.

Ubersuggest’s internal API also exposes this data.

As always, you’ll need two things:

  • Your Scrape.do token
  • A valid Bearer token from Ubersuggest (inspect requests from their app to get one)

Here’s the full working script:

import requests
import urllib.parse

# Scrape.do token and Bearer token
scrape_token = "<your-token>"  # Replace with your actual Scrape.do token
bearer_token = "app#unlogged__XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  # Replace with your actual bearer token

# Domain overview endpoint (using example.com as the target domain)
target_url = "https://app.neilpatel.com/api/domain_overview?domain=example.com&locId=2840&language=en&withKeywords=true"
encoded_url = urllib.parse.quote_plus(target_url)

# Scrape.do API endpoint with extra headers enabled
api_url = f"https://api.scrape.do/?token={scrape_token}&url={encoded_url}&extraHeaders=true"
headers = {
    "sd-Authorization": f"Bearer {bearer_token}"
}

# Send request
response = requests.get(api_url, headers=headers)
data = response.json()

# Extract and print required values
print(f"Website URL           : {data.get('domain')}")
print(f"Monthly Organic Traffic: {data.get('traffic')}")
print(f"Keywords              : {data.get('organic')}")
print(f"Backlinks             : {data.get('backlinks')}")

Output:

Website URL            : example.com
Monthly Organic Traffic: 42,300
Keywords               : 3,871
Backlinks              : 1,204,552

Just like with the keyword endpoints, this API returns more fields than we’re printing, so it’s worth exploring the full JSON if you want to extract domain authority, top pages, or global reach.
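
And since the target domain is just a URL parameter, scaling to multiple domains is simple. Here’s a sketch that compares several domains in one run (the domain list is a placeholder; scrape_token, bearer_token, and headers come from the script above):

# Compare several domains in one run
domains = ["example.com", "example.org"]

for domain in domains:
    target = (
        "https://app.neilpatel.com/api/domain_overview"
        f"?domain={domain}&locId=2840&language=en&withKeywords=true"
    )
    api = f"https://api.scrape.do/?token={scrape_token}&url={urllib.parse.quote_plus(target)}&extraHeaders=true"
    info = requests.get(api, headers=headers).json()
    print(f"{domain}: traffic={info.get('traffic')}, keywords={info.get('organic')}, backlinks={info.get('backlinks')}")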

Conclusion

Google, Ubersuggest, and SimilarWeb all have heavy anti-bot protection—JavaScript rendering, fingerprinting, WAFs, and aggressive rate limiting.

But with Scrape.do, none of that matters.

Scrape.do handles:

  • Browser rendering and TLS fingerprinting
  • Rotating residential/mobile proxies
  • Session headers and CAPTCHA solving
  • And a whole lot more to ensure you don’t get blocked.

You just send the request and get clean, structured data back.

Start scraping with 1000 FREE monthly credits.


Raif Tekin

R&D Engineer


Hey, folks! After years of developing back-end software, I now provide data collection and interpretation services for eCommerce brands. I’m confident that my experience in this field will give you plenty of useful insight.