Google News API
Scrape Google News articles, topic streams, story clusters, and publisher pages as structured JSON
The Google News API is a specialized plugin that returns Google News results (articles, topic clusters, menu navigation, related topics) as clean JSON. One HTTP call per request. Deeper browsing happens by chaining tokens (topic_token, section_token, story_token, publication_token) returned inside any response.
Credit Usage: Each successful request costs 10 credits. For bulk processing, use the Async API with plugins.
Key Features
- Six Driver Parameters: Search by keyword (`q`), browse a topic (`topic_token`), drill into a section (`section_token`), expand a story cluster (`story_token`), view a publication's page (`publication_token`), or pivot through a Knowledge Graph entity (`kgmid`).
- Direct Publisher URLs: `news_results[].link` is the publisher URL directly. No Google redirect to follow, no AMP wrapper to unwrap.
- Per-Article Metadata: Bylines (`source.authors`), publisher icons, relative date ("3 hours ago"), ISO 8601 timestamps, and thumbnails (full + small).
- Story Clusters: Multi-outlet coverage of the same news story comes back as a cluster object with a `stories[]` array, same schema as flat articles.
- Sort by Relevance or Date: `so=0` / `so=1` controls keyword and entity searches. Other drivers use Google's editorial ordering.
- Localized Menus & Topics: `menu_links[]` (top nav: U.S., World, Business, Technology, …) and `related_topics[]` are returned localized for `hl` / `gl`.
- No Pagination Quirks: A single response returns the first page (~100 articles). Deeper browsing is done by chaining tokens, with no manual offset arithmetic.
- No Blocks or CAPTCHAs: All anti-bot measures are handled automatically by Scrape.do.
Endpoint
GET https://api.scrape.do/plugin/google/news
Request Parameters
Required
| Parameter | Type | Description |
|---|---|---|
| `token` | string | Your Scrape.do API authentication token |
Plus exactly one driver:
| Driver | Type | Fetches |
|---|---|---|
| `q` | string | Keyword search (e.g., `q=openai`) |
| `topic_token` | string | A topic stream (U.S., World, Business, Technology, …) |
| `section_token` | string | A section within a topic (Latest, For You, Opinion, …) |
| `story_token` | string | Full-coverage page for a single news story |
| `publication_token` | string | A publisher's page (CNN, BBC, Reuters, …) |
| `kgmid` | string | A Knowledge Graph entity ID (e.g., `/m/02_286` for New York City) |
Sending no driver returns `400 one of q, topic_token, section_token, story_token, publication_token, kgmid is required`. Sending more than one returns `400 exactly one of ... may be set`.
Tokens are returned inside `news_results[]`, `menu_links[]`, `sub_menu_links[]`, `related_topics[]`, and `related_publications[]` on every response. Chain them to navigate.
Tokens rotate occasionally. When a token stops working (`502 unexpected response`), fetch a fresh one from a recent response and retry.
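The exactly-one-driver rule can also be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_news_url` and the `DRIVERS` tuple are hypothetical names introduced here, and the check only mirrors the two 400 responses described above.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.scrape.do/plugin/google/news"
DRIVERS = ("q", "topic_token", "section_token",
           "story_token", "publication_token", "kgmid")


def build_news_url(token, **params):
    """Build a request URL, enforcing the exactly-one-driver rule locally."""
    # Ignore parameters explicitly passed as None.
    params = {k: v for k, v in params.items() if v is not None}
    used = [name for name in DRIVERS if name in params]
    if len(used) != 1:
        raise ValueError(f"expected exactly one driver, got {used or 'none'}")
    # urlencode percent-encodes values, so a kgmid such as /m/02_286
    # is sent as %2Fm%2F02_286.
    return f"{ENDPOINT}?{urlencode({'token': token, **params})}"


print(build_news_url("TOKEN", q="openai", hl="en", gl="us"))
```

Failing early with a `ValueError` avoids a round trip for a request that would be rejected with a 400 anyway.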
Localization
| Parameter | Type | Default | Description |
|---|---|---|---|
| `hl` | string | `en` | Language code (e.g., `en`, `tr`, `de`, `fr`, `ja`, `pt-br`) |
| `gl` | string | `us` | Country code (e.g., `us`, `gb`, `de`, `tr`, `jp`, `br`) |
| `google_domain` | string | `google.com` | Echoed back; Google News uses one global origin and the locale comes from `hl` / `gl` |
Sort (search mode only)
| Parameter | Values | Description |
|---|---|---|
| `so` | `0` or `1` | `0` = by relevance (default), `1` = by date. Only valid with `q` or `kgmid` |
Example Usage
Keyword Search
curl --location --request GET 'https://api.scrape.do/plugin/google/news?token=<SDO-token>&q=openai'
import requests
import json
token = "<SDO-token>"
url = f"https://api.scrape.do/plugin/google/news?token={token}&q=openai"
response = requests.request("GET", url)
print(json.dumps(response.json(), indent=2))
const axios = require('axios');
const token = "<SDO-token>";
const url = `https://api.scrape.do/plugin/google/news?token=${token}&q=openai`;
axios.get(url)
  .then(response => {
    console.log(JSON.stringify(response.data, null, 2));
  })
  .catch(error => {
    console.error(error);
  });
package main
import (
    "fmt"
    "io"
    "net/http"
)
func main() {
    token := "<SDO-token>"
    url := fmt.Sprintf(
        "https://api.scrape.do/plugin/google/news?token=%s&q=openai",
        token,
    )
    resp, err := http.Get(url)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    // io.ReadAll replaces ioutil.ReadAll, which is deprecated since Go 1.16.
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(body))
}
require 'net/http'
require 'json'
token = "<SDO-token>"
url = URI("https://api.scrape.do/plugin/google/news?token=#{token}&q=openai")
response = Net::HTTP.get(url)
puts JSON.pretty_generate(JSON.parse(response))
import java.net.HttpURLConnection;
import java.net.URL;
import java.io.BufferedReader;
import java.io.InputStreamReader;
public class GoogleNews {
    public static void main(String[] args) throws Exception {
        String token = "<SDO-token>";
        String url = String.format(
            "https://api.scrape.do/plugin/google/news?token=%s&q=openai",
            token
        );
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        BufferedReader reader = new BufferedReader(
            new InputStreamReader(conn.getInputStream())
        );
        String line;
        StringBuilder response = new StringBuilder();
        while ((line = reader.readLine()) != null) {
            response.append(line);
        }
        reader.close();
        System.out.println(response.toString());
    }
}
using System;
using System.Net.Http;
using System.Threading.Tasks;
class Program
{
    static async Task Main()
    {
        string token = "<SDO-token>";
        string url = $"https://api.scrape.do/plugin/google/news?token={token}&q=openai";
        using HttpClient client = new HttpClient();
        string response = await client.GetStringAsync(url);
        Console.WriteLine(response);
    }
}
<?php
$token = "<SDO-token>";
$url = "https://api.scrape.do/plugin/google/news?token={$token}&q=openai";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
echo json_encode(json_decode($response), JSON_PRETTY_PRINT);
?>
curl "https://api.scrape.do/plugin/google/news?q=openai&hl=en&gl=us&token=$TOKEN"
Localized Keyword Search
curl "https://api.scrape.do/plugin/google/news?q=bundesliga&hl=de&gl=de&token=$TOKEN"
Sort by Date
curl "https://api.scrape.do/plugin/google/news?q=openai&so=1&token=$TOKEN"
Topic Stream
curl "https://api.scrape.do/plugin/google/news?topic_token=CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB&token=$TOKEN"
Story Cluster
curl "https://api.scrape.do/plugin/google/news?story_token=CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lLOWRfNUVCRllrZlFnbTlaN3F5Z0FQAQ&token=$TOKEN"
Knowledge Graph Entity
# /m/02_286 = New York City
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"
Response
Top-Level Shape
{
  "search_parameters": { ... },
  "title": "U.S.",
  "news_results": [ ... ],
  "menu_links": [ ... ],
  "sub_menu_links": [ ... ],
  "related_topics": [ ... ],
  "related_publications": [ ... ]
}
`news_results` is always an array (an empty array when there are no results, never null). Other fields are present contextually:
- `title`: populated on topic and publication responses (e.g. "Technology", "CNN").
- `menu_links`: top navigation strip; same on every page (localized by `hl` / `gl`).
- `sub_menu_links`: sections within the current topic / publication.
- `related_topics`: populated on keyword searches when the query resolves to a known entity.
- `related_publications`: populated on publication pages.
search_parameters
{
  "engine": "google_news",
  "q": "openai",
  "google_domain": "google.com",
  "hl": "en",
  "gl": "us"
}
Optional fields (`topic_token`, `section_token`, `story_token`, `publication_token`, `kgmid`, `so`) appear when set.
news_results[]
Each entry is either a flat article or a cluster (a single story covered by multiple outlets).
Flat article
{
  "position": 2,
  "title": "OpenAI Takes Aim at Google with New Image Model",
  "link": "https://www.theinformation.com/newsletters/ai-agenda/openai-takes-aim-google-new-image-model",
  "source": {
    "name": "The Information",
    "title": "The Information",
    "icon": "https://encrypted-tbn3.gstatic.com/faviconV2?...",
    "authors": ["Stephanie Palazzolo"]
  },
  "date": "20 hours ago",
  "iso_date": "2026-04-20T14:00:00Z",
  "thumbnail": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "thumbnail_small": "https://tii.imgix.net/production/articles/16959/46bed976.png",
  "topic_token": "CAAqKAgKIiJDQkFTRXdvTkwyY3ZNVEZuWW1oeGNqaHhlaElDWlc0b0FBUAE",
  "publication_token": "CAAqLggKIihDQklTR0FnTWFoUUtFblJvWldsdVptOXliV0YwYVc5dUxtTnZiU2dBUAE"
}
Cluster
The outer entry holds only the cluster headline; individual articles live in stories[].
{
  "position": 1,
  "title": "Federal Reserve signals rate cut ahead",
  "stories": [
    {
      "position": 1,
      "title": "Federal Reserve signals rate cut ahead",
      "link": "https://www.reuters.com/markets/us/federal-reserve-signals-rate-cut-2026-04-21/",
      "source": { "name": "Reuters", "icon": "..." },
      "date": "2 hours ago",
      "iso_date": "2026-04-21T12:10:50Z",
      "thumbnail": "..."
    },
    {
      "position": 2,
      "title": "Wall Street rallies on Fed signal",
      "link": "https://www.wsj.com/...",
      "source": { "name": "The Wall Street Journal" },
      "date": "3 hours ago",
      "iso_date": "2026-04-21T11:00:00Z"
    }
  ]
}
Field reference
| Field | Type | Description |
|---|---|---|
| `position` | int | 1-based position |
| `title` | string | Article (or cluster) headline |
| `link` | string | Direct publisher URL with no redirect to follow |
| `source.name` | string | Publisher name |
| `source.title` | string | Publisher display title |
| `source.icon` | string | Publisher favicon URL |
| `source.authors` | string[] | Bylines when Google exposes them |
| `date` | string | Relative time ("3 hours ago", "2 days ago") |
| `iso_date` | string | RFC 3339 publication timestamp |
| `thumbnail` | string | Article image URL when available |
| `thumbnail_small` | string | Small-variant image URL |
| `topic_token` | string | Token for the parent topic. Pass back as `topic_token` |
| `story_token` | string | Token for the story cluster. Pass back as `story_token` |
| `publication_token` | string | Token for the publisher. Pass back as `publication_token` |
| `section_token` | string | Token for the section. Pass back as `section_token` |
| `stories` | array | Cluster entries only: related articles covering the same story |
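Because `news_results[]` mixes flat articles with clusters, client code usually needs to normalize it before processing. The sketch below shows one way to do that and to order the result by `iso_date`. The helper names `flatten_results` and `published_at` are hypothetical, the sample data is abbreviated from the examples above, and the code assumes every article carries an `iso_date`, which the API may not guarantee.

```python
from datetime import datetime


def flatten_results(news_results):
    """Return every article, expanding clusters into their stories[]."""
    articles = []
    for entry in news_results:
        # Cluster entries carry a stories[] array; flat articles do not.
        articles.extend(entry.get("stories") or [entry])
    return articles


def published_at(article):
    """Parse iso_date into a timezone-aware datetime."""
    # Before Python 3.11, fromisoformat() rejects a trailing "Z".
    return datetime.fromisoformat(article["iso_date"].replace("Z", "+00:00"))


sample = [
    {"title": "Federal Reserve signals rate cut ahead", "stories": [
        {"title": "Federal Reserve signals rate cut ahead",
         "iso_date": "2026-04-21T12:10:50Z"},
        {"title": "Wall Street rallies on Fed signal",
         "iso_date": "2026-04-21T11:00:00Z"},
    ]},
    {"title": "OpenAI Takes Aim at Google with New Image Model",
     "iso_date": "2026-04-20T14:00:00Z"},
]

newest_first = sorted(flatten_results(sample), key=published_at, reverse=True)
for article in newest_first:
    print(article["iso_date"], article["title"])
```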
menu_links[]
The top navigation strip; same on every page (localized by hl / gl).
[
  { "position": 1, "title": "U.S.", "topic_token": "CAAq..." },
  { "position": 2, "title": "World", "topic_token": "CAAq..." },
  { "position": 3, "title": "Business", "topic_token": "CAAq..." },
  { "position": 4, "title": "Technology", "topic_token": "CAAq..." }
]
sub_menu_links[]
Present on topic and publication pages: sections within the current scope (Latest, For You, Opinion, …), each with a section_token.
related_topics[]
Populated on keyword searches when the query resolves to a known entity (person, place, organization). Each item carries a topic_token you can use to pivot to that entity's topic stream.
related_publications[]
Populated on publication pages: adjacent publishers Google suggests.
Notes
- No redirects. `link` is the publisher URL directly.
- Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days.
- Tokens rotate. When a token stops working (`502 unexpected response`), fetch a fresh one from a recent search response and retry.
- No pagination. A single response is the first page (~100 articles for keyword searches). Deeper browsing happens by chaining tokens.
Navigating with Tokens
Google News doesn't paginate. Instead, it exposes a graph of topics, sections, stories, and publications. You navigate by chaining the opaque *_token values returned in any response.
Token Types
| Token | Where you find it | What it fetches |
|---|---|---|
| `topic_token` | `news_results[].topic_token`, `menu_links[].topic_token`, `related_topics[].topic_token` | A topic stream (U.S., Business, Technology, Sports, …) |
| `section_token` | `sub_menu_links[].section_token` | A section within a topic (Latest, For You, Opinion, …) |
| `story_token` | `news_results[].story_token` (on cluster entries) | Full coverage for a single news story across multiple outlets |
| `publication_token` | `news_results[].publication_token`, `related_publications[].publication_token` | A publisher's page (CNN, BBC, Reuters, …) |
You also have:
- `q`: keyword search (not a token; a string).
- `kgmid`: a Knowledge Graph entity ID (e.g., `/m/02_286` for New York City). Stable across responses.
Tokens are opaque. Pass them back exactly as returned. Don't parse, construct, or cache them across days. Tokens rotate occasionally; when one stops working (502 unexpected response), fetch a fresh one from a recent response.
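To see which pivots a given response offers, the token fields listed in the table can be gathered in one pass. This is an illustrative sketch only: `collect_tokens` is a hypothetical helper, the sample response is invented, and looking inside `stories[]` assumes that clustered articles may carry the same token fields as flat ones.

```python
TOKEN_FIELDS = ("topic_token", "section_token", "story_token", "publication_token")
TOKEN_SOURCES = ("news_results", "menu_links", "sub_menu_links",
                 "related_topics", "related_publications")


def collect_tokens(response):
    """Map each token field to the (title, token) pairs found in a response."""
    found = {field: [] for field in TOKEN_FIELDS}
    for source in TOKEN_SOURCES:
        for item in response.get(source) or []:
            # Cluster entries may also nest articles under stories[].
            for node in [item, *item.get("stories", [])]:
                for field in TOKEN_FIELDS:
                    if node.get(field):
                        found[field].append((node.get("title"), node[field]))
    return found


response = {
    "news_results": [{"title": "A", "publication_token": "PUB1", "topic_token": "T1"}],
    "menu_links": [{"title": "World", "topic_token": "T2"}],
    "sub_menu_links": [{"title": "Latest", "section_token": "S1"}],
}
tokens = collect_tokens(response)
print(tokens["topic_token"])
```

The tokens are treated as opaque strings throughout; the helper only copies them, in line with the guidance above.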
Workflow Patterns
Search → Topic Pivot
A keyword search response includes related_topics[] when the query resolves to an entity. Pivot into the topic to get the editorial feed for that entity instead of search relevance.
# Step 1: search
curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
| jq '.related_topics'
# Output:
# [ { "title": "Apple Inc.", "topic_token": "CAAq..." } ]
# Step 2: pivot into the topic
TOPIC=$(curl -s "https://api.scrape.do/plugin/google/news?q=apple&token=$TOKEN" \
| jq -r '.related_topics[0].topic_token')
curl "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN"
Topic → Section
A topic response includes sub_menu_links[] with sections like Latest, For You, Opinion, etc. Drill into one for that section's feed.
SECTION=$(curl -s "https://api.scrape.do/plugin/google/news?topic_token=$TOPIC&token=$TOKEN" \
| jq -r '.sub_menu_links[] | select(.title=="Latest") | .section_token')
curl "https://api.scrape.do/plugin/google/news?section_token=$SECTION&token=$TOKEN"
Article → Story Cluster
Search and topic results sometimes return cluster entries: entries with a stories[] array and a story_token. Pass story_token as the driver to get the full cluster page (often more outlets than the inline stories[] snapshot).
STORY=$(curl -s "https://api.scrape.do/plugin/google/news?q=fed+rate+cut&token=$TOKEN" \
| jq -r '.news_results[] | select(.story_token) | .story_token' \
| head -n 1)
curl "https://api.scrape.do/plugin/google/news?story_token=$STORY&token=$TOKEN"
Article → Publication
Each article carries a publication_token. Pivot to the publisher's page to see their recent coverage.
PUB=$(curl -s "https://api.scrape.do/plugin/google/news?q=openai&token=$TOKEN" \
| jq -r '.news_results[0].publication_token')
curl "https://api.scrape.do/plugin/google/news?publication_token=$PUB&token=$TOKEN"
Top-Level Menu
Every response includes menu_links[], Google News's top navigation strip (U.S., World, Business, Technology, Sports, Entertainment, Science, Health). Each entry exposes a topic_token. Use it to navigate without first searching.
TECH=$(curl -s "https://api.scrape.do/plugin/google/news?q=anything&token=$TOKEN" \
| jq -r '.menu_links[] | select(.title=="Technology") | .topic_token')
curl "https://api.scrape.do/plugin/google/news?topic_token=$TECH&token=$TOKEN"
Knowledge Graph Pivot
kgmid is a stable identifier for a Knowledge Graph entity (a person, place, organization). Unlike the *_token values, it doesn't rotate, so you can store it.
# /m/02_286 = New York City, stable across responses
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/02_286&so=1&token=$TOKEN"
# /m/0k8z = Apple Inc.
curl "https://api.scrape.do/plugin/google/news?kgmid=/m/0k8z&token=$TOKEN"
`kgmid` accepts the `so` (sort) parameter; use `so=1` for date-sorted entity feeds.
Sort Order Across Drivers
| Driver | so accepted | Default ordering |
|---|---|---|
| `q` | ✅ | Relevance (`so=0`); switch with `so=1` |
| `kgmid` | ✅ | Same as `q` |
| `topic_token` | ❌ | Google's editorial ordering (recency-weighted) |
| `section_token` | ❌ | Same |
| `story_token` | ❌ | Same |
| `publication_token` | ❌ | Same |
Sending so with a non-search driver returns 400 so is only valid with q or kgmid (search mode).
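The same rules can be applied locally before sending a request. A minimal sketch, assuming the request parameters are held in a plain dict; `check_sort` is a hypothetical helper whose error messages reuse the wording of the API's 400 responses.

```python
SEARCH_DRIVERS = {"q", "kgmid"}


def check_sort(params):
    """Raise ValueError for `so` combinations the API would reject with 400."""
    if "so" not in params:
        return
    if str(params["so"]) not in ("0", "1"):
        raise ValueError("so must be 0 (relevance) or 1 (date)")
    # `so` only applies in search mode, i.e. alongside q or kgmid.
    if SEARCH_DRIVERS.isdisjoint(params):
        raise ValueError("so is only valid with q or kgmid (search mode)")


check_sort({"kgmid": "/m/02_286", "so": 1})  # accepted: entity search by date
try:
    check_sort({"topic_token": "CAAq...", "so": 1})
except ValueError as err:
    print(err)
```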
Handling Stale Tokens
Tokens rotate occasionally. Symptoms:
- `502 unexpected response` from the API.
- A previously-working token suddenly returns no results.
Recovery: fetch a fresh token from a recent response and retry. Don't store tokens long-term; treat them as ephemeral cursors. The two stable identifiers are q (keyword strings) and kgmid (entity IDs).
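The recovery path can be sketched in Python with the HTTP layer injected, which keeps the logic testable offline. Everything named here is an assumption for illustration: `fetch(params)` is a hypothetical callable that performs the request and returns the parsed JSON body, and it is expected to raise `StaleToken` when the API answers `502 unexpected response`.

```python
class StaleToken(Exception):
    """Raised by the caller's fetch function on a 502 "unexpected response"."""


def fetch_topic(fetch, keyword, cached_token=None):
    """Fetch a topic stream, re-deriving the token from a search if it is stale."""
    if cached_token:
        try:
            return fetch({"topic_token": cached_token})
        except StaleToken:
            pass  # fall through and refresh from a keyword search
    search = fetch({"q": keyword})
    topics = search.get("related_topics") or []
    if not topics:
        raise LookupError(f"no related topic for {keyword!r}")
    return fetch({"topic_token": topics[0]["topic_token"]})


def fake_fetch(params):
    """Stand-in for the real HTTP call, used only to exercise the flow."""
    if params.get("topic_token") == "OLD":
        raise StaleToken()
    if "q" in params:
        return {"related_topics": [{"title": "Apple Inc.", "topic_token": "NEW"}]}
    return {"title": "Apple Inc.", "news_results": []}


print(fetch_topic(fake_fetch, "apple", cached_token="OLD")["title"])
```

Keeping the keyword (or a `kgmid`) as the stored identifier and treating the token as a disposable cache entry matches the advice above.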
# Sketch of a resilient navigator: re-derive the token from a fresh search
fresh_response=$(curl -s "https://api.scrape.do/plugin/google/news?q=$KEYWORD&token=$TOKEN")
fresh_topic=$(echo "$fresh_response" | jq -r '.related_topics[0].topic_token // empty')
# Stop if the search did not resolve to a topic
if [ -z "$fresh_topic" ]; then
  echo "no related topic for $KEYWORD" >&2
  exit 1
fi
# Use the fresh token even if you had one cached
curl "https://api.scrape.do/plugin/google/news?topic_token=$fresh_topic&token=$TOKEN"
Error Handling
{ "error": "error_code", "message": "Human readable error message" }
Common Error Codes
| Status | Error | Description |
|---|---|---|
| 400 | `token is required` | Missing API token |
| 400 | `one of q, topic_token, section_token, story_token, publication_token, kgmid is required` | No driver parameter set |
| 400 | `exactly one of q, topic_token, section_token, story_token, publication_token, kgmid may be set` | More than one driver parameter set |
| 400 | `invalid google_domain` | Unrecognized Google domain |
| 400 | `so must be 0 (relevance) or 1 (date)` | Invalid `so` value |
| 400 | `so is only valid with q or kgmid (search mode)` | `so` passed with a non-search driver |
| 502 | `request failed` | Transient. Retry |
| 502 | `unexpected response` | Upstream returned an unexpected page (often a stale `*_token`). Fetch a fresh token from a recent response and retry |
| 500 | `decompression failed` / `failed to parse news results` | Transient. Retry |
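The table suggests three client reactions: retry as-is, refresh a token first, or give up. A rough sketch of that decision follows. `next_action` is a hypothetical helper, and because it is not specified here whether the text from the Error column arrives in the `error` or the `message` field, the code checks both.

```python
def next_action(status, body):
    """Classify an error response as 'retry', 'refresh_token', or 'fail'."""
    # Assumption: the error text may appear in either field of the JSON body.
    text = f"{body.get('error', '')} {body.get('message', '')}"
    if status == 502 and "unexpected response" in text:
        return "refresh_token"  # often a stale *_token
    if status in (500, 502):
        return "retry"  # transient
    return "fail"  # 400s need a corrected request, not a retry


print(next_action(502, {"error": "unexpected response"}))
print(next_action(400, {"message": "token is required"}))
```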

