
Asynchronous API

Process large-scale scraping jobs with the Async API

What is Asynchronous Scrape.do?

Asynchronous Scrape.do lets you scrape websites asynchronously: you send multiple requests at the same time and they are processed in parallel. This is particularly useful for:

  • Scraping websites with large amounts of content
  • Processing batch scraping operations efficiently
  • Handling slow-loading websites without blocking your application
  • Managing large-scale data extraction projects
  • Running structured plugin scraping for Amazon, Google Search, and Google Trends at scale

Instead of waiting for each request to complete, you create a job, receive a job ID immediately, and poll for results when ready.

The Async API runs in a separate background thread pool that is fully independent of your main API concurrency. Its capacity is equivalent to 30% of your plan's concurrency limit, and it is an additional pool, not deducted from your main concurrency.

You can use the Default API and the Async API simultaneously without interference. Even if your main API is at full capacity, the Async API continues processing requests in parallel, keeping your workloads running smoothly.

Plan Type              API Concurrency Limit    (Separate) Async Concurrency Limit
Free                   5                        2
Hobby                  5                        2
Pro                    15                       5
Business               40                       12
Advanced               200                      60
Custom / Enterprise    Custom                   30% of plan limit
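
To see how much of this separate pool is free at any moment, you can query the account status endpoint (shown again in the Complete Example below). Note that the FreeConcurrency field referenced under Best Practices is assumed to appear in this response:

# Check account status before submitting a large batch.
# FreeConcurrency is assumed to be part of this response (see Best Practices).
curl --location 'https://q.scrape.do/api/v1/me' \
  --header 'X-Token: YOUR_TOKEN'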

Base URL

https://q.scrape.do

Authentication

All Async API requests require authentication via the X-Token header:

curl --location 'https://q.scrape.do/api/v1/jobs' \
  --header 'X-Token: YOUR_TOKEN' \
  --header 'Content-Type: application/json'

Endpoints

The workflow below exercises the core endpoints:

  • POST /api/v1/jobs - create a scraping job
  • GET /api/v1/jobs/{jobId} - check a job's status
  • GET /api/v1/jobs/{jobId}/{taskId} - retrieve a single task's result
  • GET /api/v1/me - check your account status

Complete Example

Here's a complete workflow:

# 1. Create a job
curl --location 'https://q.scrape.do/api/v1/jobs' \
  --header 'Content-Type: application/json' \
  --header 'X-Token: YOUR_TOKEN' \
  --data '{
    "Targets": ["https://httpbin.co/anything"],
    "Super": true,
    "GeoCode": "us"
  }'

# Response: {"JobID": "550e8400...", "TaskIDs": ["660e8400..."]}

# 2. Check job status
curl --location 'https://q.scrape.do/api/v1/jobs/550e8400...' \
  --header 'X-Token: YOUR_TOKEN'

# 3. Get task results
curl --location 'https://q.scrape.do/api/v1/jobs/550e8400.../660e8400...' \
  --header 'X-Token: YOUR_TOKEN'

# 4. Check your account status
curl --location 'https://q.scrape.do/api/v1/me' \
  --header 'X-Token: YOUR_TOKEN'
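
The same workflow can be scripted end to end. Here is a minimal bash sketch that creates a job, polls with exponential backoff, and fetches the first task's result. It assumes jq is installed and that the job status response carries a Status field that reaches a "Completed" value; field names not shown in the examples above are assumptions, so adjust them to match your actual responses.

#!/usr/bin/env bash
# Minimal sketch: create a job, poll with exponential backoff, fetch the result.
# Assumes jq is installed; JobID and TaskIDs match the response shown above,
# while the job-level Status field and its "Completed" value are assumptions.
set -euo pipefail

TOKEN="YOUR_TOKEN"
BASE="https://q.scrape.do/api/v1"

# 1. Create the job and capture the IDs from the response
CREATE=$(curl -s --location "$BASE/jobs" \
  --header "Content-Type: application/json" \
  --header "X-Token: $TOKEN" \
  --data '{"Targets": ["https://httpbin.co/anything"], "Super": true, "GeoCode": "us"}')
JOB_ID=$(echo "$CREATE" | jq -r '.JobID')
TASK_ID=$(echo "$CREATE" | jq -r '.TaskIDs[0]')

# 2. Poll job status, doubling the delay each round (2s, 4s, ... capped at 60s)
DELAY=2
while true; do
  STATUS=$(curl -s --location "$BASE/jobs/$JOB_ID" --header "X-Token: $TOKEN" | jq -r '.Status')
  echo "Job $JOB_ID status: $STATUS"
  if [ "$STATUS" = "Completed" ]; then
    break
  fi
  sleep "$DELAY"
  DELAY=$(( DELAY * 2 > 60 ? 60 : DELAY * 2 ))
done

# 3. Fetch the task result
curl -s --location "$BASE/jobs/$JOB_ID/$TASK_ID" --header "X-Token: $TOKEN"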

Error Responses

All endpoints may return error responses in the following format:

{
  "Error": "Error message description",
  "Code": 400
}

Common Status Codes:

  • 400 - Invalid request (bad parameters)
  • 401 - Unauthorized (invalid or missing token)
  • 404 - Resource not found
  • 406 - Not acceptable (e.g., trying to cancel a completed job)
  • 429 - Too many requests (rate limited)
  • 500 - Internal server error
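
A minimal way to branch on these codes from the shell, assuming the error shape above and that jq is installed:

# Capture the body and the HTTP status separately so 4xx/5xx responses
# can be handled explicitly instead of being treated as results.
RESPONSE=$(curl -s -w '\n%{http_code}' --location 'https://q.scrape.do/api/v1/jobs/550e8400...' \
  --header 'X-Token: YOUR_TOKEN')
HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
BODY=$(echo "$RESPONSE" | sed '$d')

if [ "$HTTP_CODE" -ge 400 ]; then
  echo "Request failed ($HTTP_CODE): $(echo "$BODY" | jq -r '.Error')" >&2
fi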

Best Practices

  1. Polling: When checking job status, implement exponential backoff to avoid excessive API calls (the bash sketch under Complete Example shows one way to do this)
  2. Webhooks: For production use, configure WebhookURL to receive results automatically instead of polling (see the sketch after this list)
  3. Error Handling: Always check the Status field in task responses and handle errors appropriately
  4. Concurrency: Monitor your FreeConcurrency to ensure you don't exceed your account limits
  5. Task Expiration: Retrieve task results before the ExpiresAt timestamp; results are stored only temporarily
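
For item 2, the webhook approach replaces polling entirely: include a WebhookURL when creating the job and results are pushed to your endpoint when ready. A sketch follows; the callback URL is a placeholder, and the payload delivered to it is not documented here:

# Sketch: create a job whose results are pushed to your own endpoint.
# WebhookURL is the parameter named in Best Practices; the callback URL is a placeholder.
curl --location 'https://q.scrape.do/api/v1/jobs' \
  --header 'Content-Type: application/json' \
  --header 'X-Token: YOUR_TOKEN' \
  --data '{
    "Targets": ["https://httpbin.co/anything"],
    "WebhookURL": "https://your-server.example/webhooks/scrape"
  }'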
