Trying to scrape a modern web app and getting an empty <div id="root"></div>? Your target is a Single-Page Application (SPA). Don’t rush to spin up Playwright or Selenium.
- For raw HTML rendering at scale: use Scrape.do.
- For extracting clean text, metadata, or screenshots (LLM-ready data): use Geekflare API.
Here is the exact Python setup I use in 2026 to parse dynamic JavaScript websites without melting my server’s RAM.
Hey, it’s Max. If you’ve been doing web scraping for more than a week, you’ve hit this wall. You find a juicy target, fire up your standard Python requests script, pass the response to BeautifulSoup, and… nothing. The data you saw in your browser is completely missing from the terminal output.
Welcome to the world of Single-Page Applications (SPAs) built on React, Vue, or Angular.
The Problem with SPAs
Unlike traditional websites that send a fully assembled HTML document from the server, SPAs send a barebones skeleton. The actual data is loaded dynamically via JavaScript after the browser executes the code.
Since the Python requests library is just an HTTP client, not a browser, it doesn’t execute JavaScript. It just grabs the skeleton and stops.
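One quick way to confirm you are looking at a skeleton rather than real content is to check whether the response has script tags but almost no visible text. The snippet below is a rough heuristic sketch using only the standard library; the function name `looks_like_spa_shell` and the 200-character threshold are my own assumptions, not a standard, so tune them for your targets.

```python
from html.parser import HTMLParser


class _VisibleTextCounter(HTMLParser):
    """Counts visible text characters and <script> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore JS/CSS source; count only text a visitor could read
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def looks_like_spa_shell(html, min_text_chars=200):
    """Heuristic: the page ships scripts but almost no visible text."""
    counter = _VisibleTextCounter()
    counter.feed(html)
    return counter.script_tags > 0 and counter.text_chars < min_text_chars
```

If this returns True for `response.text` from a plain requests call, the data you saw in the browser is most likely being filled in by JavaScript after load.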
The Old Way: Headless Browsers (Don’t do this)
Two years ago, my go-to solution was Selenium or Playwright. You boot up a headless Chrome instance in Python, wait for the JS to render, and extract the page source.
Why I stopped:
- Resource Heavy: A single Playwright instance consumes 200-300MB of RAM. Run 50 concurrent scrapers and the browsers alone need roughly 10-15GB, far more than a cheap VPS has, so the box runs out of memory and the processes get killed.
- Easily Blocked: As I mentioned in my Ultimate Cloudflare Bypass Guide, modern anti-bot systems will detect a vanilla headless browser in milliseconds.
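To put the RAM point in numbers, here is the back-of-the-envelope arithmetic as a tiny sketch. The 200-300MB per-instance figure is the one quoted above; the helper name `browser_ram_gb` is just for illustration, and real usage varies with the pages you load.

```python
def browser_ram_gb(concurrency, mb_per_instance):
    """Total RAM in GB needed for `concurrency` headless browser instances."""
    return concurrency * mb_per_instance / 1024


low = browser_ram_gb(50, 200)
high = browser_ram_gb(50, 300)
print(f"50 browsers: {low:.1f}-{high:.1f} GB")  # prints "50 browsers: 9.8-14.6 GB"
```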
The Modern Way: API Rendering
Instead of maintaining a complex proxy pool and headless browser infrastructure, I now outsource the rendering layer. Here is how I use two different APIs depending on the exact data I need.
Scenario A: I just need the rendered HTML (Using Scrape.do)
When I’m doing bulk scraping (e.g., monitoring prices across 10,000 product pages), I just need the JavaScript to execute so I can parse the resulting HTML with BeautifulSoup.
For this, I use Scrape.do. You simply append &render=true to your API request, and their cluster handles the headless browser execution under the hood.
The Python Script:
import requests
from bs4 import BeautifulSoup
from urllib.parse import quote

API_TOKEN = "YOUR_SCRAPE_DO_TOKEN"
TARGET_URL = "https://spa-target-example.com/products"


def scrape_dynamic_html():
    # render=true tells the API to wait for JS execution
    api_url = f"http://api.scrape.do?token={API_TOKEN}&url={quote(TARGET_URL)}&render=true"
    response = requests.get(api_url, timeout=45)

    if response.status_code == 200:
        # Now BeautifulSoup will see the fully rendered DOM
        soup = BeautifulSoup(response.text, 'html.parser')
        # Example: Find dynamically loaded product titles
        products = soup.find_all('h2', class_='product-title')
        for item in products:
            print("Found:", item.text)
    else:
        print("Failed:", response.status_code)


if __name__ == "__main__":
    scrape_dynamic_html()
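If your target URLs carry their own query strings (pagination, sorting), it can be less error-prone to let the standard library build the request URL instead of formatting it by hand. This is a minimal sketch that reuses the endpoint and the token, url, and render parameters from the script above; the helper name `build_render_url` is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_BASE = "http://api.scrape.do"  # endpoint as used in the script above


def build_render_url(token, target_url, render=True):
    """Build the API request URL, percent-encoding the target URL safely."""
    params = {"token": token, "url": target_url}
    if render:
        params["render"] = "true"
    # urlencode escapes &, =, ? and : inside the target URL, so its own
    # query string cannot leak into the API's parameters
    return f"{API_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_render_url("YOUR_SCRAPE_DO_TOKEN",
                           "https://spa-target-example.com/products?page=2&sort=price")
    # Round-trip check: the API side should see the original target URL intact
    print(parse_qs(urlsplit(url).query)["url"][0])
```

The result can be passed straight to `requests.get(url, timeout=45)` in place of the hand-built string.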
Scenario B: I need AI-Ready Data, Screenshots, or Meta (Using Geekflare)
Sometimes I don’t want to deal with writing complex XPath or CSS selectors at all. If I’m building an AI agent that needs clean text from an SPA, or if I need to capture full-page screenshots of a competitor’s dashboard, I switch to the Geekflare API.
Geekflare is more of a developer’s Swiss Army knife. Their Web Scraping API endpoint returns structured JSON and handles the JS rendering automatically.
The Python Script (Geekflare Meta & Text Extraction):
import requests

GEEKFLARE_API_KEY = "YOUR_GEEKFLARE_KEY"
TARGET_URL = "https://spa-target-example.com/about"


def extract_clean_data():
    headers = {
        "x-api-key": GEEKFLARE_API_KEY,
        "Content-Type": "application/json"
    }
    payload = {
        "url": TARGET_URL,
        "renderJS": True,     # Crucial for SPA
        "extractText": True   # Get clean text for LLMs
    }
    response = requests.post(
        "https://api.geekflare.com/webscraping",
        json=payload,
        headers=headers,
        timeout=60  # JS rendering can be slow; don't hang forever
    )

    if response.status_code == 200:
        data = response.json()
        # Fall back to an empty string so slicing never hits None
        text = data.get('data', {}).get('text') or ""
        print("✅ Success! Clean text extracted:")
        print(text[:500])  # Print first 500 chars
    else:
        print("Error:", response.text)


if __name__ == "__main__":
    extract_clean_data()
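When the text is headed into an LLM pipeline, it helps to pull it out defensively so an unexpected response shape doesn't crash the job. The sketch below assumes the response nests the text under data and then text, which is simply what the script above reads; I haven't verified every shape the API can return, and `preview_text` is a hypothetical helper name.

```python
def preview_text(payload, limit=500):
    """Return up to `limit` chars of payload['data']['text'], or '' if absent."""
    data = payload.get("data") if isinstance(payload, dict) else None
    text = data.get("text") if isinstance(data, dict) else None
    if not isinstance(text, str):
        return ""  # missing key, null value, or unexpected type
    return text[:limit]
```

Call it as `preview_text(response.json())`; an empty string is the signal to log the raw response and look at what actually came back.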
The Verdict
Stop fighting JavaScript rendering on your own server.
If you are building a traditional scraper pipeline and want maximum success rates at scale, integrate Scrape.do with render=true.
If you are building an AI aggregator, need to take automated screenshots, or just want clean, parsed metadata without writing a single line of BeautifulSoup code, grab an API key from Geekflare.
Got a specific SPA that’s resisting your scrapers? Let me know on X (Twitter) and we can debug it.

