Monster is one of the larger job boards on the web, with thousands of active listings spread across industries and regions. Each public posting carries the kind of structured signal that powers labor-market research, recruiter competitive analysis, and custom job-search tools: a job title, the hiring company, a location, a posted date, and a link to the full posting. The catch is that Monster builds its search results in the browser with JavaScript and lazy-loads more cards as you scroll, so a plain HTTP request hands you a near-empty shell instead of the listings you came for.
This guide shows you how to scrape Monster jobs with Python the reliable way. You build a small, runnable scraper that fetches a rendered search page through the Crawling API, parses each job card with BeautifulSoup, and prints clean structured output. We keep the whole walkthrough scoped to public job-listing data, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.
What you will build
A Python script that takes a public Monster search URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every posting on the page. We will use a developer-jobs search as the running example and pull these fields per posting:
- Job title: the role being advertised, for example "Java Developer".
- Company: the employer behind the listing.
- Location: where the job is based, like "New York, NY".
- Posted date: how recently the listing went up, when Monster exposes it.
- Job URL: the link to the full posting, so you can follow up per role.
Why a plain fetch fails on Monster
If you request a Monster search URL with a bare HTTP client, you get a response with status 200 and almost none of the listing data in the body. Two things work against you. First, Monster renders its job cards in the browser with JavaScript, so the initial HTML is a shell that only fills in after the page's scripts run. Second, the results page uses scroll-based pagination: more cards load as a real user scrolls down, so even a rendered snapshot taken too early captures only the first handful of jobs.
So a working Monster scraper needs three things in one request: a browser that actually renders the page, an IP the platform reads as a real visitor, and a way to drive the scroll so the lazy-loaded cards appear. You can assemble that yourself with a headless browser, a scroll script, and a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds all three into a single call: you send it the URL with a JavaScript token, it renders and scrolls the page behind a trusted IP, and it returns finished HTML for you to parse.
Crawlbase offers two token types. The normal token fetches static HTML; the JavaScript (JS) token renders the page in a real browser first. Monster is client-side rendered, so you need the JS token here. Using the normal token returns the same empty shell a plain fetch would, and there is nothing to parse out of it.
Prerequisites
You need a few things in place before writing any code. None of them take long.
Basic Python. You should be comfortable writing and running a Python script and installing packages with pip. If you are new to parsing HTML, our primer on how to use BeautifulSoup in Python covers the selector basics this tutorial leans on.
Python 3.8 or later. Confirm your version with python --version. If you do not have it, install it from python.org or through a distribution like Anaconda.
A Crawlbase account and JS token. Sign up, open your dashboard, and copy your JavaScript (JS) token from the account docs page. Treat the token like a password: it authenticates your requests, so keep it out of version control. The free tier includes enough requests to follow this guide end to end.
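One way to keep the token out of version control is to read it from an environment variable instead of pasting it into the script. The sketch below assumes a variable named CRAWLBASE_JS_TOKEN; that name is chosen for this example and is not something the Crawlbase client requires.

```python
import os


def load_token(env_var="CRAWLBASE_JS_TOKEN"):
    """Return the token stored in the environment, or raise a clear error."""
    # CRAWLBASE_JS_TOKEN is a hypothetical variable name used for illustration.
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before running the scraper")
    return token
```

With this in place you could initialize the client as CrawlingAPI({"token": load_token()}), and the script fails early with a readable message when the variable is missing.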
Set up the project
Create a virtual environment so project dependencies stay isolated, then install the two libraries the scraper needs.
    python --version
    python -m venv monster_env
    source monster_env/bin/activate
    pip install crawlbase beautifulsoup4
On Windows, activate the environment with monster_env\Scripts\activate instead of the source line. Two dependencies do the work: crawlbase is the official client for the Crawling API, and beautifulsoup4 parses the returned HTML so you can pull out individual fields by CSS selector.
Step 1: Fetch the rendered search page
Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JS token, and request the search URL. Checking the status code before you parse keeps failures loud instead of silent. Note the two wait options: ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds so late-rendering cards appear before the page is captured.
    from crawlbase import CrawlingAPI

    # Initialize the client with your JavaScript (JS) token.
    api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})


    def crawl(page_url):
        # ajax_wait waits for async content; page_wait holds 5000 ms before capture.
        options = {"ajax_wait": "true", "page_wait": 5000}
        response = api.get(page_url, options)
        if response["status_code"] == 200:
            return response["body"].decode("utf-8")
        print(f"Request failed: {response['status_code']}")
        return None


    if __name__ == "__main__":
        page_url = "https://www.monster.com/jobs/search?q=Java+Developer&where=New+York"
        html = crawl(page_url)
        print(html[:500] if html else "No HTML returned")
Run the script with python scraper.py and you should see real job-card markup, not the empty shell a plain fetch returns. Five seconds is a reasonable starting page_wait; raise it if the cards come back empty. That confirms rendering works before you write a single selector.
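If you would rather not tune page_wait by hand, a small retry loop can raise it for you. This is a minimal sketch, not part of the Crawling API: fetch stands for any callable you write that takes a URL and an options dict and returns HTML or None, the 8000 and 12000 millisecond steps are arbitrary, and the marker is the svx_jobCard test id this guide inspects in Step 2.

```python
def fetch_with_longer_waits(fetch, page_url, waits=(5000, 8000, 12000),
                            marker='data-testid="svx_jobCard"'):
    """Retry a fetch with increasing page_wait until job-card markup appears.

    `fetch` is a hypothetical callable: fetch(page_url, options) -> HTML or None.
    Returns (html, wait_ms) on success, or (None, None) if every wait fails.
    """
    for wait_ms in waits:
        options = {"ajax_wait": "true", "page_wait": wait_ms}
        html = fetch(page_url, options)
        # Treat the page as rendered only when at least one card marker is present.
        if html and marker in html:
            return html, wait_ms
    return None, None
```

Each retry is a separate request, so keep the list of waits short.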
Monster needs a rendered, scrolled page behind a trusted IP, in one call. The Crawling API takes a JS token, runs the page in a real browser, drives the scroll so lazy-loaded cards appear, and rotates through residential IPs server-side, so you skip running a headless fleet and a proxy pool yourself. Point it at a public search page on the free tier first.
Step 2: Inspect the job-card structure
Before writing selectors, open a Monster search page in your browser and inspect a job card with the dev tools. At the time of writing, each listing sits in an <article> carrying a data-testid="svx_jobCard" attribute, grouped inside a container with the id JobCardGrid. Within each card, the fields you want hang off stable test ids:
- Job title and URL live on an <a data-testid="jobTitle">: the link text is the title, the href is the posting URL.
- Company sits in a <span data-testid="company">.
- Location sits in a <span data-testid="jobDetailLocation">.
- Posted date appears in a <span data-testid="jobDetailDateRecency"> when Monster shows it for that listing.
Targeting data-testid attributes rather than CSS class names is deliberate. Monster ships hashed, build-generated class names that churn on every deploy, while test ids tend to stay put because the site's own test suite depends on them. They are the most durable hooks the page gives you.
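Before you write selectors, it can help to confirm that the test ids you saw in dev tools also appear in the HTML you fetched. The sketch below counts them with a regular expression. It assumes the attributes are written with double quotes, and the list of ids is simply the set named in this guide, which may have changed by the time you read it.

```python
import re
from collections import Counter

# Test ids named in this guide; adjust the tuple if Monster renames them.
EXPECTED_TEST_IDS = (
    "svx_jobCard",
    "jobTitle",
    "company",
    "jobDetailLocation",
    "jobDetailDateRecency",
)


def count_test_ids(html, expected=EXPECTED_TEST_IDS):
    """Count how often each expected data-testid value appears in the HTML."""
    # Only matches double-quoted attribute values, e.g. data-testid="company".
    found = Counter(re.findall(r'data-testid="([^"]+)"', html))
    return {name: found.get(name, 0) for name in expected}
```

A count of zero for svx_jobCard usually means the page was not rendered or the markup has moved on. A zero for a single field points at one renamed hook.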
Step 3: Parse the job cards with BeautifulSoup
With rendered HTML in hand, load it into BeautifulSoup, select every job card, and pull each field off the card by its test id. Wrap each field read in a small helper that returns an empty string when an element is missing, so one absent field never crashes the run.
    from bs4 import BeautifulSoup


    def text_of(card, selector):
        # Return the element's text, or "" when the element is missing.
        el = card.select_one(selector)
        return el.get_text(strip=True) if el else ""


    def parse_jobs(html):
        soup = BeautifulSoup(html, "html.parser")
        cards = soup.select('div#JobCardGrid article[data-testid="svx_jobCard"]')
        jobs = []
        for card in cards:
            # The title link carries both the title text and the posting URL.
            title_link = card.select_one('a[data-testid="jobTitle"]')
            jobs.append({
                "title": title_link.get_text(strip=True) if title_link else "",
                "company": text_of(card, 'span[data-testid="company"]'),
                "location": text_of(card, 'span[data-testid="jobDetailLocation"]'),
                "posted": text_of(card, 'span[data-testid="jobDetailDateRecency"]'),
                # .get avoids a KeyError if the link has no href attribute.
                "url": title_link.get("href", "") if title_link else "",
            })
        return jobs
The title link is read once and reused, since it carries both the title text and the posting URL on its href. Everything else flows through the text_of helper, which queries a single element and returns an empty string when it is missing instead of throwing on a .get_text() call against nothing. That keeps extraction resilient when a card omits a field, which is common since not every listing exposes a posted date.
Monster's markup changes without notice, and the data-testid values above can be renamed in a future redesign. Treat them as a starting template, not a contract. When a field comes back empty across every card, re-inspect a live search page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
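The "empty across every card" check is easy to automate. The helper below is a sketch that assumes records shaped like the ones this guide's parser returns (dicts with title, company, location, posted, and url keys). It reports the fields that are blank in every record, which suggests a selector has drifted.

```python
def empty_fields(jobs, fields=("title", "company", "location", "posted", "url")):
    """Return the fields that are empty in every record.

    An empty list of records flags every field, since nothing was extracted.
    """
    if not jobs:
        return list(fields)
    return [name for name in fields if all(not job.get(name) for job in jobs)]
```

Read the result as a hint, not proof. A field like posted can be absent on many listings, so check the live page before you change a selector.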
Step 4: Capture the full result page with scrolling
The fetch in Step 1 renders the page, but Monster only loads the first batch of cards until the user scrolls. To pull the whole result set in one request, hand the scroll work to the Crawling API instead of running a headless browser yourself. Two options control it: scroll turns on scroll-based pagination, and scroll_interval sets how long the API keeps scrolling, in seconds, up to a maximum of 60. When scrolling is enabled there is no need to also set page_wait, since the scroll window already gives content time to load.
    def crawl_with_scroll(page_url):
        # scroll enables scroll-based pagination; scroll_interval is in seconds (max 60).
        options = {
            "ajax_wait": "true",
            "scroll": "true",
            "scroll_interval": "60",
        }
        response = api.get(page_url, options)
        if response["status_code"] == 200:
            return response["body"].decode("utf-8")
        print(f"Request failed: {response['status_code']}")
        return None
Swap crawl_with_scroll in for the plain crawl when you want the full page rather than just the first cards. The parser from Step 3 does not change: it still selects every svx_jobCard in the returned HTML, only now there are more of them.
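If you switch between the plain and the scrolled fetch often, you can build the options dict in one place. The helper below is a hypothetical convenience, not something the Crawlbase client provides. It caps scroll_interval at the 60-second maximum described above and leaves page_wait out whenever scrolling is on.

```python
MAX_SCROLL_SECONDS = 60  # ceiling stated in this guide


def build_options(scroll_seconds=None, page_wait_ms=5000):
    """Build request options for either a scrolled or a fixed-wait fetch."""
    options = {"ajax_wait": "true"}
    if scroll_seconds:
        # Clamp to the 1..60 second range before sending it to the API.
        seconds = max(1, min(int(scroll_seconds), MAX_SCROLL_SECONDS))
        options["scroll"] = "true"
        options["scroll_interval"] = str(seconds)
    else:
        # No scrolling requested, so fall back to a fixed wait in milliseconds.
        options["page_wait"] = page_wait_ms
    return options
```

For example, build_options(90) produces the same options as the scrolled fetch above, and build_options() matches the Step 1 fetch.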
Step 5: Put it together and save to JSON
Now wire the scrolled fetch, the parser, and a small JSON writer into one runnable script. Fetch the rendered and scrolled HTML, hand it to the parser, and write the structured records to disk.
    import json

    from bs4 import BeautifulSoup
    from crawlbase import CrawlingAPI

    api = CrawlingAPI({"token": "YOUR_CRAWLBASE_JS_TOKEN"})


    def crawl_with_scroll(page_url):
        # Render the page and scroll for up to 60 seconds so lazy-loaded cards appear.
        options = {"ajax_wait": "true", "scroll": "true", "scroll_interval": "60"}
        response = api.get(page_url, options)
        if response["status_code"] == 200:
            return response["body"].decode("utf-8")
        print(f"Request failed: {response['status_code']}")
        return None


    def text_of(card, selector):
        # Return the element's text, or "" when the element is missing.
        el = card.select_one(selector)
        return el.get_text(strip=True) if el else ""


    def parse_jobs(html):
        soup = BeautifulSoup(html, "html.parser")
        cards = soup.select('div#JobCardGrid article[data-testid="svx_jobCard"]')
        jobs = []
        for card in cards:
            title_link = card.select_one('a[data-testid="jobTitle"]')
            jobs.append({
                "title": title_link.get_text(strip=True) if title_link else "",
                "company": text_of(card, 'span[data-testid="company"]'),
                "location": text_of(card, 'span[data-testid="jobDetailLocation"]'),
                "posted": text_of(card, 'span[data-testid="jobDetailDateRecency"]'),
                # .get avoids a KeyError if the link has no href attribute.
                "url": title_link.get("href", "") if title_link else "",
            })
        return jobs


    def main():
        page_url = "https://www.monster.com/jobs/search?q=Java+Developer&where=New+York"
        html = crawl_with_scroll(page_url)
        if not html:
            return
        jobs = parse_jobs(html)
        # Write UTF-8 explicitly so non-ASCII company names survive on every platform.
        with open("monster_jobs.json", "w", encoding="utf-8") as f:
            json.dump(jobs, f, indent=2)
        print(f"Saved {len(jobs)} jobs to monster_jobs.json")


    if __name__ == "__main__":
        main()
What the output looks like
Run the full script with python scraper.py and it writes monster_jobs.json: a list of clean structured records, one per posting, that is easy to load into a CSV file or a database.
    [
      {
        "title": "Java Developer (Core Java)",
        "company": "Georgia IT Inc.",
        "location": "New York, NY",
        "posted": "2 days ago",
        "url": "https://www.monster.com/job-openings/java-developer-core-java-new-york-ny"
      },
      {
        "title": "Java Backend Developer",
        "company": "Diverse Lynx",
        "location": "Manhattan, NY",
        "posted": "5 days ago",
        "url": "https://www.monster.com/job-openings/java-backend-developer-manhattan-ny"
      }
    ]
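If you want the same records in a spreadsheet-friendly file, the standard library's csv module is enough. The sketch below assumes records with the five keys shown above. The function name and the default monster_jobs.csv path are choices made for this example.

```python
import csv

FIELDS = ["title", "company", "location", "posted", "url"]


def save_csv(jobs, path="monster_jobs.csv"):
    """Write job records to a CSV file with a header row; return the row count."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops any keys that are not listed in FIELDS.
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(jobs)
    return len(jobs)
```

Values that contain commas, such as "New York, NY", are quoted automatically, so they still land in a single column.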
Looping result pages
One search page is a demo; a real job runs across many pages and queries. Monster paginates with a page query parameter, so you can walk pages by incrementing it and reusing the same fetch-and-parse pair. Because every result page shares the same card structure, the parser you already wrote works across all of them without changes. Pace the loop so you are not firing requests back to back.
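    import time
    from urllib.parse import urlencode


    def scrape_pages(query, where, pages=3):
        all_jobs = []
        for page in range(1, pages + 1):
            # urlencode escapes spaces and symbols, e.g. "New York" becomes "New+York".
            params = urlencode({"q": query, "where": where, "page": page})
            url = f"https://www.monster.com/jobs/search?{params}"
            html = crawl_with_scroll(url)
            jobs = parse_jobs(html) if html else []
            if not jobs:
                # Stop early once a page yields no cards.
                break
            all_jobs.extend(jobs)
            time.sleep(2)  # pause between pages to avoid hammering the search
        return all_jobs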
The time.sleep(2) between pages is deliberate. Hammering the search in a tight loop is the fastest way to get throttled, even with rendering and rotation handled for you. Spread requests out, and stop early once a page returns no new cards.
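Result pages can overlap, so "no new cards" needs a notion of which postings you have already seen. The sketch below treats the posting URL as the identity of a job and falls back to title, company, and location when the URL is missing. That keying rule is an assumption of this example, not something Monster guarantees.

```python
def collect_new_jobs(pages):
    """Merge pages of job records, skipping repeats.

    `pages` is any iterable that yields one list of job dicts per result page.
    Iteration stops at the first page that contributes nothing new.
    """
    seen = set()
    all_jobs = []
    for jobs in pages:
        fresh = []
        for job in jobs:
            # Prefer the URL as the identity; fall back to a tuple of visible fields.
            key = job.get("url") or (
                job.get("title"), job.get("company"), job.get("location")
            )
            if key not in seen:
                seen.add(key)
                fresh.append(job)
        if not fresh:
            break
        all_jobs.extend(fresh)
    return all_jobs
```

Pass it a generator that fetches and parses one page per iteration, so stopping early also saves the remaining requests.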
Staying unblocked
Even with rendering and scrolling handled, Monster watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.
- Pace your requests. Spread requests out and vary your queries instead of crawling one search path at full speed.
- Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
- Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
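Backing off can be as simple as waiting longer after each failed attempt. The following is a minimal sketch of exponential backoff with jitter. Here fetch is a hypothetical callable that returns a (status_code, body) pair, and the attempt count and delays are illustrative defaults, not values recommended by Monster or Crawlbase.

```python
import random
import time


def fetch_with_backoff(fetch, page_url, max_attempts=4, base_delay=2.0,
                       sleep=time.sleep):
    """Call fetch(page_url) until it returns status 200, waiting longer each time.

    Returns the body on success, or None once max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        status, body = fetch(page_url)
        if status == 200:
            return body
        if attempt < max_attempts - 1:
            # Delay doubles per attempt (2s, 4s, 8s by default) plus up to 1s of jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None
```

The sleep parameter exists so you can swap in a stub while testing instead of actually waiting.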
For the broader playbook, see how to scrape websites without getting blocked and the deeper dive on how to bypass captchas while web scraping. If your project also touches professional networks, our guide on how to scrape LinkedIn covers a comparable login-aware target. And if you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy (also called the AI Proxy) gives you the same residential IP rotation as a drop-in proxy endpoint.
Is it legal to scrape Monster?
Whether scraping Monster is allowed depends on Monster's terms of service, your jurisdiction, and what you do with the data. Monster's terms restrict automated access, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read the Monster Terms of Service and its robots.txt, and treat both as the boundary for what you collect.
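You can check a URL against robots.txt rules with Python's standard urllib.robotparser module. The sketch below works on robots.txt text you have already downloaded. The sample rules are made up for illustration and are not Monster's actual file, so fetch and read the real one before you rely on the result.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only; these are NOT Monster's actual robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""
```

A passing check only tells you what robots.txt says. It does not override the terms of service.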
A few lines are worth holding to. Collect only public job-listing data: the job title, company, location, posted date, and posting URL that anyone can see on a public search page without signing in. Respect Monster's stated rate expectations and keep your request volume low enough that you are not straining its servers.
This guide is deliberately limited to public job listings because that is the line that keeps the work defensible. It does not cover applicant or recruiter personal data, resumes or candidate profiles, anything behind a login or paid tier, or any attempt to bypass authentication. Job seekers' and recruiters' personal information is exactly the kind of data to leave alone. If your project needs more than public postings, an official data agreement or Monster's own employer tooling is the correct path, not a cleverer scraper.
Key takeaways
- Monster is client-side rendered. A plain fetch returns an empty shell, so you must render the page before you parse it.
- Rendering, a trusted IP, and scrolling go together. The Crawling API with a JS token does all three in one call; scroll and scroll_interval drive the lazy-loaded cards into view.
- BeautifulSoup does the extraction. Map title, company, location, posted date, and URL to the card's data-testid hooks, and expect those hooks to drift.
- Scale by looping pages. Walk Monster's page parameter with the same parser, and pace the loop so you are not throttled.
- Stay on public listings. Respect Monster's ToS and robots.txt, and never touch applicant or recruiter personal data, resumes, or login-walled pages.
Frequently Asked Questions (FAQs)
Can I scrape Monster jobs with just requests and BeautifulSoup?
Not reliably. Monster renders its job cards in the browser with JavaScript and lazy-loads more as you scroll, so a raw requests call returns status 200 with the listings blank. You need something that renders the page and drives the scroll first, which is what the Crawling API's JS token plus the scroll options handle before BeautifulSoup ever sees the HTML.
Do I need the normal token or the JS token for Monster?
The JS token. The normal token fetches static HTML, which on Monster is the same empty shell a plain fetch returns. The JS token renders the page in a real browser before handing back the HTML, so the job cards are present when BeautifulSoup parses them.
How do I handle Monster's scroll pagination?
Pass scroll: "true" and scroll_interval: "60" in the request options. The Crawling API then scrolls the page for up to 60 seconds, the maximum, so the lazy-loaded cards render before the HTML is captured. With scrolling enabled you do not also need page_wait. To go beyond one result page, increment Monster's page query parameter and reuse the same fetch-and-parse pair.
My selectors return empty strings. What changed?
Almost certainly Monster's markup. The data-testid hooks the parser relies on can be renamed in a redesign, and the build-generated class names churn constantly. Re-inspect a live search page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.
Can I scrape resumes or recruiter contact details from Monster?
No, and this guide does not cover it. Resumes, candidate profiles, and recruiter or applicant personal data sit behind a login or are personal information, not public job-listing data. Scraping login-walled content or bypassing authentication to reach it is out of scope here and runs against Monster's terms. Keep your scope to the public postings on search and listing pages.
How do I avoid getting blocked while scraping Monster?
Keep your per-IP request rate low, vary your queries instead of looping one search path, and route through rotating residential IPs so no single address trips a rate limit. The Crawling API manages rotation and a trusted IP pool for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing challenges.
