Upwork is one of the largest freelancing marketplaces on the open web, and its public job search pages are a steady signal of what clients are hiring for. Every posting carries a title, a budget or hourly rate, a list of required skills, and a posted time, and that data is what recruiters, market researchers, and freelancers themselves use to track demand: which skills are in short supply, what rates a category commands, and where new project work is appearing.
This guide shows you how to scrape Upwork jobs with JavaScript and Node.js using cheerio. You build a small, runnable scraper that fetches a public Upwork job search page through the Crawling API, parses each posting's title, budget or rate, required skills, posted time, and link, and exports the result as JSON and CSV. The whole walkthrough stays scoped to public job postings. It does not touch freelancer personal profiles, contact details, or anything behind a login, and the legality section near the end is not boilerplate, so read it before you point this at any real volume.
What you will build
A Node.js script that takes a public Upwork job search URL, retrieves the rendered HTML through the Crawling API, and extracts a structured record for every job posting on the results page. We use a search for "SEO expert" as the running example and pull these fields per posting:
- Title: the job posting title, for example "SEO Expert for Technical Site Audit".
- Budget or rate: the fixed-price budget or hourly range shown on the card, like "$1,000" or "$25.00-$50.00".
- Skills: the list of required skill tags attached to the posting.
- Posted time: the relative posted time, for instance "Posted 3 hours ago".
- Link: the URL to the individual job posting page.
Why a plain request fails on Upwork
If you request an Upwork job search URL with a bare HTTP client, you rarely get the job cards back. Two things work against you. First, Upwork renders the search results in the browser with JavaScript, so the initial HTML is a near-empty shell until the page's scripts run. Second, Upwork flags automated traffic aggressively: datacenter IPs and request patterns that do not look like a real browser get challenged, rate-limited, or blocked before they reach the rendered listings.
So a working Upwork scraper needs two things in one request: a browser that actually renders the page, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL, it renders the page behind a trusted IP, and it returns finished HTML for you to parse with cheerio. For more on why client-side rendering breaks naive scrapers, see how to crawl JavaScript websites.
The Crawling API gives you two tokens: a normal token and a JavaScript token. Upwork search pages are rendered client-side, so you need the JavaScript token (real-browser rendering) for the page to come back populated. A normal-token request will usually return an empty shell.
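One way to tell a populated page from an empty shell in code is to count job-card markers in the returned HTML. The sketch below is a rough heuristic, not a feature of the Crawling API. The helper names countJobTiles and looksRendered are hypothetical, and the check assumes rendered cards carry the data-test="JobTile" attribute that the selectors in Step 2 rely on.

```javascript
// Heuristic check: does this HTML contain rendered job cards?
// Assumption: rendered cards carry data-test="JobTile", the same
// marker the cheerio selectors in Step 2 look for.
function countJobTiles(html) {
  const matches = String(html).match(/data-test="JobTile"/g);
  return matches ? matches.length : 0;
}

function looksRendered(html, minTiles = 1) {
  return countJobTiles(html) >= minTiles;
}

// Made-up HTML for illustration, not real Upwork markup
const shell = '<html><body><div id="app"></div></body></html>';
const rendered =
  '<article data-test="JobTile"></article>' +
  '<article data-test="JobTile"></article>';

console.log(looksRendered(shell)); // false
console.log(looksRendered(rendered)); // true
```

If this check fails on a response that came back with status 200, the likely causes are the wrong token type or a wait that is too short.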
Prerequisites
You need a few things in place before writing any code. None of them take long.
Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are new to Node, the guide to building a web scraper with Node.js covers the fundamentals this tutorial assumes.
Node.js 16 or later. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.
A Crawlbase account and token. Sign up, open your dashboard, and copy your JavaScript token from the account docs page. The free tier gives you 1,000 requests with no card. Treat the token like a password: it authenticates your requests, so keep it out of version control.
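A common way to keep the token out of version control is to read it from an environment variable instead of hard-coding it. The sketch below assumes a variable named CRAWLBASE_JS_TOKEN; that name and the readToken helper are illustrative choices, not something the Crawlbase client requires.

```javascript
// Read the token from the environment instead of hard-coding it.
// CRAWLBASE_JS_TOKEN is a hypothetical variable name; pick your own.
function readToken(env = process.env) {
  const token = (env.CRAWLBASE_JS_TOKEN || '').trim();
  if (!token) {
    throw new Error('Set CRAWLBASE_JS_TOKEN before running the scraper');
  }
  return token;
}

// Called with an explicit object so the example runs without real credentials
console.log(readToken({ CRAWLBASE_JS_TOKEN: 'example-token' })); // example-token
```

In the scripts that follow, you could then replace the 'YOUR_CRAWLBASE_TOKEN' placeholder with readToken().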
Set up the project
Create a project folder, initialize it, and install the two libraries the scraper needs.
node --version
mkdir upwork-jobs-scraper && cd upwork-jobs-scraper
npm init -y
npm install crawlbase cheerio
Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. Create a file named upwork-scraper.js in this folder and add the code from the steps below.
Step 1: Fetch the rendered job search page
Start by getting the finished page. Import the CrawlingAPI class, initialize it with your JavaScript token, and request the search URL. Because Upwork renders client-side, pass the API a short wait so its scripts finish before the HTML is captured. Checking the status code before you parse keeps failures loud instead of silent.
const { CrawlingAPI } = require('crawlbase');

// Use your JavaScript token here, not the normal token
const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });
const upworkSearchURL = 'https://www.upwork.com/nx/search/jobs/?q=seo%20expert';

// Wait for the page's scripts to settle before the HTML is captured
const options = { ajax_wait: 'true', page_wait: 3000 };

api
  .get(upworkSearchURL, options)
  .then((response) => {
    if (response.statusCode === 200) {
      console.log(response.body.slice(0, 500));
    } else {
      // Report the failure instead of exiting silently
      console.error(`Request failed: ${response.statusCode}`);
    }
  })
  .catch((error) => console.error('API request error:', error));
Run the script with node upwork-scraper.js and you should see real Upwork job markup at the top of the body, not a stripped-down shell. That confirms rendering works before you write a single selector. The JavaScript token runs the page in a real browser, and the ajax_wait and page_wait options give its scripts time to populate the job cards. If you would rather have the API parse common fields for you instead of writing selectors by hand, pass the autoparse option and you get structured JSON back directly.
// Ask the API to render and parse, returning JSON
const options = { ajax_wait: 'true', page_wait: 3000, autoparse: 'true' };

api
  .get(upworkSearchURL, options)
  .then((response) => {
    if (response.statusCode === 200) {
      console.log(JSON.parse(response.body));
    }
  })
  .catch((error) => console.error('API request error:', error));
That first request just returned a fully rendered Upwork search page without a headless browser or a proxy on your side. The Crawling API runs the page in a real browser, waits for its JavaScript to populate the job cards, rotates through residential IPs server-side, and handles the challenges Upwork throws at scrapers, so you get finished HTML (or autoparsed JSON) from one call. Point it at a public job search on the free tier first.
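To make a failed fetch impossible to mistake for an empty result, you could wrap the status check in a helper that throws. This is a minimal sketch: ensureOk is a hypothetical name, and it reads only the statusCode and body fields that the examples above already use.

```javascript
// Turn a non-200 or empty response into an exception so a failed fetch
// cannot be mistaken for "no jobs found".
// Only response.statusCode and response.body are read here.
function ensureOk(response, url) {
  if (!response || response.statusCode !== 200) {
    const status = response ? response.statusCode : 'no response';
    throw new Error(`Fetch failed for ${url}: status ${status}`);
  }
  if (!response.body || response.body.length === 0) {
    throw new Error(`Fetch for ${url} returned an empty body`);
  }
  return response.body;
}

// Stand-in response objects for illustration; no network call is made
const searchUrl = 'https://www.upwork.com/nx/search/jobs/?q=seo%20expert';
const okResponse = { statusCode: 200, body: '<html>rendered</html>' };
console.log(ensureOk(okResponse, searchUrl));

try {
  ensureOk({ statusCode: 429, body: '' }, searchUrl);
} catch (error) {
  console.error(error.message);
}
```

Inside the promise chain, calling ensureOk(response, upworkSearchURL) before parsing routes every bad response to the existing catch handler.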
Step 2: Parse each job posting with cheerio
With rendered HTML in hand, load it into cheerio and walk the job cards. Upwork lays each posting out in a repeating article container on the search results page, so you select every card, then read the title, budget or rate, skills, posted time, and link from inside it. Reading each field defensively keeps one missing value from crashing the run.
const cheerio = require('cheerio');

function parseJobs(html) {
  const $ = cheerio.load(html);
  const jobs = [];
  const cards = $('article[data-test="JobTile"]');

  cards.each((index, element) => {
    const card = $(element);
    const job = {};

    // Title and link share one anchor
    const titleLink = card.find('[data-test="job-tile-title-link"]');
    job.title = titleLink.text().trim();
    const href = titleLink.attr('href');
    job.link = href ? new URL(href, 'https://www.upwork.com').href : '';

    // Posted time, e.g. "Posted 3 hours ago"
    job.posted = card
      .find('[data-test="job-pubilshed-date"]')
      .text()
      .replace(/\s+/g, ' ')
      .trim();

    // Budget (fixed price) or hourly rate range
    const budget = card
      .find('[data-test="is-fixed-price"] strong')
      .last()
      .text()
      .trim();
    const hourly = card
      .find('[data-test="job-type-label"]')
      .text()
      .trim();
    job.budget = budget || hourly || 'Not specified';

    // Required skill tags
    job.skills = card
      .find('[data-test="token"] span')
      .map((i, el) => $(el).text().trim())
      .get()
      .filter(Boolean);

    // Skip cards with no title, such as placeholders or ads
    if (job.title) jobs.push(job);
  });

  return jobs;
}
A few details keep this faithful to the page. The title and link come from the same job-tile-title-link anchor, so one lookup gives you both, and the relative href is resolved to an absolute URL so it works outside the page. The posted time is read from the element whose data-test attribute is job-pubilshed-date (Upwork's own spelling, kept as is so the selector matches), with whitespace collapsed. Budget handling covers both posting types: a fixed-price job exposes its amount inside the is-fixed-price block, while an hourly job carries its rate label in job-type-label, so the parser takes whichever is present and falls back to "Not specified" when neither is. Skills are collected from the token tags into a clean array.
Upwork's data-test attributes are more stable than generated class names, but they still change without notice. Treat the selectors above as a starting template, not a contract. When a field comes back empty, re-inspect the live page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.
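Selector drift is easier to catch if each run reports how many records actually filled each field. The sketch below is one way to do that; fieldCoverage is a hypothetical helper, and it assumes records shaped like the parseJobs output, including the 'Not specified' budget placeholder.

```javascript
// Report, per field, the share of records that have a non-empty value.
// A field that drops toward 0 suggests its selector has drifted.
function fieldCoverage(jobs, fields = ['title', 'budget', 'skills', 'posted', 'link']) {
  const coverage = {};
  for (const field of fields) {
    const filled = jobs.filter((job) => {
      const value = job[field];
      if (Array.isArray(value)) return value.length > 0;
      // 'Not specified' is the parser's placeholder for a missing budget
      return Boolean(value) && value !== 'Not specified';
    }).length;
    coverage[field] = jobs.length ? filled / jobs.length : 0;
  }
  return coverage;
}

// Made-up records shaped like parseJobs output
const sample = [
  { title: 'A', budget: '$1,000', skills: ['SEO'], posted: '', link: 'https://www.upwork.com/jobs/a' },
  { title: 'B', budget: 'Not specified', skills: [], posted: '', link: 'https://www.upwork.com/jobs/b' },
];

console.log(fieldCoverage(sample));
// { title: 1, budget: 0.5, skills: 0.5, posted: 0, link: 1 }
```

In this sample the posted field is empty on every record, which is the pattern you would expect to see when a single selector stops matching.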
Step 3: Assemble the full script with JSON and CSV export
Now wire the fetch and the parse into one runnable script, then write the records to disk as both JSON and CSV. Fetching with autoparse off returns raw rendered HTML, which is what the cheerio parser expects.
const fs = require('fs');
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

// Fetch one rendered page; return its HTML, or null on a non-200 status
async function crawl(pageUrl) {
  const options = { ajax_wait: 'true', page_wait: 3000 };
  const response = await api.get(pageUrl, options);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

// Build CSV text: every field quoted, embedded quotes doubled
function toCsv(rows) {
  const headers = ['title', 'budget', 'skills', 'posted', 'link'];
  const escape = (value) => `"${String(value).replace(/"/g, '""')}"`;
  const lines = [headers.join(',')];
  for (const row of rows) {
    const flat = { ...row, skills: row.skills.join('; ') };
    lines.push(headers.map((h) => escape(flat[h])).join(','));
  }
  return lines.join('\n');
}

async function main() {
  const url = 'https://www.upwork.com/nx/search/jobs/?q=seo%20expert';
  const html = await crawl(url);
  if (!html) return;

  const jobs = parseJobs(html);
  fs.writeFileSync('upwork-jobs.json', JSON.stringify(jobs, null, 2));
  fs.writeFileSync('upwork-jobs.csv', toCsv(jobs));
  console.log(`Saved ${jobs.length} job postings to JSON and CSV`);
}

main();
Paste the parseJobs function from Step 2 into the same file so main can call it. Run it with node upwork-scraper.js and you get two files: upwork-jobs.json with the full structured records and upwork-jobs.csv ready to open in a spreadsheet. The toCsv helper flattens the skills array into a single semicolon-delimited cell, quotes every field, and doubles any embedded quotes, which matters because job titles often contain commas.
What the output looks like
The JSON file holds one object per posting in search-results order, each with the title, budget or rate, skills, posted time, and link.
[
  {
    "title": "SEO Expert for Technical Site Audit",
    "budget": "$1,000",
    "skills": ["SEO", "Technical SEO", "On-Page SEO"],
    "posted": "Posted 3 hours ago",
    "link": "https://www.upwork.com/jobs/SEO-Expert-Technical-Site-Audit_~012abc"
  },
  {
    "title": "Ongoing SEO Content Strategy",
    "budget": "Hourly: $25.00-$50.00",
    "skills": ["SEO", "Content Writing", "Keyword Research"],
    "posted": "Posted yesterday",
    "link": "https://www.upwork.com/jobs/Ongoing-SEO-Content-Strategy_~034def"
  }
]
The CSV mirrors the same rows with a header line, so it drops straight into Excel, Google Sheets, or any data pipeline that reads delimited files.
title,budget,skills,posted,link
"SEO Expert for Technical Site Audit","$1,000","SEO; Technical SEO; On-Page SEO","Posted 3 hours ago","https://www.upwork.com/jobs/SEO-Expert-Technical-Site-Audit_~012abc"
"Ongoing SEO Content Strategy","Hourly: $25.00-$50.00","SEO; Content Writing; Keyword Research","Posted yesterday","https://www.upwork.com/jobs/Ongoing-SEO-Content-Strategy_~034def"
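The budget values in this output are display strings, so comparing rates across postings means turning them into numbers first. The sketch below is one possible normalizer, not part of the scraper above. parseBudget is a hypothetical helper that assumes US-dollar amounts formatted like the sample output, and it treats any string with two amounts as an hourly range.

```javascript
// Convert a budget string from the scraper into numbers.
// Assumptions: amounts are US dollars formatted like "$1,000" or "$25.00",
// and a string with two amounts (or the word "hourly") is an hourly range.
function parseBudget(text) {
  const source = String(text);
  const amounts = (source.match(/\$[\d,]+(?:\.\d+)?/g) || []).map((m) =>
    Number(m.replace(/[$,]/g, ''))
  );
  if (amounts.length === 0) return { type: 'unknown', min: null, max: null };
  const type = /hourly/i.test(source) || amounts.length > 1 ? 'hourly' : 'fixed';
  return { type, min: Math.min(...amounts), max: Math.max(...amounts) };
}

console.log(parseBudget('$1,000'));
// { type: 'fixed', min: 1000, max: 1000 }
console.log(parseBudget('Hourly: $25.00-$50.00'));
// { type: 'hourly', min: 25, max: 50 }
console.log(parseBudget('Not specified'));
// { type: 'unknown', min: null, max: null }
```

Keeping the original string alongside the parsed numbers makes it easy to spot formats this sketch does not handle.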
Scale across searches and pages
One search is a demo; a real job walks several queries and several result pages. Upwork's search URL takes a q query parameter and a page parameter, so you can build a list of searches, page through each one, parse every page with the same function, and tag every row with its query before you export. Because every search results page shares the same card structure, the parser you already wrote works across all of them without changes.
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function scrapeSearch(query, pages) {
  const all = [];
  for (let page = 1; page <= pages; page++) {
    const url = `https://www.upwork.com/nx/search/jobs/?q=${encodeURIComponent(
      query
    )}&page=${page}`;
    const html = await crawl(url);
    if (!html) continue;

    // Tag every row with the query that produced it
    const rows = parseJobs(html).map((j) => ({ query, ...j }));
    all.push(...rows);

    // Pause between pages to keep the request rate low
    await sleep(2000);
  }
  return all;
}

scrapeSearch('seo expert', 3).then((rows) => {
  console.log(`Collected ${rows.length} postings across pages`);
});
This pattern carries straight into job-market research and recruiting work. For the same approach on other hiring platforms, see how to scrape Indeed job posts and how to scrape Monster jobs with Python, which cover the results-page side of those sites.
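When you run several related queries, the same posting can show up more than once. The sketch below collapses duplicates before export. dedupeByLink is a hypothetical helper, and it assumes that a posting is identified by the origin and path of its link, so any query string on the URL is ignored when comparing.

```javascript
// Collapse rows that point at the same posting, keeping the first one seen
// and recording every query that surfaced it.
// Assumption: origin + path identifies a posting; query strings are ignored.
function dedupeByLink(rows) {
  const byKey = new Map();
  rows.forEach((row, index) => {
    let key;
    try {
      const parsed = new URL(row.link);
      key = parsed.origin + parsed.pathname;
    } catch {
      // Missing or unparseable link: keep the row under a unique key
      key = `unparsed:${index}`;
    }
    const existing = byKey.get(key);
    if (existing) {
      if (!existing.queries.includes(row.query)) existing.queries.push(row.query);
    } else {
      const { query, ...rest } = row;
      byKey.set(key, { ...rest, queries: [query] });
    }
  });
  return [...byKey.values()];
}

// Made-up rows shaped like scrapeSearch output
const rows = [
  { query: 'seo expert', title: 'A', link: 'https://www.upwork.com/jobs/A_~01?source=x' },
  { query: 'technical seo', title: 'A', link: 'https://www.upwork.com/jobs/A_~01' },
  { query: 'seo expert', title: 'B', link: 'https://www.upwork.com/jobs/B_~02' },
];

const unique = dedupeByLink(rows);
console.log(unique.length); // 2
console.log(unique[0].queries); // [ 'seo expert', 'technical seo' ]
```

Note that the deduplicated rows carry a queries array instead of a single query field, so the CSV export would need a matching column if you want to keep it.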
Staying unblocked
Even with rendering handled, Upwork watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard commercial target.
- Pace your requests. Introduce a delay between page fetches rather than hammering pages in a tight loop, as the sleep call above does. Spreading requests out is the single biggest factor in staying under Upwork's rate limits.
- Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a limit or a challenge. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
- Read the status codes. A run that starts returning challenges or non-200 responses is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.
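Backing off can be as simple as doubling the pause after each consecutive failure. The sketch below shows that arithmetic. The helper names, the 2-second base, and the 60-second cap are illustrative assumptions, not limits published by Upwork or Crawlbase.

```javascript
// Exponential backoff: baseMs, 2 x baseMs, 4 x baseMs, ... capped at maxMs.
// The defaults are illustrative values, not documented limits.
function backoffDelay(attempt, baseMs = 2000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// The full list of waits for a given number of retries
function backoffSchedule(maxAttempts, baseMs = 2000, maxMs = 60000) {
  return Array.from({ length: maxAttempts }, (_, attempt) =>
    backoffDelay(attempt, baseMs, maxMs)
  );
}

console.log(backoffSchedule(6));
// [ 2000, 4000, 8000, 16000, 32000, 60000 ]
```

In a paging loop, you could sleep for backoffDelay(failures) after each non-200 response and reset the failure count to zero after the next success.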
For the broader playbook, see how to scrape websites without getting blocked.
Is it legal to scrape Upwork?
Whether scraping Upwork is allowed depends on Upwork's terms of service, your jurisdiction, and what you do with the data. Upwork's terms restrict automated access and scraping, so collecting data can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read Upwork's Terms of Service and its robots.txt, and treat both as the boundary for what you collect. Respect Upwork's stated rate expectations and keep your request volume low enough that you are not straining its servers.
This guide is deliberately scoped to public job postings only: the title, budget or rate, required skills, posted time, and posting link that anyone can see on a public search page without logging in. It does not touch freelancer profiles, names, photos, contact details, work history, earnings, or any other personal data, and it does not cover anything behind a login. That line matters legally. Job postings are public business listings, but freelancer profiles are personal data, and the moment you collect data about identifiable individuals you fall under privacy law. Under the GDPR in the EU and the CCPA in California, scraping and storing personal data needs a lawful basis, and people have rights to access and deletion. The safe and defensible position is to stay on the public postings, never build profiles of individuals, and never harvest contact details for unsolicited outreach.
If your project needs more than public postings, or you want guaranteed structure and commercial rights, look for an official channel rather than a cleverer scraper. Upwork offers an enterprise and API program for sanctioned data access with clear usage terms, which is the right tool when you need volume or a contractual basis. For aggregate, non-personal market research (which skills are in demand, what rates a category commands, how posting volume trends), public postings stripped of any individual identity are usually enough, and that is exactly what this scraper collects.
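One way to enforce that scope in code is an allow-list applied to every record before it is stored, so a field outside the list never reaches disk even if a parser change starts picking it up. This is a minimal sketch: PUBLIC_FIELDS and pickPublicFields are hypothetical names, and the list mirrors the fields this guide collects.

```javascript
// Allow-list of the fields this guide collects. Anything else is dropped
// before a record is stored.
const PUBLIC_FIELDS = ['query', 'title', 'budget', 'skills', 'posted', 'link'];

function pickPublicFields(record, allowed = PUBLIC_FIELDS) {
  const clean = {};
  for (const field of allowed) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      clean[field] = record[field];
    }
  }
  return clean;
}

// clientName is a made-up example of a field that should not be stored
const raw = { title: 'SEO audit', budget: '$1,000', clientName: 'Jane Doe' };
console.log(pickPublicFields(raw));
// { title: 'SEO audit', budget: '$1,000' }
```

Applying it as jobs.map((job) => pickPublicFields(job)) just before export keeps the filter in one place.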
Key takeaways
- Upwork renders search results client-side and blocks hard. A plain request returns an empty shell or a challenge, so you must render the page behind a trusted IP before you parse it.
- The Crawling API does both in one call. Use the JavaScript token with ajax_wait and page_wait so the job cards populate; take raw HTML to parse yourself or pass autoparse: 'true' for JSON.
- cheerio extracts the fields. Select every job tile, then read title, budget or rate, skills, posted time, and link, and expect the selectors to drift over time.
- Export to JSON and CSV. Write structured records to both formats, flattening the skills array and quoting CSV fields so comma-heavy titles stay intact.
- Stay on public postings only. Collect public job data, never freelancer profiles or personal details, respect Upwork's ToS and robots.txt, and mind GDPR and CCPA whenever personal data is involved.
Frequently Asked Questions (FAQs)
What data can I scrape from Upwork legally?
Stick to public job postings: the title, budget or hourly rate, required skills, posted time, and the link to each posting, all of which appear on a public search page without logging in. Avoid freelancer profiles, names, photos, contact details, work history, and earnings, since those are personal data and fall under privacy law. This guide is scoped to public postings only for exactly that reason.
Why does a plain request return incomplete data from Upwork?
Because Upwork renders its search results client-side with JavaScript and challenges automated traffic. A raw HTTP request from a datacenter IP usually returns an empty shell or a block page rather than the job cards. To get a complete page you have to render it in a real browser behind a trusted IP, which is what the Crawling API handles for you when you use the JavaScript token.
Do I need the normal token or the JavaScript token?
Use the JavaScript token. Upwork search pages are rendered in the browser, so a normal-token request returns an unpopulated shell. The JavaScript token runs the page in a real browser, and passing ajax_wait with a short page_wait gives the page's scripts time to fill in the job cards before the HTML is captured.
My selectors return empty values. What changed?
Almost certainly Upwork's markup. Even the data-test attributes used here can change without notice, so selectors that worked last month can break. Re-inspect a live page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.
Can I scrape freelancer profiles or contact details from Upwork?
No, and this guide does not cover it. Freelancer profiles, names, contact details, work history, and earnings are personal data, not public business listings, so scraping them raises GDPR and CCPA obligations and runs against Upwork's terms. Keep your collection to public job postings, never build profiles of individuals, and never harvest contact details for unsolicited outreach. For sanctioned access to more than public postings, use Upwork's official API or enterprise program.
How do I avoid getting blocked while scraping Upwork?
Keep your per-IP request rate low, add delays between page fetches, and route through rotating residential IPs so no single address trips a rate limit or a challenge. The Crawling API manages rotation, a trusted IP pool, and challenge handling for you; if you build your own stack, that is the part to invest in. Watch the status codes and back off when you start seeing non-200 responses.
