A public YouTube channel is a compact dataset hiding in plain sight. Open any creator's Videos tab and you can see, for every upload, the title, the view count, how long ago it went live, the runtime, and a link to the watch page. Aggregated across a channel, that public metadata powers competitive content analysis, publishing-cadence research, and trend tracking, all without touching a single private account.

This guide shows you how to build a YouTube channel scraper in JavaScript and Node.js. You build a small, runnable scraper that fetches a public channel's Videos page through the Crawling API with JavaScript rendering, parses each video card with cheerio, and exports the results as JSON or CSV. The whole walkthrough stays scoped to public, non-personal data: video listings on a channel anyone can open, never private playlists, comments, or anything behind a sign-in. The legality section near the end is not boilerplate, so read it before you point this at any real volume.

What you will build

A Node.js script that takes a public YouTube channel's Videos URL, retrieves the JavaScript-rendered HTML through the Crawling API, and extracts a structured record for each video on that page. We use a public channel's Videos tab as the running example and pull these fields per video:

  • Title the video title as shown on the card, for example "Official Trailer".
  • Views the public view count text, like "56K views".
  • Upload date the relative published date, such as "6 days ago".
  • Duration the runtime badge on the thumbnail, like "2:14".
  • Link the absolute URL to the individual watch page.

We also read a little channel-level context (the channel title and handle) so each export is self-describing, then write the video list out to JSON and CSV.

Why a plain request fails on YouTube

If you request a channel's Videos URL with a bare HTTP client, you get a status 200 and almost none of the data you came for. YouTube builds the video grid in the browser: the initial HTML is a thin shell, and the real card markup is injected by JavaScript after the page's scripts run and the watch data loads. A raw fetch captures the page before any of that happens, so cheerio finds an empty grid.

On top of rendering, YouTube watches for automated traffic. Datacenter IPs and request patterns that do not look like a real browser get challenged or rate-limited quickly, so even a headless browser on a bare server tends to stall. A working channel scraper therefore needs two things in one request: a browser that actually renders the grid, and an IP the platform reads as a real visitor. You can assemble that yourself with a headless browser plus a pool of rotating residential proxies, but stitching those together and keeping them healthy is most of the work. The Crawling API folds both into a single call: you send it the URL with JavaScript rendering enabled, it renders the page behind a trusted IP, and it returns finished HTML for you to parse.

Why render the page

Crawlbase offers two request types. A normal request fetches static HTML; a JavaScript request renders the page in a real browser first. YouTube builds the video grid client-side, so the JavaScript request is what gives you a populated page here. A normal request returns the empty shell, leaving cheerio nothing to parse. The ajax_wait and page_wait options let you hold for the grid to finish loading.

Prerequisites

You need a few things in place before writing any code. None of them take long.

Basic JavaScript and Node.js. You should be comfortable writing and running a Node script and installing packages with npm. If you are new to Node, the official docs and any beginner course will get you to the level this tutorial assumes. For a fuller walkthrough, see our guide on how to build a web scraper with Node.js.

Node.js 16 or later. Confirm your version with node --version. If you do not have it, install it from the Node.js website or through a version manager like nvm.

A Crawlbase account and token. Sign up, open your dashboard, and copy your JavaScript request token from the account docs page. Treat the token like a password: it authenticates your requests, so keep it out of version control. New accounts include a batch of free requests, and you pay only for successful ones, which is plenty to follow along here.

Set up the project

Create a project folder, initialize it, and install the two libraries the scraper needs.

bash
node --version

mkdir youtube-channel-scraper && cd youtube-channel-scraper
npm init -y

npm install crawlbase cheerio

Two dependencies do the work: crawlbase is the official Node client for the Crawling API, and cheerio parses the returned HTML with a jQuery-style API so you can pull out individual fields by CSS selector. If selectors are new to you, the primer on crawling JavaScript websites pairs well with this guide.

Step 1: Fetch the rendered Videos page

Start by getting the finished page. Import the CrawlingAPI class, initialize it with your token, and request the channel's Videos URL. Because the grid renders client-side, pass ajax_wait and a page_wait so late-loading cards appear before the page is captured. Checking the status code before you parse keeps failures loud instead of silent.

javascript
const { CrawlingAPI } = require('crawlbase');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

async function crawl(pageUrl) {
  const options = { ajax_wait: 'true', page_wait: 7000 };
  const response = await api.get(pageUrl, options);
  if (response.statusCode === 200) {
    return response.body;
  }
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

const channelUrl = 'https://www.youtube.com/@Netflix/videos';
crawl(channelUrl).then((html) => {
  console.log(html ? html.slice(0, 500) : 'No HTML returned');
});

The two wait options matter for a client-rendered target like this. ajax_wait tells the API to wait for asynchronous content to finish loading, and page_wait holds for a fixed number of milliseconds after load so the late-rendering video grid appears before the page is captured. Seven seconds is a reasonable start for YouTube; raise it if the grid comes back empty. Run the script with node scraper.js and you should see real channel markup, not a stripped-down shell. That confirms rendering works before you write a single selector.

Crawlbase Crawling API

That YouTube grid needs a rendered page behind a trusted IP, in one call. The Crawling API takes your token, runs the page in a real browser, waits out the ajax_wait and page_wait window you just set, rotates through residential IPs server-side, and hands you finished HTML, so you skip running a headless browser fleet and a proxy pool yourself. Point it at a public channel on the free tier first.

Step 2: Parse each video with cheerio

With rendered HTML in hand, load it into cheerio and walk the video cards. The rendered Videos tab lays each upload out in a repeating grid renderer, so you select every card, then read the title, views, upload date, duration, and watch link from inside it. Reading each field defensively keeps one missing value from crashing the run, and the relative URL on the title anchor needs to be made absolute.

javascript
const cheerio = require('cheerio');

function parseChannel(html) {
  const $ = cheerio.load(html);
  const channel = {
    title: $('#inner-header-container .ytd-channel-name .ytd-channel-name:first').text().trim(),
    channelHandle: $('.meta-item #channel-handle').text().trim(),
    videos: [],
  };

  $('#contents ytd-rich-item-renderer').each((_, el) => {
    const card = $(el);
    const title = card.find('#video-title').text().trim();
    if (!title) return;

    const meta = card.find('#metadata-line span');
    const href = card.find('a#video-title-link').attr('href')
      || card.find('a#thumbnail').attr('href');

    channel.videos.push({
      title,
      views: $(meta).eq(0).text().trim() || null,
      uploadDate: $(meta).eq(1).text().trim() || null,
      duration: card.find('#overlays #text').text().trim() || null,
      link: href ? new URL(href, 'https://www.youtube.com').href : null,
    });
  });

  return channel;
}

A couple of details keep this resilient. The video grid uses YouTube's ytd-rich-item-renderer cards, and inside each one #video-title holds the title while #metadata-line span holds the two metadata spans: the first is the view count, the second is the relative upload date. The duration badge sits in the thumbnail overlay under #overlays #text. The watch link is read from the anchor's href with attr rather than its text, then passed through the URL constructor so a relative /watch?v=... path becomes an absolute link. Each field falls back to null when the element is missing.

Selectors drift

YouTube's element ids and renderer tags (ytd-rich-item-renderer, #video-title, #metadata-line, and the rest) change without notice, and the markup differs between the Videos tab, Shorts, and the channel home. Treat the selectors above as a starting template, not a contract. When a field comes back as null, re-inspect the live rendered page in your browser's dev tools and update the selector. Periodic selector maintenance is normal for any production scraper, not a sign something is broken.

Step 3: Put it together

Now wire the fetch and the parse into one runnable script. Fetch the rendered HTML, hand it to the parser, and print the structured channel record.

javascript
const { CrawlingAPI } = require('crawlbase');
const cheerio = require('cheerio');

const api = new CrawlingAPI({ token: 'YOUR_CRAWLBASE_TOKEN' });

async function crawl(pageUrl) {
  const options = { ajax_wait: 'true', page_wait: 7000 };
  const response = await api.get(pageUrl, options);
  if (response.statusCode === 200) return response.body;
  console.error(`Request failed: ${response.statusCode}`);
  return null;
}

function parseChannel(html) {
  const $ = cheerio.load(html);
  const channel = {
    title: $('#inner-header-container .ytd-channel-name .ytd-channel-name:first').text().trim(),
    channelHandle: $('.meta-item #channel-handle').text().trim(),
    videos: [],
  };
  $('#contents ytd-rich-item-renderer').each((_, el) => {
    const card = $(el);
    const title = card.find('#video-title').text().trim();
    if (!title) return;
    const meta = card.find('#metadata-line span');
    const href = card.find('a#video-title-link').attr('href')
      || card.find('a#thumbnail').attr('href');
    channel.videos.push({
      title,
      views: $(meta).eq(0).text().trim() || null,
      uploadDate: $(meta).eq(1).text().trim() || null,
      duration: card.find('#overlays #text').text().trim() || null,
      link: href ? new URL(href, 'https://www.youtube.com').href : null,
    });
  });
  return channel;
}

async function main() {
  const channelUrl = 'https://www.youtube.com/@Netflix/videos';
  const html = await crawl(channelUrl);
  if (!html) return;
  const channel = parseChannel(html);
  console.log(`${channel.title} (${channel.channelHandle}): ${channel.videos.length} videos`);
  console.log(JSON.stringify(channel.videos.slice(0, 3), null, 2));
}

main();

What the output looks like

Run the full script with node scraper.js and you get a clean array of records, one per video, ready to write to JSON, CSV, or a database. The view counts and upload dates are the public strings exactly as YouTube renders them.

json
[
  {
    "title": "Ilary Blasi: The one and only | Official Trailer | Netflix",
    "views": "56K views",
    "uploadDate": "6 days ago",
    "duration": "2:14",
    "link": "https://www.youtube.com/watch?v=pNY3NkGX6e8"
  },
  {
    "title": "Verified Stand-Up | Official Trailer | Netflix",
    "views": "50K views",
    "uploadDate": "11 days ago",
    "duration": "1:58",
    "link": "https://www.youtube.com/watch?v=gMIvGpHd2dk"
  }
]

Export to JSON and CSV

Printing to the console is fine for a check, but you usually want the data on disk. Writing JSON is a one-liner; a small CSV writer keeps each video on its own row with the fields escaped so commas in a title do not break the columns.

javascript
const fs = require('fs');

function saveJson(channel, file) {
  fs.writeFileSync(file, JSON.stringify(channel, null, 2));
}

function saveCsv(videos, file) {
  const headers = ['title', 'views', 'uploadDate', 'duration', 'link'];
  const cell = (v) => `"${(v ?? '').replace(/"/g, '""')}"`;
  const rows = videos.map((v) => headers.map((h) => cell(v[h])).join(','));
  fs.writeFileSync(file, [headers.join(','), ...rows].join('\n'));
}

// in main(), after parseChannel():
saveJson(channel, 'channel.json');
saveCsv(channel.videos, 'channel.csv');

The cell helper wraps every value in quotes and doubles any embedded quote, the standard CSV escaping, so a title containing a comma or a quotation mark stays in a single column. JSON keeps the nested channel object; CSV flattens just the video rows, which is the shape most spreadsheets and BI tools expect.

Handle the full channel listing

The rendered Videos page gives you the most recent batch of uploads, not the entire back catalogue, because YouTube lazy-loads more cards as you scroll. One rendered page is a solid demo and is often all you need for recent-activity research. To reach deeper into a channel, you have a couple of options that keep the same fetch-then-parse pattern.

  • Scroll with the render. Pass a longer page_wait and use the Crawling API's scrolling options so the page loads more rows before it is captured, then run the same parseChannel over the larger grid.
  • Walk per-video links. Take each link from the parsed list and fetch that watch page through the same crawl function to enrich a row with public details from the video page itself, then write a small parser for that layout.

Whichever you choose, the structure does not change: render, then parse. Keep your request volume modest so you are reading a channel, not stress-testing it.

Staying unblocked

Even with rendering handled, YouTube watches for scraper-shaped traffic. A few habits keep a run healthy, and they apply to any hard, popular target.

  • Pace your requests. Hammering channels in a tight loop is the fastest way to get throttled. Spread requests out and vary the channels you fetch instead of crawling one path at full speed.
  • Lean on rotation. A pool of residential IPs spreads requests across many real-user addresses so no single one trips a rate limit. The Crawling API handles this for you; if you roll your own stack, this is the part to get right.
  • Read the status codes. A run that starts returning challenges or errors is telling you the current rate or IP tier is no longer enough. Treat that as signal to back off, not noise to ignore.

For the broader playbook, see how to scrape websites without getting blocked. If you would rather route your own traffic through a rotating pool instead of using the managed API, the Smart AI Proxy gives you the same residential IP rotation as a drop-in proxy endpoint. For a different shape of YouTube data, video and search results rather than a channel grid, see our guide on how to scrape YouTube data.

Whether scraping YouTube is allowed depends on YouTube's Terms of Service, your jurisdiction, and what you do with the data. YouTube's terms restrict automated access to the site, so scraping can run against those terms regardless of how careful your tooling is. None of the code here changes that; it just makes the technical part work. Read YouTube's Terms of Service and its robots.txt, respect its rate expectations, and treat both as the boundary for what you collect.

Keep the scope tight. Collect only public, non-personal data: the video titles, view counts, upload dates, durations, and watch links that anyone can see on a public channel without signing in. Aggregate where you can (publishing cadence, view distributions, topic trends) rather than building a profile of an individual creator. Do not touch anything behind a login, private or unlisted videos, comments, or the personal data of viewers and creators. Where any personal data is involved, privacy laws such as the GDPR and the CCPA apply: you need a lawful basis to process it and you must honor deletion and access requests. Do not redistribute copyrighted video content you pull links to; a link to a public watch page is not a license to rehost the media.

For anything beyond ad-hoc public research, YouTube offers an official, sanctioned path: the YouTube Data API. It returns channel and video metadata in a stable, structured form with a clear quota and terms, and it is the right tool when you need volume, reliability, or commercial rights. This guide is deliberately scoped to public channel listing pages because that is the line that keeps the work defensible. It does not cover private videos, comments, viewer data, account-gated content, or any attempt to bypass authentication. If your project needs more than public listings, the YouTube Data API or a data agreement is the correct route, not a cleverer scraper.

Recap

Key takeaways

  • YouTube renders the grid client-side. A plain request returns an empty shell, so you must render the Videos page before you parse it.
  • You need rendering and a trusted IP together. The Crawling API with a JavaScript request does both in one call; ajax_wait and page_wait control how long it waits for the cards.
  • cheerio does the extraction. Select every ytd-rich-item-renderer card, then map title, views, upload date, duration, and the watch link to current selectors, and expect those selectors to drift.
  • Export and scale carefully. Write JSON for structure and CSV for spreadsheets, and reach deeper into a channel with longer renders or per-video fetches plus sensible pacing.
  • Stay on public data. Respect YouTube's ToS and robots.txt, prefer the official YouTube Data API for volume or commercial use, mind GDPR and CCPA when any personal data is involved, and never touch logins, private videos, or comments.

Frequently Asked Questions (FAQs)

Why does a plain request return an empty video grid from YouTube?

Because YouTube builds the video grid in the browser with JavaScript. The initial HTML is a thin shell, and the real card markup is injected only after the page's scripts run and the watch data loads, so a raw HTTP request returns status 200 with the grid empty. To get a populated page you have to render it first, which is what the Crawling API's JavaScript request handles for you.

Do I need a normal or a JavaScript request for YouTube?

Use a JavaScript request. A normal request fetches static HTML, which on a YouTube channel comes back as the empty shell with no cards. The JavaScript request renders the page in a real browser before handing back the HTML, and the ajax_wait and page_wait options give the grid time to finish loading so cheerio has cards to parse.

My selectors return null. What changed?

Almost certainly YouTube's markup. Its renderer tags and element ids (ytd-rich-item-renderer, #video-title, #metadata-line, and the rest) change without notice, and they differ between the Videos tab, Shorts, and the channel home, so selectors that worked last month can break. Re-inspect a live rendered page in your browser's dev tools and update the selectors. Periodic selector maintenance is normal for any production scraper.

How do I get every video, not just the latest batch?

The first render returns the most recent uploads because YouTube lazy-loads more cards on scroll. To reach further back, pass a longer page_wait with the Crawling API's scrolling options so more rows load before capture, then run the same parser over the larger grid. For deep back catalogues, the official YouTube Data API paginates channel uploads cleanly and is the better fit.

Should I use the YouTube Data API or scrape the site?

If you need volume, guaranteed structure, or commercial reuse rights, use the official YouTube Data API. It is built for that and keeps you on the right side of YouTube's terms. Scraping public channel listings with the approach in this guide fits smaller, public-data research where no API access is in place, as long as you respect the ToS, robots.txt, and rate limits.

Can I scrape comments or viewer data from a channel?

No, and this guide does not cover it. Comments are user-written personal content, and viewer data is not public, so both sit outside the public, non-personal scope here. Where personal data is involved, the GDPR and CCPA apply and you need a lawful basis plus a way to honor deletion requests. For sanctioned access to that kind of data, the YouTube Data API with the proper consent and terms is the correct route.

Start Building

Crawl any site at scale, without fighting infrastructure.

Crawlbase handles proxies, fingerprints, and CAPTCHAs so your team ships data pipelines instead of maintaining crawl plumbing. 1,000 requests free, no card required.

Self-serve · No sales call required · Enterprise crawl volumes available