Gary Sieling

Node: get all text within a div

There are plenty of examples that show how to extract parts of the text on a page, but most of them don’t handle nested text.

The easiest way to do this is with jsdom, although it is also the heaviest option:

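const jsdom = require('jsdom');

// Local HTML file to read the text from
const file = 'D:/projects/image-annotation/data/talks/pages/talk200.html';

// jsdom.env() is the callback API from older jsdom releases (before v10)
jsdom.env(
  file,
  ["http://code.jquery.com/jquery.js"], // inject jQuery into the page
  (err, window) => {
    if (err) throw err;
    // .text() returns the combined text of the element and all its descendants
    console.log(
      window.$(".transcript-text").text()
    );
  }
);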

This returns all of the text within a section of the page, including text inside nested elements, which is ideal if you’re doing scraping.
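
To see why nested text is picked up, here is a rough, dependency-free sketch of what jQuery's .text() does conceptually: it walks the element and concatenates every text node beneath it, in document order. The tree shape and the names `transcript` and `collectText` are made up for illustration; a real DOM has more node types (comments, attributes and so on) than this simplified model.

```javascript
// Hypothetical, simplified stand-in for a DOM tree: a node is either a
// string (a text node) or an object with a `children` array (an element).
const transcript = {
  tag: 'div',
  children: [
    'Hello, ',
    { tag: 'p', children: ['nested ', { tag: 'em', children: ['text'] }] },
    '!',
  ],
};

// Concatenate every text node under `node`, in document order.
function collectText(node) {
  if (typeof node === 'string') return node; // text node: return as-is
  return node.children.map(collectText).join(''); // element: recurse into children
}

console.log(collectText(transcript)); // prints "Hello, nested text!"
```

The text inside the nested p and em elements comes back along with the text directly inside the div, which is the behaviour the jsdom example relies on.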
