

ELEMENTS_BY_SELECTOR_QUERY(<string [containing HTML elements]>;<selector query>)


Returns all elements that match the selector query as a list.

For more information on selector queries, see the jsoup documentation on defining queries.


Download the example file: HTML_File_Example.html

Given the following excerpt from the HTML file:

<table border="1" rules="groups">
			<th>Association 1</th>
			<th>Association 2</th>
			<th>Association 3</th>
			<td><i>affected:<br>4 Million People</i></td>
			<td><i>affected:<br>2 Million People</i></td>
			<td><i>affected:<br>1 Million People</i></td>
			<td>New York</td>
			<td>San Francisco</td>

The goal is to extract only the table data <td> elements that are located in the table body <tbody>. According to the jsoup documentation on defining queries, a suitable selector is:

ancestor child: child elements that descend from ancestor

In this case, the ancestor is the table body and the child is the table data, so the query is:

tbody td

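The result is a list of the table data <td> elements that are located inside the table body <tbody> tag. The list is taken from the full example file, so it contains entries that are not shown in the excerpt above: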

[<td>New York</td>, <td>San Francisco</td>, <td>Atlanta</td>, <td>Bread</td>, <td>Biscuits</td>, <td>Rolls</td>, <td>Sandwich</td>, <td>Soup</td>, <td>Salad</td>]
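To make the "ancestor child" rule more concrete, here is a minimal sketch of the matching logic in Python, using only the standard library. It is not how ELEMENTS_BY_SELECTOR_QUERY or jsoup is implemented. The names `DescendantSelector`, `select_descendants`, and `sample` are hypothetical, and `sample` is a small made-up table in which header and footer cells sit outside <tbody>, not the contents of HTML_File_Example.html. The sketch handles only two plain tag names and returns the text of each match without inner markup.

```python
from html.parser import HTMLParser


class DescendantSelector(HTMLParser):
    """Collect <child> elements that have an open <ancestor> above them."""

    def __init__(self, ancestor, child):
        super().__init__()
        self.ancestor = ancestor
        self.child = child
        self.open_tags = []   # stack of tag names that are currently open
        self.current = None   # text pieces of the matching child being read
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # A child matches only if the ancestor is open somewhere above it.
        if tag == self.child and self.ancestor in self.open_tags:
            self.current = []
        self.open_tags.append(tag)

    def handle_data(self, data):
        if self.current is not None:
            self.current.append(data)

    def handle_endtag(self, tag):
        if tag == self.child and self.current is not None:
            text = "".join(self.current).strip()
            self.matches.append(f"<{tag}>{text}</{tag}>")
            self.current = None
        # Pop back to the matching open tag; this also discards
        # unclosed tags such as <br> that were pushed in between.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def select_descendants(html, ancestor, child):
    """Return the child elements that descend from ancestor, as strings."""
    parser = DescendantSelector(ancestor, child)
    parser.feed(html)
    return parser.matches


# Hypothetical sample: only the last row lives inside <tbody>.
sample = """
<table>
  <thead><tr><th>Association 1</th><th>Association 2</th></tr></thead>
  <tfoot><tr><td><i>affected:<br>4 Million People</i></td></tr></tfoot>
  <tbody><tr><td>New York</td><td>San Francisco</td></tr></tbody>
</table>
"""

print(select_descendants(sample, "tbody", "td"))
# prints ['<td>New York</td>', '<td>San Francisco</td>']
```

The <td> in the footer is skipped because no <tbody> is open when it starts, which is the same reason the query tbody td leaves out the "affected" cells in the example above.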
