Jsoup removing elements automatically?


I've been using Jsoup for a while and have run into what looks like a bug: Jsoup automatically removes "table" elements, and I cannot find a workaround.

Document doc = Jsoup.connect("http://www.planet-series.tv/dr-house/").get();
System.out.println(doc);

If you navigate to the link in this piece of code, you can see that there are multiple "table" elements (for example, under "Saison 01 (VF)" there are 22 table elements containing "Episode X"), yet they are absent from the Jsoup output.

Expected: [image: table elements]

Result: [image: Imgur]

I tried fetching the document with a simple HttpClient and printing it (the table elements are there), then parsing it with Jsoup and printing it again (the table elements are gone), so I know it's not a JavaScript issue or anything like that, and Jsoup is indeed causing it.

Can anyone tell me what I am missing, please?

Some websites perform optimizations or apply restrictions based on User-Agent data (the header a browser attaches to each request to tell the website what type of browser it is). A website may block content if the user agent is not set.

You can try using a simplified Mozilla user agent to simulate a real browser and fetch the data:

Document doc = Jsoup.connect("http://www.planet-series.tv/dr-house/").userAgent("Mozilla").get();
System.out.println(doc);

If that does not work and you really have hit a bug in Jsoup, you can fetch the data using HttpClient and create the document using:

Document doc = Jsoup.parse(html);

where html is a String containing the page content.
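To tell whether the tables are missing from the server response or lost during parsing, it can help to look at the raw HTML before Jsoup touches it. The following is only a rough sketch under a few assumptions: it uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than whichever HttpClient library was meant above, and the class and method names (RawPageCheck, buildRequest, fetch, countOpeningTags) are made up for illustration. The tag count is a crude regex check, not a real HTML parse.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Jsoup.
final class RawPageCheck {

    // Builds a GET request that carries an explicit User-Agent header.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .GET()
                .build();
    }

    // Downloads the page body as a String (this performs a real network call).
    static String fetch(String url, String userAgent)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response = client.send(
                buildRequest(url, userAgent),
                HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Counts opening tags such as <table> or <table class="x"> in raw markup,
    // ignoring case. Closing tags like </table> are not counted.
    static int countOpeningTags(String html, String tagName) {
        Pattern pattern = Pattern.compile(
                "<" + Pattern.quote(tagName) + "(\\s|>|/)",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Offline demonstration on a small sample string; prints 2.
    public static void main(String[] args) {
        String sample = "<table><tr><td>Episode 1</td></tr></table>"
                + "<TABLE class=\"x\"></TABLE>";
        System.out.println(countOpeningTags(sample, "table"));
    }
}
```

With this in place, something like String html = RawPageCheck.fetch("http://www.planet-series.tv/dr-house/", "Mozilla"); would give the raw page, and countOpeningTags(html, "table") can be compared with what Jsoup.parse(html) ends up containing. If the raw count is already zero, the server is probably withholding the content rather than Jsoup dropping it.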

