Python - lxml: get data from two tags
My code is this:
    import urllib2
    from lxml import etree

    response = urllib2.urlopen("file:///c:/data20140801.html")
    page = response.read()
    tree = etree.HTML(page)
    data = tree.xpath("//p/span/text()")
Some HTML pages have this structure:
<span style="font-size:10.0pt">something</span>
Other HTML pages have this structure:
    <p class="normal">
      <span style="font-size:10.0pt">some</span>
      <span style="font-size:10.0pt">thing</span>
    </p>
Using the same code for both, I want to get "something".
The XPath expression returns a list of values:
>>> from lxml import etree
>>> tree = etree.HTML('''\
... <p class="normal">
...   <span style="font-size:10.0pt">some</span>
...   <span style="font-size:10.0pt">thing</span>
... </p>
... ''')
>>> tree.xpath("//p/span/text()")
['some', 'thing']
Use ''.join() to combine the two strings into one:
>>> ''.join(tree.xpath("//p/span/text()"))
'something'
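If a page holds more than one paragraph, joining the whole result list would run every paragraph together. A minimal sketch of joining per `<p>` instead is below. The helper name `paragraph_texts` and the two sample strings are made up for illustration, and it assumes the single-span case also sits inside a `<p>`, which the question does not show.

```python
from lxml import etree


def paragraph_texts(html):
    """Return one string per <p>, built from the text of its direct <span> children.

    Hypothetical helper, not part of lxml.
    """
    tree = etree.HTML(html)
    # Evaluate the span query relative to each <p> so paragraphs stay separate.
    return ["".join(p.xpath("./span/text()")) for p in tree.xpath("//p")]


# Assumed sample markup: the single-span case wrapped in a <p>.
single = '<p class="normal"><span style="font-size:10.0pt">something</span></p>'

# The split case from the question.
split = '''<p class="normal">
<span style="font-size:10.0pt">some</span>
<span style="font-size:10.0pt">thing</span>
</p>'''

print(paragraph_texts(single))
print(paragraph_texts(split))
```

Both calls should give a one-element list containing 'something', so the same code covers both page layouts as long as the spans are direct children of a `<p>`.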