parsing - Bash parse HTML

July 15, 2011

I have an HTML file with lots of data; the part I am interested in is:

<tr valign=top>
<td><b>total</b></td>
<td align=right><b>54<b></td>
<td align=right><b>1<b></td>
<td align=right>0 (0/0)</td>
<td align=right><b>0<b></td>
</tr>

What I tried with awk is:

awk -F "</*b>|</td>" '/<[b]>.*[0-9]/ {print $1, $2, $3 }' "index.html"

But what I want to have is:

54 1 0 0

Right now I am getting:

'<td align=right> 54' '<td align=right> 1' '<td align=right> 0'

Any suggestions?

awk -F '[<>]' '/<td / { gsub(/<b>/, ""); sub(/ .*/, "", $3); print $3 }' file

Output (print writes one value per line):

54
1
0
0
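Because print ends each value with a newline, a variant using printf can put the values on a single line, as the question asks. The following is only a minimal sketch: it assumes the input has one <td> cell per line, and the temporary sample file and the sep and result variables are names made up for this illustration.

```shell
# Sample input for the demonstration, assuming one <td> cell per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr valign=top>
<td><b>total</b></td>
<td align=right><b>54<b></td>
<td align=right><b>1<b></td>
<td align=right>0 (0/0)</td>
<td align=right><b>0<b></td>
</tr>
EOF

# Same field logic as the one-liner above, but values are joined with spaces.
result=$(awk -F '[<>]' '
/<td / {
    gsub(/<b>/, "")          # drop the <b> tags; awk re-splits the record
    sub(/ .*/, "", $3)       # keep only the text before the first space
    printf "%s%s", sep, $3   # sep is empty for the first value
    sep = " "
}
END { if (sep != "") print "" }   # finish the line if anything was printed
' "$sample")

echo "$result"
rm -f "$sample"
```

The separator starts out empty and becomes a space after the first value, so the line has no leading or trailing space.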

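Another approach, which starts reading at the total row and stops at the first line that is not a <td ...> cell:

awk -F '[<>]' '
/<td><b>total<\/b><\/td>/ {
    while (getline > 0 && /<td /) {
        gsub(/<b>/, ""); sub(/ .*/, "", $3)
        print $3
    }
    exit
}' file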
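If the same extraction is needed for rows with other labels, the getline loop can be wrapped in a small shell function. This is a sketch under assumptions, not part of the original answer: row_values is a hypothetical helper name, the label is passed in with awk's -v option, and the input is again assumed to have one cell per line.

```shell
# Hypothetical helper: print the values of the cells that follow a labelled row.
# Usage: row_values LABEL FILE
row_values() {
    awk -F '[<>]' -v label="$1" '
    # Find the label cell by plain substring match rather than a regex.
    index($0, "<td><b>" label "</b></td>") {
        # Read the following lines while they are <td ...> cells.
        while ((getline) > 0 && /<td /) {
            gsub(/<b>/, "")      # strip <b> so the value becomes field 3
            sub(/ .*/, "", $3)   # cut anything after the first space
            print $3
        }
        exit                     # stop after the first matching row
    }' "$2"
}

# Sample input for the demonstration, assuming one cell per line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr valign=top>
<td><b>total</b></td>
<td align=right><b>54<b></td>
<td align=right><b>1<b></td>
<td align=right>0 (0/0)</td>
<td align=right><b>0<b></td>
</tr>
EOF

totals=$(row_values total "$sample")
echo "$totals"
rm -f "$sample"
```

Using index() instead of a regex means the label is matched literally, so characters such as dots or slashes in a label need no escaping.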
