Bash: parse HTML
I have an HTML file with lots of data; the part I am interested in is:
<tr valign=top>
<td><b>total</b></td>
<td align=right><b>54<b></td>
<td align=right><b>1<b></td>
<td align=right>0 (0/0)</td>
<td align=right><b>0<b></td>
</tr>
I tried using awk:
awk -F "</*b>|</td>" '/<[b]>.*[0-9]/ {print $1, $2, $3 }' "index.html"
but I want to get:
54 1 0 0
Right now I am getting:
'<td align=right> 54' '<td align=right> 1' '<td align=right> 0'
Any suggestions?
awk -F '[<>]' '/<td / { gsub(/<b>/, ""); sub(/ .*/, "", $3); print $3 } ' file
Output:
54 1 0 0
Another approach, which starts at the "total" row and stops at the first line that is not a <td ...> cell:
awk -F '[<>]' '
/<td><b>total<\/b><\/td>/ {
    while (getline > 0 && /<td /) {
        gsub(/<b>/, "")
        sub(/ .*/, "", $3)
        print $3
    }
    exit
}' file
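If the values should end up on a single line, as asked in the question, one possible sketch is to collect them and print once in an END block. It assumes each <td ...> cell sits on its own line; the temporary sample file and the variable names (sample, result, out, sep) are made up for the demonstration.

```shell
# Hypothetical sample file; assumes each <td> cell sits on its own line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr valign=top>
<td><b>total</b></td>
<td align=right><b>54<b></td>
<td align=right><b>1<b></td>
<td align=right>0 (0/0)</td>
<td align=right><b>0<b></td>
</tr>
EOF

# Collect the third field of every "<td " line and join the values with spaces.
result=$(awk -F '[<>]' '
/<td / {
    gsub(/<b>/, "")          # drop the <b> tags; awk re-splits the fields
    sub(/ .*/, "", $3)       # keep only the leading number: "0 (0/0)" -> "0"
    out = out sep $3         # append the value to the accumulated line
    sep = " "                # separator is empty before the first value
}
END { print out }
' "$sample")

echo "$result"
rm -f "$sample"
```

The "total" cell is skipped because "<td>" has no space after "td", so only the right-aligned cells match /<td /.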