C# - Extracting data from a large file with regex


I have a file of close to 800 MB that consists of several records, each one a header followed by its content. A header looks like m=013;x=rast;645.jpg, while the content is the binary data of a JPG file.

So the file looks like this:

m=013;x=rast;645.jpgnulœdüŠˆ.....m=217;x=rast;113.jpgnulÿñÿÿ&åbÿås....m=217;x=rast;1108.jpgnul]_ÿ×ÉcË/... 

The header can occur on one line or across two lines.

I need to parse the file and pop out the individual JPG images.

Since it is a big file, can you suggest an efficient way to do this? I was hoping to use StreamReader, but I do not have enough experience with regular expressions to use it.

Regex:
/(m=.+?;x=.+?;.+?\.jpg)(.+?(?=(?1)|$))/gs
This version uses the (?1) recursion construct, which is not supported in .NET.

.NET regex workaround:
/(m=.+?;x=.+?;.+?\.jpg)(.+?(?=m=.+?;x=.+?;.+?\.jpg|$))/gs
This replaces the (?1) recursion with the contents of the 1st capture group.
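The /.../gs notation is not C# syntax, so here is a minimal sketch of how the workaround might be written in .NET. It assumes the "s" flag corresponds to RegexOptions.Singleline and that "g" is covered by Regex.Matches, which returns every match. The class name HeaderRegex and the method FindHeaders are hypothetical names chosen for illustration. Writing the header sub-pattern once and concatenating it twice gives the same effect as inlining group 1 by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HeaderRegex
{
    // Header sub-pattern, written once and reused where PCRE would use (?1).
    public const string Header = @"m=.+?;x=.+?;.+?\.jpg";

    // Same expression as the workaround, without the /.../gs delimiters.
    public static readonly string Pattern =
        "(" + Header + ")(.+?(?=" + Header + "|$))";

    // Singleline lets '.' match newlines (the "s" flag);
    // Regex.Matches returns all matches (the "g" flag).
    public static List<string> FindHeaders(string text)
    {
        var headers = new List<string>();
        foreach (Match m in Regex.Matches(text, Pattern, RegexOptions.Singleline))
        {
            headers.Add(m.Groups[1].Value);
        }
        return headers;
    }
}
```

Because Singleline makes '.' match a line break, a header that is split across two lines is still matched as one header.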

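Live demo and explanation of the regexp: http://regex101.com/r/nq3pe0/1

You'll want to use the 2nd capture group for the binary contents. The 1st group matches the header, and the expression needs the header pattern again in the lookahead so that it knows where to stop.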
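A regex works on text, not bytes, so the binary contents need a lossless byte-to-char mapping before matching. The sketch below assumes ISO-8859-1 for that purpose, since it maps each byte value 0..255 to exactly one char and back. JpgRecordSplitter and Split are hypothetical names, and the whole input is held in memory here, which may be too costly for an 800 MB file; treat it as an illustration of the two capture groups rather than a tuned solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class JpgRecordSplitter
{
    // ISO-8859-1 round-trips every byte value, so binary data is not altered.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // The .NET workaround pattern; Singleline stands in for the "s" flag.
    private static readonly Regex Record = new Regex(
        @"(m=.+?;x=.+?;.+?\.jpg)(.+?(?=m=.+?;x=.+?;.+?\.jpg|$))",
        RegexOptions.Singleline);

    // Returns one (header, content) pair per record found in the data.
    public static List<(string Header, byte[] Content)> Split(byte[] data)
    {
        var records = new List<(string Header, byte[] Content)>();
        string text = Latin1.GetString(data);
        foreach (Match m in Record.Matches(text))
        {
            string header = m.Groups[1].Value;                    // group 1: header
            byte[] content = Latin1.GetBytes(m.Groups[2].Value);  // group 2: binary contents
            records.Add((header, content));
        }
        return records;
    }
}
```

Each Content array could then be saved with File.WriteAllBytes. Two caveats: any separator byte that follows the header in the file stays at the start of group 2 and may need trimming, and in .NET `$` also matches just before a trailing newline at the end of the input, so `\z` may be the safer anchor for binary data.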


