javascript - Regex: find all uncommented tags


I want to extract the nodes of an HTML or XML file that are not commented out. Is the following regex the correct approach?

My regex:

/<span.*?>([\s\S]*?)<\/span>/gi

Here is the example XML:

<div>
<p>
    <span style="font-size: 20px;">hello</span>
    <span style="font-size: 20px;">world</span>
</p>
<p>
    <!--
    <span>hello</span>
    <span>world</span>
    -->
</p>
<p>
    <span>hello</span>
    <span>world</span>
</p>
<!--
<p>
    <span>hello</span>
    <span>world</span>
</p>
-->
</div>

I would appreciate any help.

Best regards, Michael

Well, you can remove the comments with a decent parser (DOMDocument in this case) and analyze the remaining part afterwards. Consider the following code (mind that I changed the numbers in the hello world strings to make clear what is being removed):

<?php

$html = '<div>
<p>
    <span style="font-size: 20px;">hello</span>
    <span style="font-size: 20px;">world</span>
</p>
<p>
    <!--
    <span>hello2</span>
    <span>world2</span>
    -->
</p>
<p>
    <span>hello3</span>
    <span>world3</span>
</p>
<!--
<p>
    <span>hello4</span>
    <span>world4</span>
</p>
-->
</div>
';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Remove every comment node, together with the markup written inside it.
foreach ($xpath->query('//comment()') as $comment) {
    $comment->parentNode->removeChild($comment);
}

// loadHTML() wraps the fragment in <html><body>, so print the body only.
$body = $xpath->query('//body')->item(0);
echo $dom->saveXML($body);
# yields hello world and hello3 world3
?>

Now the commented tags have been removed. Obviously, you can fiddle around with the XPath to be more precise.
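Since the question is tagged javascript, here is a minimal sketch of the same "strip the comments first, then look at what is left" idea in JavaScript. It is not from the answer above: the function name extractUncommentedSpans is hypothetical, and it swaps the DOM parser for two regular expressions, so it assumes the markup never contains "<!--" or "-->" inside attribute values, CDATA sections or script blocks. For anything beyond simple input, a real parser is still the safer route.

```javascript
// Hypothetical helper, for illustration only (not part of the original answer).
function extractUncommentedSpans(markup) {
  // Step 1: drop every <!-- ... --> block. [\s\S] lets the match span newlines,
  // and the non-greedy *? stops at the first closing -->.
  const withoutComments = markup.replace(/<!--[\s\S]*?-->/g, '');

  // Step 2: collect the inner text of each <span> that survived step 1.
  const spanPattern = /<span\b[^>]*>([\s\S]*?)<\/span>/gi;
  return Array.from(withoutComments.matchAll(spanPattern), (match) => match[1]);
}

const sample = `<div>
<p>
    <span style="font-size: 20px;">hello</span>
    <span style="font-size: 20px;">world</span>
</p>
<p>
    <!--
    <span>hello2</span>
    <span>world2</span>
    -->
</p>
<p>
    <span>hello3</span>
    <span>world3</span>
</p>
<!--
<p>
    <span>hello4</span>
    <span>world4</span>
</p>
-->
</div>`;

console.log(extractUncommentedSpans(sample));
// [ 'hello', 'world', 'hello3', 'world3' ]
```

The regex from the question would also return hello2, world2, hello4 and world4 on this input, because by itself it has no way of knowing that a match sits inside a comment.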

