I have content that contains HTML tags mixed in with the text. I am trying to identify <ins></ins> and <del></del> elements under the conditions shown in this image:
http://i.stack.imgur.com/8iNWl.png
The regex is https://regex101.com/r/cE4mE3/30
It fails in only one case: when there is an HTML tag or special character inside <ins></ins>, the match is not identified correctly. In the above regex there is a </ins></ins> sequence, that is, one <ins></ins> nested inside another <ins></ins>, and hence the match breaks before the start of the opening <ins> tag. The match must stop only when there is a full stop, comma, or space between <ins></ins> elements. But if there is any HTML tag, or another <ins></ins> tag, inside an <ins></ins>, the match must continue.
In the above regex, the groups that should be selected are:
1. <ins class="ins">ff</ins><del class="del">C</del>om<del class="del"> </del><ins class="ins"><ins class="ins">g</ins></ins><del class="del"> g</del>gp<del class="del">a</del>n<del class="del">y</del>
and
2. test<del class="del">test</del><ins class="ins">tik</ins><del class="del">peop</del>man<del class="del"> </del></i><del class="del"> g</del>gp<del class="del">a</del>n<del class="del">y</del>
But because there are HTML tags in between, the match stops near the HTML tag in both group 1 and group 2.
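For illustration, the stopping rule described above can be sketched with a small tokenizer instead of a single regex. This is only a sketch under assumptions: the image with the exact conditions is not reproduced here, so the code assumes a group ends only at whitespace, a full stop, or a comma that appears in plain text outside any <ins>/<del> element, and that any other HTML tag or nested <ins> keeps the group going. Python and the names `find_runs`, `TOKEN`, `CHANGE_TAG`, and `SEPARATOR` are hypothetical choices, not part of the original regex.

```python
import re

# A tag, a run of plain text, or a stray "<" that is not part of a tag.
TOKEN = re.compile(r"<[^>]+>|[^<]+|<")
# An opening or closing <ins>/<del> tag, with optional attributes.
CHANGE_TAG = re.compile(r"<(/?)(?:ins|del)\b[^>]*>", re.IGNORECASE)
# Assumed separators: whitespace, full stop, comma.
SEPARATOR = re.compile(r"[\s.,]+")


def find_runs(html):
    """Return groups that contain <ins>/<del> and have no separator outside them."""
    runs = []
    current = []          # pieces of the group being built
    has_change = False    # True once the group contains an <ins> or <del>
    depth = 0             # nesting depth inside <ins>/<del>

    def flush():
        nonlocal current, has_change
        if has_change:
            runs.append("".join(current))
        current = []
        has_change = False

    for match in TOKEN.finditer(html):
        token = match.group(0)
        tag = CHANGE_TAG.fullmatch(token)
        if tag:
            if tag.group(1):              # closing </ins> or </del>
                depth = max(0, depth - 1)
            else:                         # opening <ins ...> or <del ...>
                depth += 1
                has_change = True
            current.append(token)
        elif token.startswith("<") and token.endswith(">"):
            current.append(token)         # any other HTML tag never ends a group
        elif depth > 0:
            current.append(token)         # text inside <ins>/<del> never ends a group
        else:
            # Plain text outside <ins>/<del>: every separator ends the group.
            pieces = SEPARATOR.split(token)
            current.append(pieces[0])
            for piece in pieces[1:]:
                flush()
                current.append(piece)
    flush()
    return runs
```

Calling `find_runs` on text that embeds the two groups listed above, separated by a comma or a space, should return each group whole, including the nested <ins> in group 1 and the stray </i> in group 2. Self-closing tags and HTML entities are not handled specially in this sketch.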