I am working with documents that contain valid XML only down to a certain depth. Below that depth the content may be invalid XML, but one can be sure that the enclosing XML tags won't occur inside the invalid text. (I guess one could assume there is binary data inside.)
Therefore I need to parse the document only down to a certain depth; the rest should be handled as text, even if it contains other tags.
Is this possible with lxml? Writing my own lexer would definitely be overkill.
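To make the setup concrete, here is a minimal sketch of one possible workaround. It is not a built-in lxml feature: it pre-escapes the contents of the deepest trusted element and then parses normally. The tag name `payload`, the helper `parse_with_opaque_leaf`, and the sample document are all made up for illustration. The sketch assumes the guarantee stated above, namely that the enclosing tags never appear inside the invalid text.

```python
import re
from xml.sax.saxutils import escape

try:
    from lxml import etree  # the library named in the question
except ImportError:  # fallback so the sketch still runs without lxml installed
    import xml.etree.ElementTree as etree


def parse_with_opaque_leaf(doc: str, leaf_tag: str):
    """Escape everything inside <leaf_tag>...</leaf_tag>, then parse normally.

    Hypothetical helper: relies on the assumption that the closing tag
    never occurs inside the invalid text, so the first closing tag found
    really ends the payload.
    """
    pattern = re.compile(
        r"(<{0}(?:\s[^>]*)?>)(.*?)(</{0}>)".format(re.escape(leaf_tag)),
        re.DOTALL,
    )
    # Turn <, > and & inside the payload into entities so it parses as text.
    safe = pattern.sub(
        lambda m: m.group(1) + escape(m.group(2)) + m.group(3), doc
    )
    return etree.fromstring(safe)


# Made-up sample: valid XML down to <payload>, arbitrary markup-like text inside.
raw = "<root><item id='1'><payload>a < b & <junk> text</payload></item></root>"
root = parse_with_opaque_leaf(raw, "payload")
payload_text = root.find("item/payload").text
print(payload_text)
```

Escaping only helps with stray `<`, `>` and `&`. If the payload really is binary and contains characters that XML forbids (for example NUL bytes), it would probably have to be cut out or encoded at the byte level before parsing.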
question from:
https://stackoverflow.com/questions/65950050/parsing-xml-with-lxml-only-certain-depth