I'm trying to drop nodes from a very large XLF file (which is basically an XML file) via PowerShell.
The (simplified) structure is always the following:
    <?xml version="1.0" encoding="utf-8"?>
    <xliff>
      <file>
        <body>
          <group>
            <trans-unit>
              <source>asd</source>
              <target>asd</target>
            </trans-unit>
            <trans-unit>
              <source> </source>
              <target> </target>
            </trans-unit>
            <trans-unit>
              <source>asd</source>
              <target>asdf</target>
            </trans-unit>
          </group>
        </body>
      </file>
    </xliff>
Now I want to remove all trans-unit nodes in this file whose source and target are equal.
Here is what I have so far:
Match function:
    function Match {
        param(
            $sourceNode, $targetNode
        )
        # do this because empty string as xml value is of type xmlElement and fails to compare
        if ($sourceNode.innerText -eq " ") {
            $source = $sourceNode.innerText
        }
        else {
            $source = $sourceNode
        }
        if ($targetNode.innerText -eq " ") {
            $target = $targetNode.innerText
        }
        else {
            $target = $targetNode
        }
        return $source -eq $target
    }
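For illustration, the two type checks could be folded into one helper that always reduces its argument to text before comparing. This is only a sketch: the names `Get-NodeText` and `Test-SameText` are hypothetical, and it assumes the values arrive either as plain strings or as `XmlNode` objects, which is how PowerShell's XML property access usually hands them over.

```powershell
# Hypothetical helper: reduce a value to its text, whether PowerShell
# returned it as a plain string or as an XmlNode (e.g. an element with attributes).
function Get-NodeText {
    param($Value)
    if ($Value -is [System.Xml.XmlNode]) {
        return $Value.InnerText
    }
    return [string]$Value
}

# Hypothetical replacement for Match: compare the text of both values.
# -eq is case-insensitive for strings, as in the original; -ceq would be case-sensitive.
function Test-SameText {
    param($SourceNode, $TargetNode)
    return (Get-NodeText $SourceNode) -eq (Get-NodeText $TargetNode)
}

# Made-up sample: <source> has an attribute, so it is returned as an XmlElement,
# while <target> is returned as a string.
$sample = [xml]'<unit><source state="x">asd</source><target>asd</target></unit>'
$same = Test-SameText $sample.unit.source $sample.unit.target
```

Whether this changes the running time is a separate matter; it only removes the special case for the single-space value.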
Code to remove nodes:
    $xml = [xml]((Get-Content $xmlPath -Encoding UTF8).Replace("trans-unit", "transunit"))
    $xml.xliff.file.body.group.transunit | ForEach-Object {
        if (Match $_.source $_.target) {
            $_.parentNode.RemoveChild($_) | Out-Null
        }
    }
    $xml = [xml]($xml.OuterXml.Replace("transunit", "trans-unit"))
    $xml.Save($outPath)
This works, but unfortunately it is very slow, as the file has roughly 300 000 nodes. The nodes must keep their attributes when the file is saved, because the file is processed further afterwards.
A faster approach which I was not able to finish is the following:
    $xml = [xml]([System.IO.File]::ReadAllText($xmlPath).Replace("trans-unit", "transunit"))
    $filteredNodes = $xml.xliff.file.body.group.transunit | Where-Object {
        !(Match $_.source $_.target)
    }
    ???
    $xml = [xml]($xml.OuterXml.Replace("transunit", "trans-unit"))
    $xml.Save($outPath)
This gives me a list containing all XmlNodes where target and source are different, but unfortunately I was not able to pass this list back into the XML document.
Is there a faster way to remove those matching nodes from the file?
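One direction that might be worth measuring, sketched here under assumptions rather than tested on the real file: instead of calling RemoveChild once per matching node, build a fresh group that holds copies of the nodes to keep and swap it in with a single ReplaceChild. The sketch assumes the elements carry no XML namespace (as in the simplified structure above; a real XLIFF file with a default namespace would need an XmlNamespaceManager), and the inline sample with its `id` attributes is made up. Using XPath also avoids renaming `trans-unit`.

```powershell
# Made-up sample standing in for the real file; the id attributes are hypothetical.
$xml = [xml]@'
<xliff><file><body><group id="g1">
<trans-unit id="1"><source>asd</source><target>asd</target></trans-unit>
<trans-unit id="2"><source> </source><target> </target></trans-unit>
<trans-unit id="3"><source>asd</source><target>asdf</target></trans-unit>
</group></body></file></xliff>
'@

# Snapshot the groups first, because each one is replaced inside the loop.
foreach ($group in @($xml.SelectNodes('/xliff/file/body/group'))) {
    # Shallow clone: keeps the group's attributes but none of its children.
    $newGroup = $group.CloneNode($false)

    foreach ($child in $group.ChildNodes) {
        $isDuplicate = $false
        if ($child.LocalName -eq 'trans-unit') {
            $source = $child.SelectSingleNode('source')
            $target = $child.SelectSingleNode('target')
            # -eq compares strings case-insensitively, like the Match function.
            $isDuplicate = ($null -ne $source) -and ($null -ne $target) -and
                           ($source.InnerText -eq $target.InnerText)
        }
        if (-not $isDuplicate) {
            # Deep clone keeps the unit's attributes and children;
            # the original group is left untouched while it is enumerated.
            [void]$newGroup.AppendChild($child.CloneNode($true))
        }
    }

    # One replacement per group instead of one RemoveChild per node.
    [void]$group.ParentNode.ReplaceChild($newGroup, $group)
}

$remaining = @($xml.SelectNodes('//trans-unit'))
```

For the real file, `$xml` would be loaded from `$xmlPath` and written with `$xml.Save($outPath)` as before. The deep clones cost extra memory, so this is a trade-off, not a guaranteed improvement.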
question from:
https://stackoverflow.com/questions/65904307/powershell-faster-version-of-xml-parentnode-removechild