Properly reading an XML document in PowerShell works like this:
$doc = New-Object xml
$doc.Load( (Convert-Path bookstore.xml) )
XML can come in numerous file encodings. The XmlDocument.Load method detects the encoding from the file itself (byte order mark or XML declaration), so the file is read properly without prior knowledge of the encoding.
Reading a file with the wrong encoding results in mangled data or errors, except in very basic or very lucky cases.
The often-seen method of using Get-Content and casting the resulting string to [xml] is the wrong way of dealing with XML for this very reason. So don't do that.
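To see the difference in practice, here is a minimal sketch. It assumes a hypothetical demo file (encoding-demo.xml in the temp folder) that is written as ISO-8859-1 with a matching XML declaration; neither the file nor its content comes from the bookstore example. What the Get-Content line produces depends on the PowerShell version and system settings, so no particular result is claimed for it.

```powershell
# Hypothetical demo file: Latin-1 bytes plus a matching XML declaration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.xml'
$title = 'Caf' + [char]0xE9   # "Cafe" with an accented e, built without a non-ASCII literal
$xmlText = '<?xml version="1.0" encoding="iso-8859-1"?>' +
           "<bookstore><book><title>$title</title></book></bookstore>"
$latin1 = [System.Text.Encoding]::GetEncoding('iso-8859-1')
[System.IO.File]::WriteAllBytes($path, $latin1.GetBytes($xmlText))

# XmlDocument.Load reads the declaration and decodes the bytes itself.
$doc = New-Object xml
$doc.Load($path)
$loaded = $doc.bookstore.book.title

# Get-Content decodes with its own default before the XML parser sees the text,
# so it may mangle the accented character or even fail to parse.
$viaGetContent = try {
    ([xml](Get-Content $path -Raw)).bookstore.book.title
} catch {
    "error: $($_.Exception.Message)"
}

"Load:        $loaded"
"Get-Content: $viaGetContent"
Remove-Item $path
```

The Load result matches the original title regardless of the console or system default encoding, because the decoding decision is taken from the file itself.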
You can get a correct result with Get-Content, but that requires:
- Prior knowledge of the file encoding (e.g. Get-Content bookstore.xml -Encoding UTF8)
- Hard-coding the file encoding into your script (meaning it will break if the XML encoding ever changes unexpectedly)
- Limiting yourself to the very few file encodings that Get-Content supports (XML supports more)
It means you put yourself in a position where you have to manually think about and solve a problem that XML has been specifically designed to automatically handle for you.
Doing things correctly with Get-Content means a lot of unnecessary extra work and comes with limitations. And doing things incorrectly is pointless when doing it right is so easy.
Examples, after loading $doc as shown above:
$doc.bookstore.book
prints a list of <book> elements and their properties:
genre           : novel
publicationdate : 1997
ISBN            : 1-861001-57-8
title           : Pride And Prejudice
author          : author
price           : 24.95

genre           : novel
publicationdate : 1992
ISBN            : 1-861002-30-1
title           : The Handmaid's Tale
author          : author
price           : 29.95

genre           : novel
publicationdate : 1991
ISBN            : 1-861001-57-6
title           : Emma
author          : author
price           : 19.95

genre           : novel
publicationdate : 1982
ISBN            : 1-861001-45-3
title           : Sense and Sensibility
author          : author
price           : 19.95
$doc.bookstore.book | Format-Table
prints the same thing as a table:
genre publicationdate ISBN          title                 author price
----- --------------- ----          -----                 ------ -----
novel 1997            1-861001-57-8 Pride And Prejudice   author 24.95
novel 1992            1-861002-30-1 The Handmaid's Tale   author 29.95
novel 1991            1-861001-57-6 Emma                  author 19.95
novel 1982            1-861001-45-3 Sense and Sensibility author 19.95
$doc.bookstore.book | Where-Object publicationdate -lt 1992 | Format-Table
filters the data:
genre publicationdate ISBN          title                 author price
----- --------------- ----          -----                 ------ -----
novel 1991            1-861001-57-6 Emma                  author 19.95
novel 1982            1-861001-45-3 Sense and Sensibility author 19.95
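One caveat about the filter above: attribute and element values come back as strings, so -lt compares text, which happens to work for four-digit years. The sketch below uses a small inline document with a made-up three-digit year (not part of the bookstore file) to show where text comparison and numeric comparison diverge, and how an explicit [int] cast avoids the problem.

```powershell
# Hypothetical sample data: one book with a three-digit year, one from the example.
$sample = New-Object xml
$sample.LoadXml('<bookstore>' +
    '<book publicationdate="900"><title>Hypothetical old book</title></book>' +
    '<book publicationdate="1991"><title>Emma</title></book>' +
    '</bookstore>')

# Left operand is a string, so 1992 is converted to "1992" and compared as text.
# As text, "900" sorts after "1992", so the old book is filtered out.
$asText = @($sample.bookstore.book | Where-Object publicationdate -lt 1992)

# Casting the value to [int] forces a numeric comparison.
$asNumber = @($sample.bookstore.book |
    Where-Object { [int]$_.publicationdate -lt 1992 })

"Text comparison matched:    $($asText.Count)"
"Numeric comparison matched: $($asNumber.Count)"
```

The same applies to the price element: casting with [decimal] before comparing is the safer choice whenever the values can differ in length.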
$doc.bookstore.book | Where-Object publicationdate -lt 1992 | Sort-Object publicationdate | Select-Object title
sorts and prints only the <title> field:
title
-----
Sense and Sensibility
Emma
There are many more ways of slicing and dicing the data; it all depends on what you want to do.
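As one further illustration, XPath queries work directly on the loaded document through the standard SelectNodes method. This is only a sketch: the inline document below is an assumed cut-down stand-in for bookstore.xml, shaped to match the output shown earlier, and the variable names are made up for the example.

```powershell
# Assumed stand-in for bookstore.xml with two of the books from the output above.
$sample = New-Object xml
$sample.LoadXml(@'
<bookstore>
  <book genre="novel" publicationdate="1991" ISBN="1-861001-57-6">
    <title>Emma</title>
    <price>19.95</price>
  </book>
  <book genre="novel" publicationdate="1997" ISBN="1-861001-57-8">
    <title>Pride And Prejudice</title>
    <price>24.95</price>
  </book>
</bookstore>
'@)

# XPath converts <price> to a number for the comparison.
$cheap = $sample.SelectNodes('/bookstore/book[price < 20]')
$titles = @($cheap | ForEach-Object { $_.title })
$titles
```

With real files, the same query can be run against a document loaded with $doc.Load(...) as shown at the top.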