
Sunday, September 11, 2016

PowerShell v3: Some Basic XML Tasks

XML is not one of my strong suits, particularly when it comes to XSLT, XPath, etc. A current project forced me to stop long enough to get my head around a few simple tasks. For starters, here is a list of some essential uses of PowerShell's Select-Xml cmdlet. To get started, we create a simple XML structure:
[xml] $xml = @"
<nodes>
     <node id="1">
            <a>1</a>
            <b>2</b>
     </node>
     <node id="2">
            <c>3</c>
            <d>4</d>
     </node>
</nodes>
"@
This is a basic XML structure with one root element (nodes) containing two node elements, each of which has two child elements (a and b in the first, c and d in the second). Select-Xml has two key parameters: -XPath and -Xml. These alone are enough to make functional use of Select-Xml. So, here we will explore three different XPath expressions that select the same element:
Select-Xml -XPath "/nodes/node/a" -Xml $xml
Select-Xml -XPath "//node/a" -Xml $xml
Select-Xml -XPath "//a" -Xml $xml
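All three return the same node. The Pattern column simply echoes whichever XPath expression was passed (shown here for the first call):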
Node               Path                Pattern                                 
----               ----                -------                                             
a                  InputStream         /nodes/node/a                                       
A single slash at the start of an expression anchors the path at the document root, and every step after it has to match the next level down. A double slash matches the step that follows it at any depth in the document. So //node/a finds every a element that is a direct child of a node element, wherever that node element sits, and //a finds every a element anywhere in the structure. In other words, if you do not know (or do not care about) the full path to an element, you can start the expression with double slashes and search across all nodes. This was useful to realize.
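To make the difference between / and // concrete, here is a minimal sketch that counts the matches for a few expressions. It builds its own copy of the sample document in a variable named $sample (a name chosen only for this illustration), and the id value on the first node is an assumption.

```powershell
# Self-contained copy of the sample document; $sample is a hypothetical
# name, and id="1" on the first node is assumed.
[xml] $sample = @"
<nodes>
  <node id="1"><a>1</a><b>2</b></node>
  <node id="2"><c>3</c><d>4</d></node>
</nodes>
"@

# A single leading slash anchors the path at the document root, so "/a"
# only matches a root element named a. There is none here.
$fromRoot = @(Select-Xml -XPath "/a" -Xml $sample)

# A double slash matches a elements at any depth.
$anyDepth = @(Select-Xml -XPath "//a" -Xml $sample)

# "//node/*" picks up every child element of every node element.
$children = @(Select-Xml -XPath "//node/*" -Xml $sample)
$childNames = ($children | ForEach-Object { $_.Node.Name }) -join ","

"{0} match(es) for /a" -f $fromRoot.Count
"{0} match(es) for //a" -f $anyDepth.Count
"Children of node elements: $childNames"
```

Wrapping each call in @( ) guarantees an array, so .Count works even when nothing matches.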

Next in my explorations was the use of brackets, [], which XPath calls predicates. The syntax is described at the link below.
http://www.w3schools.com/xpath/xpath_syntax.asp
The notation [a=1] matches nodes that have a child element a with a value of 1. To test an attribute named a instead, the name is prefixed with @, as in [@a=1]. In this case, the expression can be /nodes/node[a=1] or //node[a=1], based on what was covered in the previous paragraph. Predicates are useful for narrowing a set of nodes down to a specific item (or set of items) and so increase precision when matching. Similar to what was shown in the last paragraph, the two following statements are equivalent:
# To find specific elements
Select-Xml -XPath "//node[a=1]" -Xml $xml
Select-Xml -XPath "/nodes/node[a=1]" -Xml $xml
Both produce:
Node               Path                Pattern                                  
----               ----                -------                                              
node               InputStream         /nodes/node[a=1]
What is important here is that the predicate narrows the match to the node elements that satisfy the condition, and that the result is the node element itself rather than its child element a.
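As a rough illustration of element predicates versus attribute predicates, the sketch below runs one of each against a self-contained copy of the sample document. The variable name $sample is hypothetical, and the id value on the first node is an assumption.

```powershell
# Self-contained copy of the sample document; $sample is a hypothetical
# name, and id="1" on the first node is assumed.
[xml] $sample = @"
<nodes>
  <node id="1"><a>1</a><b>2</b></node>
  <node id="2"><c>3</c><d>4</d></node>
</nodes>
"@

# [a=1] tests a child element named a.
$byElement = Select-Xml -XPath "//node[a=1]" -Xml $sample

# [@id=2] tests the id attribute of the node element itself.
$byAttribute = Select-Xml -XPath "//node[@id=2]" -Xml $sample

# Select-Xml wraps each match; the underlying XML node is in the Node property.
$idOfFirst    = $byElement.Node.GetAttribute("id")
$textOfSecond = $byAttribute.Node.InnerText

"node[a=1] has id $idOfFirst"
"node[@id=2] contains the text $textOfSecond"
```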

Next is the .SelectSingleNode() method. This approach is useful when you expect a single matching node, since it returns only the first match. Building on our previous query, we can use this approach to find the node where c = 3:
# To select a specific element - in this case,
# the node that has an element c with a value 3
# Demonstrate which node is selected with this
# XPath query by piping it to Select-Object *
$xml.SelectSingleNode("//nodes/node[c=3]") |
Select-Object *
The cool thing about using Select-Object * in conjunction with .SelectSingleNode() is that you can see every property of the node the method returns. For example, the above returns this:
id              : 2
c               : 3
d               : 4
Name            : node
LocalName       : node
NamespaceURI    :
Prefix          :
NodeType        : Element
ParentNode      : nodes
OwnerDocument   : #document
IsEmpty         : False
Attributes      : {id}
HasAttributes   : True
SchemaInfo      : System.Xml.XmlName
InnerXml        : <c>3</c><d>4</d>
InnerText       : 34
NextSibling     :
PreviousSibling : node
Value           :
ChildNodes      : {c, d}
FirstChild      : c
LastChild       : d
HasChildNodes   : True
IsReadOnly      : False
OuterXml        : <node id="2"><c>3</c><d>4</d></node>
BaseURI         :
If you never pipe the output to Select-Object *, you may never realize how many properties are available on the node that .SelectSingleNode() returns. Additionally, the OuterXml property shows the markup of the selected node itself, including its own tags and everything inside them, so you can see which branch of the XML structure you are zeroing in on. The InnerXml property, by contrast, shows only the markup of the node's children.
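To see InnerXml and OuterXml side by side without the rest of the property list, something like the following should work. It is only a sketch: $sample is a hypothetical variable holding a fresh copy of the sample document, and the id value on the first node is an assumption.

```powershell
# Self-contained copy of the sample document; $sample is a hypothetical
# name, and id="1" on the first node is assumed.
[xml] $sample = @"
<nodes>
  <node id="1"><a>1</a><b>2</b></node>
  <node id="2"><c>3</c><d>4</d></node>
</nodes>
"@

$node = $sample.SelectSingleNode("//node[c=3]")

# InnerXml: markup of the children only.
$inner = $node.InnerXml

# OuterXml: the node's own tags plus its children.
$outer = $node.OuterXml

# Select-Xml reaches the same node; it is exposed through the Node property.
$viaCmdlet = (Select-Xml -XPath "//node[c=3]" -Xml $sample).Node
$sameMarkup = ($viaCmdlet.OuterXml -eq $outer)

"InnerXml: $inner"
"OuterXml: $outer"
"Select-Xml returns the same markup: $sameMarkup"
```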

One of the last items I want to highlight is how to remove a single node. Say, for instance, I want to remove the node where c = 3. Reusing the .SelectSingleNode() call from the previous step, I can pass its result to .RemoveChild() on the parent element:
# Remove node identified in previous step
$xml.nodes.RemoveChild($xml.SelectSingleNode("//nodes/node[c=3]"))
After running this, we can look at $xml and see what remains.
# See what remains from original XML structure
$xml |
Select-Object *
The output from this is somewhat lengthy,
nodes             : nodes
NodeType          : Document
ParentNode        :
DocumentType      :
Implementation    : System.Xml.XmlImplementation
Name              : #document
LocalName         : #document
DocumentElement   : nodes
OwnerDocument     :
Schemas           : System.Xml.Schema.XmlSchemaSet
XmlResolver       :
NameTable         : System.Xml.NameTable
PreserveWhitespace: False
IsReadOnly        : False
InnerText         :
InnerXml          : <nodes><node id="1"><a>1</a><b>2</b></node></nodes>
SchemaInfo        : System.Xml.Schema.XmlSchemaInfo
BaseURI           :
Value             :
ChildNodes        : {nodes}
PreviousSibling   :
NextSibling       :
Attributes        :
FirstChild        : nodes
LastChild         : nodes
HasChildNodes     : True
NamespaceURI      :
Prefix            :
OuterXml          : <nodes><node id="1"><a>1</a><b>2</b></node></nodes>
However, as the OuterXml property shows, the only branch left from the original $xml structure is the first node, the one holding a = 1 and b = 2. This tells us that the node containing the element where c = 3 was indeed removed.
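A slightly more defensive variation on the removal step is sketched below. It checks that the query actually found something before removing it, goes through the node's own ParentNode rather than naming the parent explicitly, and discards the value that RemoveChild returns. As before, $sample is a hypothetical variable holding a fresh copy of the sample document, and the id value on the first node is an assumption.

```powershell
# Self-contained copy of the sample document; $sample is a hypothetical
# name, and id="1" on the first node is assumed.
[xml] $sample = @"
<nodes>
  <node id="1"><a>1</a><b>2</b></node>
  <node id="2"><c>3</c><d>4</d></node>
</nodes>
"@

$target = $sample.SelectSingleNode("//node[c=3]")

# SelectSingleNode returns $null when nothing matches, so guard the removal.
if ($null -ne $target) {
    # RemoveChild returns the removed node; [void] keeps it out of the output.
    [void] $target.ParentNode.RemoveChild($target)
}

# Only the first node should be left.
$remaining = $sample.OuterXml
$remaining
```

Checking OuterXml directly is a quicker way to confirm the result than reading through the full Select-Object * listing.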


There are a ton of other things we could explore with regard to XML and PowerShell, but these simple tasks offer a basic set of how-to steps that help show XML is not impossible to figure out. It just takes some experimentation and reading.
