Sunday, January 14, 2018

Comparing PowerShell objects for equality using a custom class

I wanted a simple way to do better-behaved object comparisons in PowerShell. So I built one.

-eq and -ne operators

Comparing PowerShell objects is something we do from the very first day we start working with PowerShell.

If ( $ObjectA -eq $ObjectB ) { Do-Something }

For simple comparisons, such as of numbers and strings, the equals operator -eq and the not equals operator -ne work just fine.

$NumberA = 6
$NumberB = 6
$NumberC = 7
$NumberA -eq $NumberB  # yields $True
$NumberA -eq $NumberC  # yields $False

The -eq and -ne operators work by leveraging the .Equals() method that is built into all .Net objects. (And everything you work with in PowerShell is a .Net object.) Integers and strings and version objects and a lot of other object classes have their own specific variation of the .Equals() method defined for comparing that particular type of object.

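The PowerShell engine adds a layer of functionality on top of the .Net engine that allows us to compare certain unlike objects as well. In .Net, the .Equals() method of an integer returns false when it is given a floating point number or a string, even one holding the same value, and C# won’t compile a comparison of a number to a string at all. But in PowerShell, -eq first converts the operands to compatible types, so we can compare them.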

$IntegerA = 7
$DoubleB = 7.0
$StringC = '7'
$IntegerA -eq $DoubleB  # yields $True
$IntegerA -eq $StringC  # yields $True
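To see that this conversion layer belongs to PowerShell and not to .Net, we can call the static [object]::Equals() method, which compares the two values without PowerShell's conversions. This is only a small sketch; the '007' lines are my own illustration (not from the examples above) of how the left operand decides which way a string and a number are converted.

```powershell
# .Net equality, without PowerShell's conversions: values of different types are not equal
[object]::Equals( 7, 7.0 )   # yields $False
[object]::Equals( 7, '7' )   # yields $False

# PowerShell's -eq converts the operands to compatible types first
7 -eq 7.0                    # yields $True
7 -eq '7'                    # yields $True

# The left operand decides the conversion, so operand order can matter
7 -eq '007'                  # yields $True  ('007' is converted to the integer 7)
'007' -eq 7                  # yields $False (7 is converted to the string '7')
```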

But for most other types of objects, -eq and -ne do not work. Or at least they do not work as expected.
Objects which do not have their own unique .Equals() method inherit the method defined for the System.Object class. This .Equals method does not compare the objects at all. Instead it checks to see if the two objects are in fact the same object.

Think of it like comparing people. $MyFather and $MySistersFather are considered equal because they are the same person. $MyNephewHenry and $MyNephewHarvey are considered not equal because despite being twins, they are two separate people.

Most objects are compared this way. It doesn’t matter if some or all of their properties are identical. If they are two separate objects, they are not equal.

$FileA = Get-Item C:\Temp\Test.ps1
$FileB = Get-Item C:\Temp\Test.ps1
$FileA -eq $FileB  # yields $False

In the above case, even though the two fileinfo objects are identical in every respect and refer to the same file, they are evaluated as not equal because they are two separate file objects.
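The same behavior can be reproduced without touching the file system. The sketch below uses two hashtables, on the assumption (which matches the .Net documentation as far as I know) that the Hashtable class inherits its .Equals() method from System.Object. The variable names are just for illustration.

```powershell
# Two separate hashtables with identical contents
$TableA = @{ Name = 'Test' }
$TableB = @{ Name = 'Test' }

# The inherited .Equals() only asks "is this the very same object?"
$TableA.Equals( $TableB )                       # yields $False
[object]::ReferenceEquals( $TableA, $TableB )   # yields $False
$TableA -eq $TableB                             # yields $False

# An object is always equal to itself
$TableA.Equals( $TableA )                       # yields $True
```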

You might think that will always be the case, and that such comparison is therefore completely useless. But if you assume that different variables always hold different objects, you can run up against one of the gotchas of PowerShell variables.

We tend to think of a variable as a box with a value in it. That analogy is generally useful, and sometimes it’s true. When $IntegerA = 7, the integer 7 is sitting in box IntegerA. (Well, not really, but close enough.)

But most non-simple objects in PowerShell are reference objects. With reference objects, what sits in the box is a pointer to where the value is sitting somewhere else in memory.

Sometimes, two different variables can hold identical pointers which point to the same, single object.

$FileA = Get-Item C:\Temp\Test.ps1
$FileB = $FileA  # As with most object types, this creates a copy of the pointer, not the object
$FileA -eq $FileB  # yields $True
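A quick way to convince yourself that both variables point at one object is to change the object through one variable and read it through the other. This is a minimal sketch using a hashtable with made-up names; .Clone() is the hashtable's own method for making an independent (shallow) copy.

```powershell
$TableA = @{ Name = 'Test' }
$TableB = $TableA            # Copies the pointer, not the hashtable

# A change made through one variable is visible through the other
$TableB.Name = 'Changed'
$TableA.Name                 # yields Changed

# To get an independent (shallow) copy, we have to ask for one explicitly
$TableC = $TableA.Clone()
$TableC.Name = 'Copy'
$TableA.Name                 # still yields Changed
$TableA -eq $TableC          # yields $False
```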

When we humans want to compare complex objects, we usually mean that two objects with properties with the same values are equal.

So we can’t use -eq and -ne for comparing most objects in PowerShell.

Additionally, arrays and arraylists can’t be compared using -eq and -ne because PowerShell hijacks them and uses them for a different purpose.

In PowerShell, $Array -eq $Object is roughly the same as $Array | Where { $_ -eq $Object }

(Except that the -eq inside the Where filter is a normal, scalar -eq rather than the filtering -eq, so any nested arrays are compared as single objects and effectively ignored.)
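Here is a short illustration of that filtering behavior, using throwaway values of my own. Note the second half: comparing two identical arrays does not give us $True, it gives us the (empty) list of elements that matched.

```powershell
$Array = @( 1, 2, 3, 2 )

# With an array on the left, -eq acts as a filter and returns the matching elements
$Array -eq 2                    # yields 2, 2
( $Array -eq 2 ).Count          # yields 2

# Comparing two identical arrays returns no elements instead of $True
$Same = @( 1, 2, 3 ) -eq @( 1, 2, 3 )
@( $Same ).Count                # yields 0
[bool]$Same                     # yields $False
```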

Compare-Object

One alternative which is sometimes useful is the Compare-Object command.

At its core, Compare-Object compares two arrays of strings, and returns those strings which only exist in one of the two lists. A null result means the two arrays contain the same strings.

This is certainly useful when comparing the contents of files. Or when comparing lists of file names or computer names or whatever.
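As a small sketch of that core use case, here are two lists of made-up server names. The SideIndicator property on each result tells us which list the string was found in.

```powershell
$ListA = 'Server1', 'Server2', 'Server3'
$ListB = 'Server2', 'Server3', 'Server4'

# Returns one object for each string that appears in only one of the lists
$Differences = Compare-Object -ReferenceObject $ListA -DifferenceObject $ListB

# SideIndicator '<=' means "only in the reference list"
# SideIndicator '=>' means "only in the difference list"
$OnlyInA = $Differences | Where-Object SideIndicator -eq '<=' | Select-Object -ExpandProperty InputObject
$OnlyInB = $Differences | Where-Object SideIndicator -eq '=>' | Select-Object -ExpandProperty InputObject
$OnlyInA   # yields Server1
$OnlyInB   # yields Server4

# Identical lists produce no output at all
$Null -eq ( Compare-Object -ReferenceObject $ListA -DifferenceObject $ListA )  # yields $True
```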

But it doesn’t handle other objects well. Actually, without some hints as to how to do it, it often doesn’t really compare them at all. When it can’t find a suitable way to compare the objects as a whole, it converts them to strings using the .ToString() method built into every object, and compares the strings.

Sometimes that is of limited use.

$ProcessListA = Get-Process
notepad.exe
$ProcessListB = Get-Process
Compare-Object -ReferenceObject $ProcessListA -DifferenceObject $ProcessListB
# yields the additional notepad process (sort of)

For a System.Diagnostics.Process object, .ToString() results in the string “System.Diagnostics.Process (process name)” with the value of the object’s Name property in parentheses. All properties other than the process name are ignored when Compare-Object makes its comparison.

$P = Start-Process notepad -PassThru
$ProcessListA = Get-Process
$P | Stop-Process
$Q = Start-Process notepad -PassThru
$ProcessListB = Get-Process
Compare-Object -ReferenceObject $ProcessListA -DifferenceObject $ProcessListB
# Yields no differences

The above yields no differences, because even though it is a different notepad process with a different process ID, Compare-Object looks only at the name and declares them equal.

Many other objects have .ToString() methods that are of no use at all. The .ToString() method on a hashtable, for example, always returns the string “System.Collections.Hashtable”.

Thus, Compare-Object believes all hashtables to be equivalent.

$HashtableA = @{ A = 2; B = 3 }
$HashtableB = @{ A = 46; C = 'Banana' }
Compare-Object -ReferenceObject $HashtableA -DifferenceObject $HashtableB
# Yields no differences
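The string conversion behind that result is easy to check on its own. A minimal sketch, reusing the two hashtables from above:

```powershell
$HashtableA = @{ A = 2; B = 3 }
$HashtableB = @{ A = 46; C = 'Banana' }

# Both hashtables produce the same string, regardless of their contents
$HashtableA.ToString()   # yields System.Collections.Hashtable
$HashtableB.ToString()   # yields System.Collections.Hashtable
$HashtableA.ToString() -eq $HashtableB.ToString()   # yields $True
```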

And even when Compare-Object is useful, it is very clunky to use, and not very intuitive to decipher.

If ( -not ( Compare-Object -ReferenceObject $BeforeList -DifferenceObject $RestoredList ) )
{ <# All done #> }

We can make better use of Compare-Object by using the -Property parameter to tell it which properties to compare.

$ProcessA = Start-Process notepad -PassThru
$ProcessB = Start-Process notepad -PassThru
Compare-Object -ReferenceObject $ProcessA -DifferenceObject $ProcessB
# yields no differences

Compare-Object -ReferenceObject $ProcessA -DifferenceObject $ProcessB -Property Name, ID
#  Shows us they are different processes

Sometimes that gives us what we need. But it does make it even clunkier to use.
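If you don't want to launch notepad to try this, the effect of -Property can be sketched with two custom objects standing in for the processes. The objects and their values below are hypothetical; only the Compare-Object parameters are the real thing.

```powershell
# Hypothetical stand-ins for two process objects with the same name
$ObjectA = [PSCustomObject]@{ Name = 'notepad'; Id = 1001 }
$ObjectB = [PSCustomObject]@{ Name = 'notepad'; Id = 1002 }

# Comparing on Name alone finds no differences
$ByName = Compare-Object -ReferenceObject $ObjectA -DifferenceObject $ObjectB -Property Name
$Null -eq $ByName               # yields $True

# Comparing on Name and Id returns one row per object, each with a SideIndicator
$ByNameAndId = Compare-Object -ReferenceObject $ObjectA -DifferenceObject $ObjectB -Property Name, Id
@( $ByNameAndId ).Count         # yields 2
$ByNameAndId | Sort-Object Id | Select-Object -ExpandProperty Id   # yields 1001, 1002
```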

And it still doesn’t work with hashtables.

And it requires work on the part of the scripter to figure out what properties can and should be compared. If we wanted to do work, we wouldn’t be scripting in the first place.

Compare-Object is designed for comparing arrays of strings, and it does it well.

But for arrays of any other type of object, Compare-Object converts the elements (or the specified properties of the elements) to strings. Any object type that it can’t compare well as a single object won’t be compared well as an array element either.

My custom compare class

I wanted something simple that works intuitively with more types of objects.

I could have written a function that improves upon Compare-Object, but I wanted simpler and instead decided to improve upon the -eq and -ne operators.

To modify the behavior of -eq and -ne, we need new definitions for the .Equals() method on all objects we want to behave differently. In PowerShell, we could actually modify the .Equals() method on System.Object, which would in turn be inherited by all objects that don’t override it. But .Equals() is used by a lot of things behind the scenes, and there could be many, odd, unforeseen consequences to doing so.

So instead we are going to create a custom class to which we can convert any object for purposes of comparison, with an .Equals() method override defined by us.

The syntax for using the custom class will be as simple as:

If ( [Compare]$ObjectA -eq $ObjectB ) { Do-Something }

Casting $ObjectA as a custom [Compare] object gives it the magic .Equals() method and extracts comparable properties (that is, any property of a type that can be easily compared) for comparison. $ObjectB does not need to be explicitly cast; that will happen implicitly behind the scenes during the comparison.

This won’t work perfectly for all object classes and all circumstances. And it is not recursive in that it ignores properties that are not themselves comparable. But it does make object comparisons much easier in most cases that need to be easier.

Custom Compare class definition

In our class definition, we are going to define a single property to hold the extracted properties of a source object, constructors to describe how to convert objects to the custom class, and an .Equals() method for doing the comparison.

To define the class, we start with the keyword “class” followed by the name of the class.

class Compare
    {

We define one property, named .Properties, as a hashtable. This is where we will store properties extracted from a source object. We are using a hashtable instead of named properties because every object class will have different relevant properties. We set the default value to an empty hashtable @{} so that we can use its .Add() method later, which we couldn’t do if the default were implicitly $Null.

    [HashTable]$Properties = @{}
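The difference the default value makes can be seen in a tiny throwaway class. DefaultDemo and its two properties are hypothetical names used only for this sketch; they are not part of [Compare].

```powershell
# Hypothetical demo class, not part of [Compare]
class DefaultDemo
    {
    [hashtable]$WithDefault = @{}
    [hashtable]$WithoutDefault
    }

$Demo = [DefaultDemo]::new()

# The initialized property is ready to use
$Demo.WithDefault.Add( 'A', 1 )
$Demo.WithDefault.Count          # yields 1

# The uninitialized property is $Null, so there is no hashtable to call .Add() on
$Null -eq $Demo.WithoutDefault   # yields $True
```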

Then we define a constructor specific to hashtables.

    # Constructor for hashtable
    Compare ( [hashtable]$Hashtable )
        {

We loop through the key/value pairs of the $Hashtable and convert each value to a [Compare] object before adding it to .Properties. This will allow us to compare the contents of hashtables.

        # For each property...
        ForEach ( $Key in $Hashtable.Keys )
            {
            # Convert the value to a Compare object and add it to .Properties
            $This.Properties.Add( $Key, [Compare]$Hashtable.$Key )
            }
        }

Then we define a constructor specific to arrays. PowerShell will also automatically convert arraylists to arrays and feed them to this same constructor.

    # Constructor for array and arraylists
    Compare ( [array]$Array )
        {

If the $Array or arraylist is empty, we add an empty array to .Properties.

        # If array is empty
        # Add an empty array to Properties
        If ( $Array.Count -eq 0 )
            {
            $This.Properties.Add( 'Value', @() )
            }

If the $Array is not empty, we loop through each element, convert it to a [Compare] object and add it to .Properties. This allows us to compare arrays. The array elements will need to be in the same order to be considered equal.

        # Else (array is not empty)
        # Cast each element to Compare and add to properties
        Else
            {
            ForEach ( $i in 0..($Array.Count - 1) )
                {
                $This.Properties.Add( "$i", [Compare]$Array[$i] )
                }
            }
        }

And then a constructor for all other source objects.

    # Constructor to convert any object
    Compare ( $Object )
        {

First we look at the methods of source object. If it natively has an override defined for .Equals(), we didn’t really need to convert to [Compare], but too late now, so we’ll drop it into .Properties as is.

        # If object is already comparable (built-in Equals method)
        # Put the value of the object in Properties
        If ( @( $Object.GetType().DeclaredMethods.Name ) -contains 'Equals' )
            {
            $This.Properties.Add( 'Value', $Object )
            }
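To get a feel for which types pass this test, we can apply the same check to a few types directly. Test-DeclaresEquals is a hypothetical helper written just for this sketch; the class itself does not use it.

```powershell
# Hypothetical helper that applies the same test to a type
function Test-DeclaresEquals ( [type]$Type )
    {
    return @( $Type.DeclaredMethods.Name ) -contains 'Equals'
    }

# Types that define their own .Equals()
Test-DeclaresEquals ([int])                 # yields $True
Test-DeclaresEquals ([string])              # yields $True
Test-DeclaresEquals ([version])             # yields $True

# A type that only inherits .Equals() from System.Object
Test-DeclaresEquals ([System.IO.FileInfo])  # yields $False
```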

Otherwise, we need to look at the properties of the source object and decide what to do with them.

        # Else (no built in Equals method )
        # Extract the comparable properties
        Else
            {


We use Get-Member to get the names of the properties. (If we were to use the .DeclaredProperties of the results of .GetType(), we would only get the properties defined within the class itself, and we would miss any properties inherited from the base class, or added by PowerShell.) By specifying -MemberType Properties instead of Property, we also get all of the NoteProperty, CodeProperty, and ScriptProperty members.
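The difference between Property and Properties is easy to miss, so here is a small sketch. The object is a hypothetical example: a version object with a NoteProperty named Label that we add ourselves.

```powershell
# Hypothetical example object: a version with a NoteProperty added by PowerShell
$Object = [version]'1.2.3.4' |
    Add-Member -NotePropertyName Label -NotePropertyValue 'Test' -PassThru

$NativeOnly = $Object | Get-Member -MemberType Property   | Select-Object -ExpandProperty Name
$AllKinds   = $Object | Get-Member -MemberType Properties | Select-Object -ExpandProperty Name

$NativeOnly -contains 'Major'   # yields $True  (defined by the .Net class)
$NativeOnly -contains 'Label'   # yields $False (added by PowerShell, so not a Property)
$AllKinds   -contains 'Major'   # yields $True
$AllKinds   -contains 'Label'   # yields $True
```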

We loop through the $Property names.

            # For each object property...
            ForEach ( $Property in $Object | Get-Member -MemberType Properties |
                    Select-Object -ExpandProperty Name )
                {

If the property value is $Null, we’ll skip it. At this point we can’t tell whether we would have wanted the property had it held a value, but skipping it doesn’t change the outcome either way: a missing property and a property with a $Null value are both evaluated as not equal to a property with a non-null value. Keeping a $Null value for a property that we would have skipped had it been non-null, on the other hand, would change the outcome. As skipping null-valued properties doesn’t hurt functionality, it is an acceptable alternative to the extra work required to more fully evaluate such properties.

                # If the property value is not null...
                If ( $Object.$Property -ne $Null )
                    {

If this property is an enum (enumeration), we want it. Stick it in .Properties.

                    # If the property is an enumeration
                    # Add it to Properties
                    If ( $Object.$Property -is [enum] )
                        {
                        $This.Properties.Add( $Property, $Object.$Property )
                        }

If a property can be easily used for comparison, we will include it in the list of properties we are comparing. That is, if the property value has a non-inherited .Equals() method, and therefore works well with -eq and -ne, we add it to .Properties.

                    # If the object has an override for method .Equals()
                    # Add it to Properties
                    ElseIf ( $Object.$Property.GetType().DeclaredMethods.Name -contains 'Equals' )
                        {
                        $This.Properties.Add( $Property, $Object.$Property )
                        }
                    # Ignore any property that does not have an Equals method
                    # (thereby ignoring such properties when comparing objects)
                    }
                }
            }
        }

We are not going to “recurse” this functionality. That is, we are just going to ignore any properties that are not easily compared. I experimented with recursing and comparing more properties, but even at a minimal recursion depth it quickly leads to problems. (For example, if you dig too deeply into a FileInfo object, you end up comparing not only metadata of the file, but also the metadata of the folder it’s in. Thus, changing an unrelated file in the same folder would make a FileInfo object not equal because the .Directory.LastWriteTime value had changed.)

This does put some limitations on which objects we can usefully compare. But not many.

And then, we need to define the .Equals() method that will actually perform the comparison. Half of the work was done by the constructor when it extracted the relevant properties from the object. Now we compare the extracted properties of $This object to the extracted properties of the $InputObject.

    # Equals override
    # This is the method that is used by -eq and -ne operators
    [boolean] Equals ( $InputObject )
        {


If the $InputObject has already been converted to a [Compare] object, we can proceed with the comparison.

        # If the input object is already a Compare object...
        If ( $InputObject -is [Compare] )
            {

If the names of the extracted properties of the two objects differ, the objects are not equal. Return $False. (Unlike in PowerShell scripts and functions, in a PowerShell class method, only what is specified by the Return keyword is sent as output, and the method exits immediately. No further code is run.)

            # If the two objects have different property names
            # Return not equal
            If ( Compare-Object -ReferenceObject $This.Properties.Keys -DifferenceObject $InputObject.Properties.Keys )
                {
                return $False
                }
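The difference in how functions and class methods handle output is worth a quick demonstration. The function and class below are hypothetical, written only to contrast the two; they are not part of [Compare].

```powershell
# In a function, every unassigned value is output, not just the returned one
function Get-FunctionOutput
    {
    'Stray output'
    return 'Returned'
    }

# Hypothetical demo class, not part of [Compare]
class ReturnDemo
    {
    [string] GetMethodOutput()
        {
        'Stray output'   # Discarded, not sent to the caller
        return 'Returned'
        }
    }

@( Get-FunctionOutput ).Count                       # yields 2
@( [ReturnDemo]::new().GetMethodOutput() ).Count    # yields 1
[ReturnDemo]::new().GetMethodOutput()               # yields Returned
```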

If the property names do match, we then loop through each property name.

            # For each property...
            ForEach ( $Key in $This.Properties.Keys )
                {

If any property value doesn’t match, we return $False and exit the method.

                # If the property values are not equal
                # Return not equal
                If ( $This.Properties.$Key -ne $InputObject.Properties.$Key )
                    {
                    return $False
                    }
                }

If an exit wasn’t triggered above, all of the extracted properties are equal. Return $True.

            # If we got this far, all tests above passed
            # Return equal
            return $True
            }

If the $InputObject is not already a [Compare] object, we cast it as a [Compare] object, and recursively call the .Equals() method.

        # Else (input object is not a Compare object)
        # Convert input object to Compare and recurse
        Else
            {
            return $This.Equals( [Compare]$InputObject )
            }
        }

Lastly we are going to add a .ToString() override. This isn’t needed for the normal functioning of [Compare], but it can help when troubleshooting. Without this override, it is sometimes harder to see what properties we have when their values are nested [Compare] objects. For example, ([Compare]@( 46, 'Banana', $FileA ) ).Properties would result in this:

Name                           Value
----                           -----
2                              Compare
1                              Compare
0                              Compare

So we implement an override for .ToString() that displays the values of simple [Compare] objects and only resorts to ‘{Compare}’ for objects that can’t easily be reduced to a single string.

    # ToString override
    # ( This helps in troubleshooting by making it easier to see what properties
    # are extracted by using ([compare]$Object).Properties )
    [string] ToString()
        {
        If ( $This.Properties.Keys.Count -eq 1 -and $This.Properties.Value )
            {
            return $This.Properties.Value
            }
        Else
            {
            return '{Compare}'
            }
        }
    }

With the .ToString() override, ([Compare]@( 46, 'Banana', $FileA ) ).Properties now results in this:

Name                           Value
----                           -----
2                              {Compare}
1                              Banana
0                              46

The full class definition can be found at the end of the article.

Using the custom class to compare objects for equality

Here is the syntax for using the custom class:

If ( [Compare]$ObjectA -eq $ObjectB ) { Do-Something }

Let’s test it.

It was designed for comparing complex objects, but first let’s ensure it works as expected with simple objects.

$NumberA = 6
$NumberB = 6
$NumberC = 7
[Compare]$NumberA -eq $NumberB  # yields $True
[Compare]$NumberA -eq $NumberC  # yields $False

$IntegerA = 7
$DoubleB = 7.0
$StringC = '7'
[Compare]$IntegerA -eq $DoubleB  # yields $True
[Compare]$IntegerA -eq $StringC  # yields $True

So far, so good.

How does it handle file info objects?

$FileA = Get-Item C:\Temp\Test.ps1
$FileB = Get-Item C:\Temp\Test.ps1
[Compare]$FileA -eq $FileB  # yields $True

That’s what we want. Unlike using -eq without [Compare], two objects with the same data about the same file show as equal.

But remember that we are comparing all comparable properties. That allows us to do this.

$FileName = 'C:\Temp\Test.txt'
'abcd' | Out-File $FileName -Force
$FileA = Get-Item $FileName
$FileA.Refresh()
'efgh' | Out-File $FileName -Append
$FileB = Get-Item $FileName
[Compare]$FileA -eq $FileB  # yields $False

Those two objects are pointing to the same file path, but they have different values for LastWriteTime. (You have to dig past the display values to see the difference as they only differ by a few milliseconds.)

(FileInfo and DirectoryInfo objects have a quirk wherein the metadata is not queried and populated until you first ask for it. Thus we had to use the .Refresh() to populate the metadata for $FileA before changing the file to see the difference. I talk about that in detail here.)

It also works with hashtables.

$HashtableA = @{ A = 2; B = 3 }
$HashtableB = @{ A = 2; B = 3 }
$HashtableC = @{ A = 46; C = 'Banana' }
[Compare]$HashtableA -eq $HashtableB  # yields $True
[Compare]$HashtableA -eq $HashtableC  # yields $False

And because we recurse hashtable values, it works with more complex hashtable values.

$HashtableA = @{ A = @{ B = 3 } }
$HashtableB = @{ A = @{ B = 3 } }
[Compare]$HashtableA -eq $HashtableB  # yields $True

And it knows different processes are different.

$ProcessA = Start-Process notepad -PassThru
$ProcessB = Start-Process notepad -PassThru
[Compare]$ProcessA -eq $ProcessB  # yields $False

However, we also get this result:

$ProcessA = Start-Process notepad -PassThru
$ProcessB = Get-Process -Id $ProcessA.Id
[Compare]$ProcessA -eq $ProcessB  # yields $False

Despite the fact that $ProcessA and $ProcessB both refer to the same process, [Compare] sees them as different because we are comparing ALL of the comparable values, including the amount of CPU, memory, and handles in use, which varies from one microsecond to the next. [Compare] isn’t the ideal solution for every scenario.

And it works for arrays.

$ArrayA = @( 2, 3 )
$ArrayB = @( 2, 3 )
$ArrayC = @( 46, 'Banana' )
$ArrayD = @( 'Banana', 46 )
[Compare]$ArrayA -eq $ArrayB  # yields $True
[Compare]$ArrayA -eq $ArrayC  # yields $False
[Compare]$ArrayC -eq $ArrayD  # yields $False

With [Compare], order matters. Two arrays are not equal simply because they have equal elements. The elements have to be in the same order. If you don’t care about order, you can try sorting the arrays before comparing, which will work for objects that are easily sortable.

[Compare]$ArrayC -eq $ArrayD                        # yields $False
[Compare]( $ArrayC | Sort ) -eq ( $ArrayD | Sort )  # yields $True

Full Compare class definition

class Compare
    {
    [HashTable]$Properties = @{}

    # Constructor for hashtable
    Compare ( [hashtable]$Hashtable )
        {
        # For each property...
        ForEach ( $Key in $Hashtable.Keys )
            {
            # Convert the value to a Compare object and add it to .Properties
            $This.Properties.Add( $Key, [Compare]$Hashtable.$Key )
            }
        }

    # Constructor for array and arraylists
    Compare ( [array]$Array )
        {
        # If array is empty
        # Add an empty array to Properties
        If ( $Array.Count -eq 0 )
            {
            $This.Properties.Add( 'Value', @() )
            }
              
        # Else (array is not empty)
        # Cast each element to Compare and add to properties
        Else
            {
            ForEach ( $i in 0..($Array.Count - 1) )
                {
                $This.Properties.Add( "$i", [Compare]$Array[$i] )
                }
            }
        }

    # Constructor to convert any object
    Compare ( $Object )
        {
        # If object is already comparable (built-in Equals method)
        # Put the value of the object in Properties
        If ( @( $Object.GetType().DeclaredMethods.Name ) -contains 'Equals' )
            {
            $This.Properties.Add( 'Value', $Object )
            }

        # Else (no built in Equals method )
        # Extract the comparable properties
        Else
            {
            # For each object property...
            ForEach ( $Property in $Object | Get-Member -MemberType Properties |
                    Select-Object -ExpandProperty Name )
                {
                # If the property value is not null...
                If ( $Object.$Property -ne $Null )
                    {
                    # If the property is an enumeration
                    # Add it to Properties
                    If ( $Object.$Property -is [enum] )
                        {
                        $This.Properties.Add( $Property, $Object.$Property )
                        }

                    # If the object has an override for method .Equals()
                    # Add it to Properties
                    ElseIf ( $Object.$Property.GetType().DeclaredMethods.Name -contains 'Equals' )
                        {
                        $This.Properties.Add( $Property, $Object.$Property )
                        }
                    # Ignore any property that does not have an Equals method
                    # (thereby ignoring such properties when comparing objects)
                    }
                }
            }
        }

    # Equals override
    # This is the method that is used by -eq and -ne operators
    [boolean] Equals ( $InputObject )
        {
        # If the input object is already a Compare object...
        If ( $InputObject -is [Compare] )
            {
            # If the two objects have different property names
            # Return not equal
            If ( Compare-Object -ReferenceObject $This.Properties.Keys -DifferenceObject $InputObject.Properties.Keys )
                {
                return $False
                }

            # For each property...
            ForEach ( $Key in $This.Properties.Keys )
                {
                # If the property values are not equal
                # Return not equal
                If ( $This.Properties.$Key -ne $InputObject.Properties.$Key )
                    {
                    return $False
                    }
                }

            # If we got this far, all tests above passed
            # Return equal
            return $True
            }

        # Else (input object is not a Compare object)
        # Convert input object to Compare and recurse
        Else
            {
            return $This.Equals( [Compare]$InputObject )
            }
        }

    # ToString override
    # ( This helps in troubleshooting by making it easier to see what properties
    # are extracted by using ([compare]$Object).Properties )
    [string] ToString()
        {
        If ( $This.Properties.Keys.Count -eq 1 -and $This.Properties.Value )
            {
            return $This.Properties.Value
            }
        Else
            {
            return '{Compare}'
            }
        }
    }

Usage

If ( [Compare]$ObjectA -eq $ObjectB ) { Do-Something }

2 comments:

  1. Didn't work for my objects.

    I pass two objects, $a and $b, that differ in here:

    $a.parameters[0].type

    The difference was not caught and object claimed to be identical.

    I use this method instead:

    function Compare-UsingJson {
    param(
    [parameter(Mandatory=$true)]
    [PSCustomObject] $firstObject,
    [parameter(Mandatory=$true)]
    [PSCustomObject] $secondObject
    )
    $first = $firstObject | ConvertTo-Json -Depth 20 -Compress
    $second = $secondObject | ConvertTo-Json -Depth 20 -Compress
    return ($first -eq $second)
    }

    Primitive, wasteful, but gets the job done (for me).

  2. Excellent post. Really well explains the implementation and the short-comings of posh (at least v5.1) when comparing complex objects. This behaviour is pretty unexpected. Good stuff. Nice [Compare] class too. Really works well.
