Sunday, January 7, 2018

FileInfo and DirectoryInfo objects are not populated upon creation in PowerShell

When we use Get-Item or Get-ChildItem to get files and folders in PowerShell, most of the properties of the resulting FileInfo and DirectoryInfo objects are not actually populated until one of them is used for the first time. To capture point-in-time information, we can call the .Refresh() method to force the object to populate or repopulate its data.

Unintuitive normal behavior

Let’s imagine a scenario where we want to capture the LastWriteTime of a file before and after a change.

This seems like it would be easy. Just capture Get-Item before and after and then compare the LastWriteTimes.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get Before FileInfo object
$FileBefore = Get-Item $FileName

# Modify file
'efgh' | Out-File $FileName -Append

# Get After FileInfo object
$FileAfter = Get-Item $FileName

# Show results
$FileBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )
$FileAfter. LastWriteTime.ToString( 'h:mm:ss.fffffff' )

12:12:06.9729853
12:12:06.9729853

But the results show the two file objects have identical LastWriteTimes, within 100 nanoseconds of each other.

That doesn’t seem right, so let’s add a few seconds between the two events.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get Before FileInfo object
$FileBefore = Get-Item $FileName

# Pause for a bit
Start-Sleep -Seconds 2

# Modify file
'efgh' | Out-File $FileName -Append

# Get After FileInfo object
$FileAfter = Get-Item $FileName

# Show results
$FileBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )
$FileAfter. LastWriteTime.ToString( 'h:mm:ss.fffffff' )

12:13:55.2325050
12:13:55.2325050

PowerShell still seems to be claiming that two events that actually happened at least 2 seconds apart were simultaneous. As they both happened at the same location, Einstein’s relativity doesn’t apply, so something else must be going on here.

Explanation

After experimenting for a bit and being completely stumped, I went to the source: the Microsoft .NET Framework 4.7.1 Reference Source, where the FileInfo and DirectoryInfo classes are actually defined. The code is in fileinfo.cs and directoryinfo.cs, as well as in filesysteminfo.cs, which defines the base class that the other two inherit from. https://referencesource.microsoft.com/#mscorlib/system/io/fileinfo.cs

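It turns out that when a FileInfo or DirectoryInfo object is first created, it is essentially empty, with only the path (FullName) defined. None of the properties that come from the file system, such as the timestamps, attributes, and length, have been read yet.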

The first time you access one of the other properties, the object internally runs its .Refresh() method to populate all of the property values.

So in the above code, FileInfo object $FileBefore did not have a LastWriteTime until we asked it for the LastWriteTime, at which point it queried the file system for the current LastWriteTime (and other properties), and not the LastWriteTime as it existed when we first created $FileBefore.
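
The same effect can be reproduced without Get-Item by constructing the .NET object directly. The following is a minimal sketch, not taken from the reference source; the scratch file name LazyFileInfoDemo.txt and the variable names are assumptions made for illustration.

```powershell
# Hypothetical scratch file in the temp folder (an assumption for this sketch)
$DemoPath = Join-Path ( [System.IO.Path]::GetTempPath() ) 'LazyFileInfoDemo.txt'

# Create file
'abcd' | Out-File $DemoPath -Force

# Construct the FileInfo object directly; no property has been read yet
$Info = New-Object System.IO.FileInfo $DemoPath

# Pause for a bit, then modify file
Start-Sleep -Seconds 2
'efgh' | Out-File $DemoPath -Append

# First property access, which is when the file system is actually queried
$FirstRead = $Info.LastWriteTime

# Ask the file system directly for the current LastWriteTime
$Current = [System.IO.File]::GetLastWriteTime( $DemoPath )

# Compare the two readings
$FirstRead -eq $Current

# Clean up
Remove-Item $DemoPath
```

If the object is populated on first access as described above, the comparison should return True, because $Info picked up the timestamp of the second write rather than the one in effect when the object was constructed.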

We can show a more extreme, non-intuitive result.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get Before FileInfo object
$FileBefore = Get-Item $FileName

# Pause for a bit
Start-Sleep -Seconds 2

# Modify file
'efgh' | Out-File $FileName -Append

# Get After FileInfo object
$FileAfter = Get-Item $FileName

# Show results
$FileAfter.LastWriteTime.ToString( 'h:mm:ss.fffffff' )

# Pause for a bit
Start-Sleep -Seconds 2

# Modify file
'ijkl' | Out-File $FileName -Append

# Show results
$FileBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )

12:15:18.6591734
12:15:20.6783904

In the above code, we created $FileBefore before $FileAfter, but $FileBefore shows a LastWriteTime after $FileAfter because of when we first queried their respective properties.

This lazy loading presumably exists so that when you run Get-ChildItem C:\ -Recurse, it doesn’t spend the time and memory loading the details of 400,000 files unless and until you actually need them later in the script.

Workaround

But in our scenario, we don’t want that behavior.

We can work around it by accessing one or more properties sooner. If we don’t need to access any properties sooner, we can directly do what accessing the first property does, which is to call the .Refresh() method to populate the data.

That looks like this.

# Get Before FileInfo object
$FileBefore = Get-Item $FileName
$FileBefore.Refresh()


Or like this.

# Get Before FileInfo object
( $FileBefore = Get-Item $FileName ).Refresh()

So now our modified code works as expected.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get Before FileInfo object
$FileBefore = Get-Item $FileName
$FileBefore.Refresh()

# Pause for a bit
Start-Sleep -Seconds 2

# Modify file
'efgh' | Out-File $FileName -Append

# Get After FileInfo object
$FileAfter = Get-Item $FileName

# Show results
$FileBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )
$FileAfter. LastWriteTime.ToString( 'h:mm:ss.fffffff' )

12:17:00.9741507
12:17:02.9781176

( Note: If you take out the 2-second Sleep, this code sometimes shows the times as simultaneous again. But that’s simply because computers are very fast and their time keeping isn’t really as precise as the 100 nanosecond precision implied by the display values. Sometimes the time difference is too short for the computer to measure.)
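
The same workaround can be applied to several files at once by calling .Refresh() on each object as it comes out of Get-ChildItem. This is a sketch under assumptions: the RefreshDemo folder and the file names One.txt and Two.txt are made up for the example.

```powershell
# Hypothetical scratch folder and files (assumptions for this sketch)
$DemoFolder = Join-Path ( [System.IO.Path]::GetTempPath() ) 'RefreshDemo'
New-Item $DemoFolder -ItemType Directory -Force | Out-Null
'abcd' | Out-File ( Join-Path $DemoFolder 'One.txt' ) -Force
'abcd' | Out-File ( Join-Path $DemoFolder 'Two.txt' ) -Force

# Get Before objects, populating each one as it is collected
$FilesBefore = Get-ChildItem $DemoFolder |
    ForEach-Object { $_.Refresh(); $_ }

# Pause for a bit
Start-Sleep -Seconds 2

# Modify one file
'efgh' | Out-File ( Join-Path $DemoFolder 'One.txt' ) -Append

# Get After objects
$FilesAfter = Get-ChildItem $DemoFolder

# Show results for the modified file
$OneBefore = $FilesBefore | Where-Object { $_.Name -eq 'One.txt' }
$OneAfter  = $FilesAfter  | Where-Object { $_.Name -eq 'One.txt' }
$OneBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )
$OneAfter.LastWriteTime.ToString( 'h:mm:ss.fffffff' )

# Clean up
Remove-Item $DemoFolder -Recurse
```

Refreshing inside the pipeline gives up the time and memory savings of lazy loading, so it is probably only worth doing when the point-in-time values are actually needed.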

Going further

.Refresh() doesn’t just work for populating the properties. We can use it to update the properties at any time.

This means we don’t have to run a new Get-Item and create the $FileAfter variable at all. We can simply update the properties for a single $File variable.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get FileInfo object
$File = Get-Item $FileName
$File.Refresh()
$TimeBefore = $File.LastWriteTime

# Pause for a bit
Start-Sleep -Seconds 2

# Modify file
'efgh' | Out-File $FileName -Append

# Update FileInfo object
$File.Refresh()
$TimeAfter = $File.LastWriteTime

# Show results
$TimeBefore.ToString( 'h:mm:ss.fffffff' )
$TimeAfter. ToString( 'h:mm:ss.fffffff' )

12:26:48.2342763
12:26:50.2495910
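
If only a few values matter, another option is to copy them out of the object at capture time so that a later .Refresh() cannot change them. The helper below is a hypothetical sketch; the function name Get-FileSnapshot, its choice of properties, and the scratch file name are assumptions and not part of PowerShell.

```powershell
# Hypothetical helper: returns a fixed copy of selected file properties
function Get-FileSnapshot
{
    param ( [string]$Path )

    # Get the FileInfo object and force it to populate now
    $Item = Get-Item $Path
    $Item.Refresh()

    # Copy the values into a new object that is detached from the file
    [pscustomobject]@{
        FullName      = $Item.FullName
        Length        = $Item.Length
        LastWriteTime = $Item.LastWriteTime
    }
}

# Hypothetical scratch file in the temp folder (an assumption for this sketch)
$DemoPath = Join-Path ( [System.IO.Path]::GetTempPath() ) 'SnapshotDemo.txt'

# Create file and take the first snapshot
'abcd' | Out-File $DemoPath -Force
$SnapBefore = Get-FileSnapshot $DemoPath

# Pause for a bit, modify file, and take the second snapshot
Start-Sleep -Seconds 2
'efgh' | Out-File $DemoPath -Append
$SnapAfter = Get-FileSnapshot $DemoPath

# Show results
$SnapBefore.LastWriteTime.ToString( 'h:mm:ss.fffffff' )
$SnapAfter.LastWriteTime.ToString( 'h:mm:ss.fffffff' )

# Clean up
Remove-Item $DemoPath
```

Because each snapshot holds plain copied values rather than a live FileInfo object, the Before values stay fixed no matter what later happens to the file.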

What happens if we delete the file?

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get FileInfo object
$FileBefore = Get-Item $FileName
$FileBefore.Refresh()

# Delete file
$FileBefore.Delete()

# Update FileInfo object
$FileBefore.Refresh()

# Show results
$FileBefore | Format-List *

PSPath : Microsoft.PowerShell.Core\FileSystem::C:\Temp\Test.txt
PSParentPath : Microsoft.PowerShell.Core\FileSystem::C:\Temp
PSChildName : Test.txt
PSDrive : C
PSProvider : Microsoft.PowerShell.Core\FileSystem
PSIsContainer : False
Mode : darhsl
VersionInfo :
BaseName : Test
Target :
LinkType :
Name : Test.txt
Length :
DirectoryName : C:\Temp
Directory : C:\Temp
IsReadOnly : True
Exists : False
FullName : C:\Temp\Test.txt
Extension : .txt
CreationTime : 1/7/2018 12:12:06 PM
CreationTimeUtc : 1/7/2018 6:12:06 PM
LastAccessTime : 1/7/2018 12:12:06 PM
LastAccessTimeUtc : 1/7/2018 6:12:06 PM
LastWriteTime : 1/7/2018 12:18:45 PM
LastWriteTimeUtc : 1/7/2018 6:18:45 PM
Attributes : -1

If we populated the properties before the delete, when we .Refresh() them after the delete, three properties are updated. .Exists is set to $False, .Attributes is set to -1, and .Mode is set to “darhsl”. All of the dates remain as they were in the previous refresh.

# Define file name
$FileName = 'C:\Temp\Test.txt'

# Create file
'abcd' | Out-File $FileName -Force

# Get Before FileInfo object
$FileBefore = Get-Item $FileName

# Delete file
$FileBefore.Delete()

# Update FileInfo object
$FileBefore.Refresh()

# Show results
$FileBefore | Format-List *

PSPath : Microsoft.PowerShell.Core\FileSystem::C:\Temp\Test.txt
PSParentPath : Microsoft.PowerShell.Core\FileSystem::C:\Temp
PSChildName : Test.txt
PSDrive : C
PSProvider : Microsoft.PowerShell.Core\FileSystem
PSIsContainer : False
Mode : darhsl
VersionInfo :
BaseName : Test
Target :
LinkType :
Name : Test.txt
Length :
DirectoryName : C:\Temp
Directory : C:\Temp
IsReadOnly : True
Exists : False
FullName : C:\Temp\Test.txt
Extension : .txt
CreationTime : 12/31/1600 6:00:00 PM
CreationTimeUtc : 1/1/1601 12:00:00 AM
LastAccessTime : 12/31/1600 6:00:00 PM
LastAccessTimeUtc : 1/1/1601 12:00:00 AM
LastWriteTime : 12/31/1600 6:00:00 PM
LastWriteTimeUtc : 1/1/1601 12:00:00 AM
Attributes : -1

If we did not populate the properties until after the delete, .Exists, .Attributes, and .Mode are again set to $False, -1, and “darhsl”, respectively. All dates are set to the beginning of the Windows file system time epoch, which is midnight, 1/1/1601, UTC (shown as 12/31/1600 6:00:00 PM in the local time zone above).
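
That epoch value can be produced directly, which gives a simple way to recognize the placeholder dates. This is a minimal sketch; the file name NoSuchFile-FileInfoDemo.txt is an assumption, chosen on the expectation that no such file exists in the temp folder.

```powershell
# A file time of 0 corresponds to the start of the Windows file time epoch
$Epoch = [DateTime]::FromFileTimeUtc( 0 )
$Epoch.ToString( 'yyyy-MM-dd HH:mm:ss' )

# Hypothetical path to a file that does not exist (an assumption for this sketch)
$MissingPath = Join-Path ( [System.IO.Path]::GetTempPath() ) 'NoSuchFile-FileInfoDemo.txt'
$Missing = New-Object System.IO.FileInfo $MissingPath

# Populate the object, then check for the placeholder date
$Missing.Refresh()
$Missing.Exists
$Missing.LastWriteTimeUtc -eq $Epoch
```

The first value shown should be the start of 1601, and for a file that was never there the last two lines should report False and True. A check like this only identifies objects that were first populated after the file was gone; as shown earlier, an object populated before the delete keeps its old dates.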

Conclusion

Computers are complicated.
