Tuesday, August 13, 2013

Delete all blank lines from a text file using PowerShell

( Get-Content $FilePath ) | Where { $_ } | Set-Content $FilePath

Or for those inclined towards brevity:

(GC $FP)|?{$_}|SC $FP

We generally use PowerShell for larger, more complex tasks.  But don’t you love elegant one-liners?
This one line will strip out all of the blank lines from a text file.

How it works

In short, we are reading all of the lines of the file, and then writing them back to the file, skipping any lines that are blank.

We presume variable $FilePath contains the string with the full path to the file we are working with.

$FilePath = "D:\Test\LogFile.txt"

We put parentheses around the Get-Content statement because, if we don’t, the end of the pipeline will be trying to write to the same file that the beginning of the pipeline is still reading from, and the command will throw an error.  The parentheses force PowerShell to finish loading the contents into an object before sending them down the pipeline.  (If we were writing to a different file than the one we were reading from, we could speed up the command by eliminating the parentheses, thus allowing us to read from the one and write to the other simultaneously.)
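As a minimal sketch of that last point, here is the same filter writing to a second file, with no parentheses.  The file names and the temp folder are hypothetical, chosen only so the example can run anywhere.

```powershell
# Hypothetical sample paths in the temp folder (not from the original post).
$Source = Join-Path ([System.IO.Path]::GetTempPath()) 'BlankLineDemo-In.txt'
$Target = Join-Path ([System.IO.Path]::GetTempPath()) 'BlankLineDemo-Out.txt'

# Build a small sample file containing two empty lines.
Set-Content -Path $Source -Value 'first', '', 'second', '', 'third'

# Source and target are different files, so no parentheses are needed:
# each line can flow down the pipeline as soon as it is read.
Get-Content $Source | Where-Object { $_ } | Set-Content $Target

# Read the result back; @() keeps it an array even if only one line survives.
$Result = @(Get-Content $Target)
```

$Result should hold only the three non-empty lines, and the source file is left untouched.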

The individual items traveling through the pipeline will be strings representing each line of the contents.  The Where { $_ } clause effectively converts each string to a boolean value for testing.  Every non-empty string evaluates as $True and is passed on to be rewritten back to the file.  Every empty string evaluates as $False, and is not passed on.
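We can see the conversion for ourselves by casting a few strings to [bool].  The variable names here are only for illustration.

```powershell
# An empty string converts to $False.
$Empty = [bool]''

# Any non-empty string converts to $True.
$Text = [bool]'some text'

# A string holding only spaces is still non-empty, so it is also $True.
$Spaces = [bool]'   '

$Empty    # False
$Text     # True
$Spaces   # True
```

The third case is worth noting: a line made up of nothing but spaces is not empty, so Where { $_ } lets it through.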

Our original version of the code will only strip out completely empty lines:

( Get-Content $FilePath ) | Where { $_ } | Set-Content $FilePath

But if our "blank" lines are filled with spaces and tabs, we need to strip those characters out as well. 

In PowerShell, every value is an object, and objects come with "methods", which are built-in pieces of code that we can leverage.  If we pipe a string to Get-Member, it gives us a list of the methods available for strings.

$X = ""
$X | Get-Member

Or just…

"" | Get-Member

One of the string methods is Trim, which can be used to strip undesired characters from the start and end of a string.

$X.Trim(" ")
Will return a copy of string variable $X with any spaces removed from the start and end.

$X.Trim("`t")
Will do the same for tabs.

$X.Trim(" `t")
Will remove any mix of spaces and tabs from the start and end.

When converted to a Boolean value, $X.Trim(" `t") will evaluate to $False if $X has nothing in it but spaces and/or tabs.  (Or nothing in it at all.)
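A quick sketch of that test, run against a handful of in-memory strings rather than a file.  The sample strings and variable names are made up for the demonstration.

```powershell
# Four sample "lines": empty, spaces only, tabs and a space, and real text.
$Samples = '', '   ', "`t `t", '  text  '

# Keep only the strings that still contain something after trimming.
$Kept = @($Samples | Where-Object { $_.Trim(" `t") })

$Kept.Count   # 1
```

Only '  text  ' survives, and it keeps its own leading and trailing spaces, because Trim is used here only for the test and its result is never written anywhere.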

So, to strip out all blank lines, including those with spaces and tabs, from a text file, we can use:

( Get-Content $FilePath ) | Where { $_.Trim(" `t") } | Set-Content $FilePath

Or

(GC $FP)|?{$_.Trim(" `t")}|SC $FP
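To try the command safely, we can point it at a throwaway file first.  The path and the sample contents below are hypothetical.

```powershell
# Hypothetical scratch file in the temp folder.
$FilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'BlankLineDemo-Trim.txt'

# Sample contents: one empty line, one line of spaces, one line with a tab.
Set-Content -Path $FilePath -Value 'alpha', '', '   ', "`t", '  beta  ', 'gamma'

# Rewrite the file in place, dropping empty and whitespace-only lines.
( Get-Content $FilePath ) | Where-Object { $_.Trim(" `t") } | Set-Content $FilePath

# Read the result back; @() keeps it an array even if only one line survives.
$Result = @(Get-Content $FilePath)
```

The file should now contain three lines: alpha, the padded beta line with its spaces intact, and gamma.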

(If you need to get fancier, one or more Select-String statements can be used instead of our Where-Object command, but getting fancy with the "regular expressions" that Select-String uses can get quite complicated and confusing.  When all we need is simple, let’s keep it simple.)
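For the curious, here is one way a Select-String version might look.  This is only a sketch, not a recommendation: the pattern '\S' (any non-whitespace character) and the scratch file are assumptions chosen for the example, and '\S' treats every kind of whitespace as blank, not just spaces and tabs.

```powershell
# Hypothetical scratch file in the temp folder.
$FilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'BlankLineDemo-Regex.txt'
Set-Content -Path $FilePath -Value 'first', '', "  `t", 'second'

# Select-String emits one match object per matching line; its Line property
# holds the original text.  The parentheses are still needed because we are
# writing back to the file we are reading from.
( Select-String -Path $FilePath -Pattern '\S' | ForEach-Object { $_.Line } ) |
    Set-Content $FilePath

# Read the result back; @() keeps it an array even if only one line survives.
$Result = @(Get-Content $FilePath)
```

The file should end up with just the two lines that contain text.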

1 comment:

  1. OMG thank you for such an under-engineered, to-the-point article. Thank you for using the absolute simplest technique, and for explaining each little piece. Nice surprise as I was dreading yet another long-winded, 15-line block to accomplish the same thing. #Bookmarked.
