Regex to split a Pascal-case string

Postcard from the bowels of the regex beast!

I want to derive some exception classes and pass in a message that comes from the relevant value of ErrorCategory.

ErrorCategory is an enum with 32 values that are no-whitespace strings in Pascal case:

[Enum]::GetValues(
    [System.Management.Automation.ErrorCategory])

NotSpecified
OpenError
CloseError
DeviceError
DeadlockDetected
# ...etc

That [System.Enum]::GetValues() method works on any enum, e.g. [System.DayOfWeek]

I don’t want to text-edit them all myself, we have computers for that.

Step 1, split them:

[System.Enum]::GetValues(
    [System.Management.Automation.ErrorCategory]) |
    select -First 1 | foreach {
        [regex]::Matches(
            $_,
            '[A-Z][a-z]*'
        ).Value
    }

Not
Specified

Does anyone else select only the first item while work is in progress? Saves some scrolling. On which note, sorry about the awkward spacing, this blog theme makes horizontal space very precious.

Complete snippet:

[System.Enum]::GetValues(
    [System.Management.Automation.ErrorCategory]) |
    foreach {
        $Words = [regex]::Matches(
            $_,
            '[A-Z][a-z]*'
        ).Value

        $Words = @($Words[0]) + ($Words | select -Skip 1 | %{$_.ToLower()})
        $Words -join ' '
    }

Not specified
Open error
Close error
Device error
Deadlock detected
# ...etc

My love/hate relationship with regex continues.

Principles for exception-handling design

I came across the answer by mikera to this question on stackexchange, and thought it worth recycling (the rest of this post is her/his words):

Consider exceptions as part of the interface to each function / modules - i.e. document them and where appropriate / if your language supports it then use checked exceptions.
Never fudge a failure - if it failed, don’t try to continue with some attempt to “assume” how to proceed. For example, special handling for null cases is often a code smell to me: if your method needs a non-null value, it should be throwing an exception immediately if it encounters a null (either NullPointerException or ideally a more descriptive IllegalArgumentException).
Write unit tests for exceptional cases as well as normal ones - sometimes such tests can be tricky to set up but it is worth it when you want to be sure that your system is robust to failures
Log at the point where the error was caught and handled (assuming the error was of sufficient severity to be logged) . The reason for this is that it implies you understood the cause and had an approach to handle the error, so you can make the log message meaningful…..
Use exceptions only for truly unexpected conditions/failures. If your function “fails” in a way that is actually expected (e.g. polling to see if more input is available, and finding none) then it should return a normal response (“no input available”), not throw an exception
Fail gracefully (credit to Steven Lowe!) - clean up before terminating if at all possible, typically by unwinding changes, rolling back transaction or freeing resources in a “finally” statement or equivalent. The clean-up should ideally happen at the same level at which the resources were committed for the sake of clarity and logical consistency.
If you have to fail, fail loudly - an exception that no part of your code was able to handle (i.e. percolated to the top level without being caught + handled) should cause an immediate, loud and visible failure that directs your attention to it. I typically halt the program or task and write a full exception report to System.out.

Using delegates in Powershell, part 2

In part 1, I described the problem of tangling up normal logic with error-handling logic and how this leads to complex code. I hinted at a solution involving passing a scriptblock that gets invoked on exceptions.

Here is some perfectly-functional code that follows on from part 1:

#ApiClient.psm1

function Invoke-ApiCall {
    [CmdletBinding()]
    param(
        [Action[System.Management.Automation.ErrorRecord, ref]]$ErrorCallback
    )

    $IsRetryable = $false
    do {

        try {
            #Fake out an authentication error from Invoke-RestMethod
            if ([bool](Get-Random -Minimum 0 -Maximum 2)) {
                return [psobject]@{
                    prop1 = 'data1'
                    prop2 = 'data2'
                }

            } else {
                throw "401"
            }

        } catch {
            if ($PSBoundParameters.ContainsKey('ErrorCallback')) {
                $ErrorCallback.Invoke($_, ([ref]$IsRetryable))
            } else {
                throw
            }
        }

    } while ($IsRetryable)
}

#CallingCode.psm1

function Get-AuthToken {
    #Dummy, for demonstration purposes
    #This would presumably prompt the user for credentials
    Write-Verbose "Prompting user for creds"
}

function Get-Objects {
    param(
        $AuthToken
    )

    begin {
        $OutputArray = @()
    }

    process {
        $Retry = $false
        $MaxRetries = 3
        do {
            try {
                $Objects = Invoke-ApiCall
                $Retry = $false
            } catch {
                Write-Warning $_.Exception.Message
                if ($_ -match '401') {$Retryable = $true}
                if ($Retryable) {
                    $MaxRetries -= 1
                    if ($MaxRetries -gt 0) {
                        $Retry = $true
                    }
                }
                if ($_ -match '401') {
                    $AuthToken = Get-AuthToken -Renew
                }
            }
        } while ($Retry)
    }

    end {
        return $Objects
    }
}

In the code above, our retry logic and our error-handling logic are inextricably tied up with our normal processing. As business requirements change and we need to add more logic, this will get worse. That’s the problem we want to solve.

Below, we have added an ErrorCallback parameter to the ApiClient module. We’ve added a little complexity there, but not too much. Note that the higher-level module can slot in error-handling or not, as it pleases:

#ApiClient.psm1

function Invoke-ApiCall {
    [CmdletBinding()]
    param(
        [Action[System.Management.Automation.ErrorRecord, ref]]$ErrorCallback
    )

    $IsRetryable = $false
    do {

        try {
            #Fake out an authentication error from Invoke-RestMethod
            if (Get-Random $true, $false) {
                return [psobject]@{
                    prop1 = 'data1'
                    prop2 = 'data2'
                }

            } else {
                throw "401"
            }

        } catch {
            if ($PSBoundParameters.ContainsKey('ErrorCallback')) {
                $ErrorCallback.Invoke($_, ([ref]$IsRetryable))
            } else {
                throw
            }
        }

    } while ($IsRetryable)
}

We create a function inside the higher-level module that holds all the logic for error-handling and retrying. Note that we don’t export this function; that keeps our interface surface small.

#CallingCode.psm1

[uint16]$Script:RetryCount = 0

function Handle-Error {
    [CmdletBinding()]
    [OutputType([void])]
    param(
        [System.Management.Automation.ErrorRecord]$ErrorRecord,
        [ref]$IsRetryable
    )

    if ($ErrorRecord.Exception.Message -match '401') {
        if ($Script:RetryCount -lt 3) {
            Get-AuthToken
            $Script:RetryCount += 1
            $IsRetryable.Value = $true
        } else {
            $Script:RetryCount = 0
            $IsRetryable.Value = $false
            Write-Verbose "Retry count exceeded"
        }
    }
}


function Get-AuthToken {
    #Dummy, for demonstration purposes
    #This would presumably prompt the user for credentials
    Write-Verbose "Prompting user for creds"
}


function Get-Objects {
    param(
        $AuthToken
    )

    begin {
        $OutputArray = @()
    }

    process {
        $Objects = Invoke-ApiCall -ErrorCallback (Get-Item Function:\Handle-Error).ScriptBlock
    }

    end {
        return $Objects
    }
}

Export-ModuleMember Get-Objects

Benefits:

Although the complexity is comparable now, we can change the code in future without significantly increasing complexity
Changes in one module don’t require changes in the other
That’s because we have a small interface surface

Side note: C# devs will ask why I didn’t use Func, since I’m defining an out parameter? The answer is that, as far as I can see, you can’t do that in Powershell. Casting the Handle-Error scriptblock to Func gives [lambda_method(System.Runtime.CompilerServices.Closure, System.Management.Automation.ErrorRecord)] and the out parameter gets silently discarded ¯\_(ツ)_/¯

Let me cover this [Action[System.Management.Automation.ErrorRecord, ref]] type. From a Powershell perspective, this defines a scriptblock with two parameters:

ErrorRecord (which is what you get anyway from $_ in a catch block)
Ref. Ref types aren’t used that much in PS, but they do get their own topic. Objects are always ref anyway, but declaring a parameter as ref means that we can send value types such as string or bool as a reference, too; the key point being that when we change the value in the callback, it simultaneously gets changed in the ApiClient. That’s how we let the higher-level module update IsRetryable without having to export it as a variable and letting any old function update it.

This is the nature of delegates (in this use case, anyway): ApiClient can let external code alter its behaviour, but it gets to choose who and when. This keeps everything under control.

We’re gaining code safety, for very little extra complexity. Adding code like this may add an hour or two at dev time while you figure it out, but may save bugs in production when you start hitting edge cases. Bugs in production are more expensive than hours spent in dev, right? If you don’t believe me, try pushing some sloppy code ;-)

Migrating the blog to Github Pages

I started blogging on wordpress.com. It was not a carefully considered decision.

I didn’t like the interface much, or working directly in HTML. Posting code blocks was utter hell because the app would munch whitespace sequences, like a greedy housemate who eats all the cake except for one tiny slice.

I wanted to get away from it almost as soon as I started.

I LIKE TO MOVE IT, MOVE IT

Github Pages is great, because it fits in the workflow of a git user, and because you write posts in markdown instead of html. And this is a cool theme, no?

Any pushes to your blog repo on github.com result in Jekyll being invoked, in Github, to render your .md into .html. So it’s very simple once you’re up and running - but as a Windows person without prior experience of Ruby, I had some figuring out to do.

Existing guides

Exisiting guides mostly fall short for my scenario. You can follow steps to install Ruby, follow steps to create a github.io repo, follow steps to structure your folders… but this is the wrong approach. Here’s what worked for me:

1

Choose a theme. Just google for ‘jekyll theme site:github.com’ - they will all be hosted in Github.

Your blog posts are highly portable, your theme is maybe not so much. Find a Jekyll theme that you like on github, and fork it. Rename your fork to <username>.github.io, like this. Clone it locally.

2

Edit the stock text - for example, you’ll want to change any names or twitter handles, customise your “about” page, etc. This will vary according to the theme you pick, but it’s safe to say that there will be some existing .html and/or .md files when you clone your theme. Go through those and edit them.

3

Write a blogpost in .md, save it in the _posts folder, commit to master, and push. A fully-featured and themed blogpost will automatically appear at your new url! Unless… you fail to meet the…

Requirements

Your .md blogpost file must meet some syntax requirements for Jekyll to process it into html.

File name: this must exactly match the format laid out in your theme. I’m not sure if this is customisable, but my site requires my filenames to be in the _posts directory and look like yyyy-mm-dd-all-lowercase-title-with-dashes.md

Post header: header text must match the sample posts that your theme probably included. Mine looks like this:

 ---
 layout: post
 title: Migrating the blog to Github Pages
 date: 2017-10-11 01:04
 author: freddiesackur
 comments: true
 tags: [Jekyll, Ruby, meta]
 ---

Build errors?

If you don’t see your page appear after 5 minutes, then:

log into your github.io repo on Github
under Settings, scroll down to the Github Pages section and look at Build Errors

Migrating posts from wordpress

I exported my site from Wordpress and then used wpxml2jekyll to convert it. I copied all the resulting .md files into the _posts folder and… still had a lot to do. The pages are still full of static html, possibly because of the amount I had done to try to make my code look nice on WP.

There are many native importers from other blog sites on the Jekyll site, but they all assume that you have a functional Ruby installation. I got very headbangy trying to make that happen in a hurry, so please see my steps below and use chocolatey to install Ruby, not the Ruby downloads page.

A combination of the following was enough to convert any remaining html tags into their equivalent markdown:

VS Code’s Find-and-replace in all files
Regex matching for HTML tags, e.g. </?em> replace with **
Disregard for my own leisure time

Fortunately, it was a one-time task. Hopefully this won’t be so bad for you.

Building your site locally

One thing that isn’t great is that you have to publish your posts to see them. That doesn’t have to be the case if you run jekyll on a local ruby installation. Please, Windows users, do yourself a favour, and use chocolatey to install.

Install chocolatey, if you don’t already have it:

PS C:\dev\fsackur.github.io> Set-ExecutionPolicy Bypass; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
#Refresh your path variable
PS C:\dev\fsackur.github.io> $Env:Path = [Environment]::GetEnvironmentVariable('Path', 'Machine')

Install ruby:

PS C:\dev\fsackur.github.io> choco install ruby -y

Chocolatey v0.10.5
Installing the following packages:
ruby
By installing you accept licenses for the packages.
Progress: Downloading ruby 2.4.2.2... 100%

ruby v2.4.2.2 [Approved]
ruby package files install completed. Performing other installation steps.
Ruby is going to be installed in 'C:\tools\ruby24'
Installing 64-bit ruby...
ruby has been installed.
Environment Vars (like PATH) have changed. Close/reopen your shell to
 see the changes (or in powershell/cmd.exe just type `refreshenv`).
 The install of ruby was successful.
  Software installed to 'C:\tools\ruby24\'

Chocolatey installed 1/1 packages. 0 packages failed.
 See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

#Refresh your path (refreshenv may work; YMMV)
PS C:\dev\fsackur.github.io> $Env:Path = [Environment]::GetEnvironmentVariable('Path', 'Machine')

You should have Ruby now:

PS C:\dev\fsackur.github.io> ruby -v
ruby 2.4.2p198 (2017-09-14 revision 59899) [x64-mingw32]

Install jekyll:

PS C:\dev\fsackur.github.io> gem install jekyll
Fetching: public_suffix-3.0.0.gem (100%)
Successfully installed public_suffix-3.0.0
Fetching: addressable-2.5.2.gem (100%)
Successfully installed addressable-2.5.2
Fetching: colorator-1.1.0.gem (100%)
Successfully instal...

        ...umentation for rouge-2.2.1
Installing ri documentation for rouge-2.2.1
Parsing documentation for jekyll-3.6.0
Installing ri documentation for jekyll-3.6.0
Done installing documentation for public_suffix, addressable, colorator, rb-fsevent, ffi, rb-inotify, sass-listen, sass, jekyll-sass-converter, listen, jekyll-watch, kramdown, liquid, mercenary, forwardable-extended, pathutil, rouge, safe_yaml, jekyll after 19 seconds
19 gems installed

Install bundle

PS C:\dev\fsackur.github.io> gem install bundle
Fetching: bundler-1.15.4.gem (100%)
Successfully installed bundler-1.15.4
Fetching: bundle-0.0.1.gem (100%)
Successfully installed bundle-0.0.1
Parsing documentation for bundler-1.15.4
Installing ri documentation for bundler-1.15.4
Parsing documentation for bundle-0.0.1
Installing ri documentation for bundle-0.0.1
Done installing documentation for bundler, bundle after 6 seconds
2 gems installed

…wait for it! Add a gitiginore for the rendered site (this goes into _site and should not be published to Github. This file needs to go in the root of your repo):

"`n_site`n" | Out-File '.gitignore' -Append -Encoding ascii

Install any gems from the theme’s Gemfile:

PS C:\dev\fsackur.github.io> bundle install
Fetching gem metadata from https://rubygems.org/..........
Fetching version metadata from https://rubygems.org/..
Fetching dependency metadata from https://rubygems.org/.
Using public_suffix 3.0.0
Using bundler 1.15.4
Using ffi 1.9.18 (x64-mingw32)
Fetching jekyll-paginate 1.1.0
Installing jekyll-paginate 1.1.0
Using rb-fsevent 0.10.2
Using kramdown 1.15.0
Using rouge 2.2.1
Using addressable 2.5.2
Using rb-inotify 0.9.10
Using listen 3.0.8
Using jekyll-watch 1.5.0
Bundle complete! 5 Gemfile dependencies, 11 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.

Now you can build your site:

PS C:\dev\fsackur.github.io> jekyll build
Configuration file: C:/dev/fsackur.github.io/_config.yml
       Deprecation: The 'gems' configuration option has been renamed to 'plugins'. Please update your config file accordingly.
            Source: C:/dev/fsackur.github.io
       Destination: C:/dev/fsackur.github.io/_site
 Incremental build: disabled. Enable with --incremental
      Generating...
                    done in 0.779 seconds.
 Auto-regeneration: disabled. Use --watch to enable.

You should now see a folder called _site containing all your html in folders. You can drill down into it and open the posts! …just kidding, you will actually get some build failures first:

 C:/Ruby24-x64/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- kramdown (Load Error)
        from C:/Ruby24-x64/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/jekyll-3.6.0/lib/jekyll/plugin_manager.rb:48:in `require_from_bundler'
        from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/jekyll-3.6.0/exe/jekyll:11:in `<top (required)>'
        from C:/Ruby24-x64/bin/jekyll:23:in `load'
        from C:/Ruby24-x64/bin/jekyll:23:in `<main>'

The missing gem is named in the first line - in example above, it is kramdown. Add the following line to your Gemfile:

gem 'kramdown'

Gemfile and Gemfile.lock should not be in your .gitignore

Install with bundle:

PS C:\dev\fsackur.github.io> bundle install

Try to build the site again:

PS C:\dev\fsackur.github.io> jekyll build

Keep going until it succeeds.

Now start the jekyll server and set it to watch and rebuild on filesystem changes:

PS C:\dev\fsackur.github.io> jekyll serve --watch
Configuration file: C:/dev/fsackur.github.io/_config.yml
       Deprecation: The 'gems' configuration option has been renamed to 'plugins'. Please update your config file accord
ingly.
            Source: C:/dev/fsackur.github.io
       Destination: C:/dev/fsackur.github.io/_site
 Incremental build: disabled. Enable with --incremental
      Generating...
                    done in 0.616 seconds.
  Please add the following to your Gemfile to avoid polling for changes:
    gem 'wdm', '>= 0.1.0' if Gem.win_platform?
 Auto-regeneration: enabled for 'C:/dev/fsackur.github.io'
    Server address: http://127.0.0.1:4000
  Server running... press ctrl-c to stop.

You can see from the above that your site is available on http://127.0.0.1:4000. Get it right then push to Github.

I haven’t been able to get jekyll to consistently render all the styles correctly when running locally. If you have that issue, try pushing and see if your live site renders correctly.

Wrapping external utilities, using EMC's inq.exe as an example

To fetch WWNs for an iSCSI LUN, you might find yourself using inq.exe, which is provided by EMC.

This is an old-skool utility that runs in cmd. We do like our powershell and it is a frequent problem to pull data from a CLI utility, so I thought I’d post my process for doing so.

First, an intermediate (but functional) stage:

$Output = @()

$Lines = & .\inq.exe -winvol
$Lines = $Lines |
    select -Skip 6 |
    where {$_ -notmatch '^-*$'}

$HeaderRow = $Lines[0]; $Rows = $Lines[1..($Lines.Count)]
$ColumnHeaders = $HeaderRow -split ':'

foreach ($Row in $Rows) {
    $HT = @{}
    $Values = $Row -split ':'
    for ($i=0; $i -lt $ColumnHeaders.Count; $i++) {
        $HT += @{
            $ColumnHeaders[$i].Trim() = $Values[$i].Trim()
        }
    }
    $Output += (New-Object psobject -Property $HT)
}
$Output | ft

If we only get to the code above, we’ve accomplished the required result. (Full code is on my github; link at the bottom.)

To approach this task, I start in stages. So I might begin something like this:

$Text = & .\inq.exe -winvol
    $Text
    $Text.Count
Inquiry utility, Version V7.3-1305 (Rev 1.0) (SIL Version V7.3.1.0 (Edit Level 1305)
Copyright (C) by EMC Corporation, all rights reserved.
For help type inq -h.



-------------------------------------------------------------------------------------
DEVICE             :VEND   :PROD         :REV  :SER NUM    :CAP(kb)  :WIN vol
-------------------------------------------------------------------------------------
\\.\PHYSICALDRIVE0 :VMware :Virtual disk :1.0  :           :83884032 : C:
\\.\PHYSICALDRIVE1 :EMC    :SYMMETRIX    :5876 :8802182000 :0        : S:
\\.\PHYSICALDRIVE3 :EMC    :SYMMETRIX    :5876 :8804965000 :0        : T:
12

From the output, I can see that the preamble (“Inquiry utility, Version V7.3” etc) is 6 lines, and I have received an array of 12 lines. So I’ll rename my first variable to TextArray (to keep things clear) and try this:

$TextArray = & .\inq.exe -winvol
$Text = $TextArray | select -Skip 6 | Out-String
$Text = $Text -replace '-------------------------------------------------------------------------------------'
$Text


DEVICE             :VEND   :PROD         :REV  :SER NUM    :CAP(kb)  :WIN vol

\\.\PHYSICALDRIVE0 :VMware :Virtual disk :1.0  :           :83884032 : C:
\\.\PHYSICALDRIVE1 :EMC    :SYMMETRIX    :5876 :8802182000 :0        : S:
\\.\PHYSICALDRIVE3 :EMC    :SYMMETRIX    :5876 :8804965000 :0        : T:

Quick digression; inq.exe is very easy to invoke, but some utilities make it quite difficult to call them, because they might need quotemarks or special characters. See Scripting Guy on the topic.

Now I have blank lines in my input. So, at this stage, I spend a while mucking about with regex to try to match the multilines and get upset, because there are a number of regex tools in powershell and none of them work the same way. So I’ll be cheap and do this:

$Text = & .\inq.exe -winvol
$Text = $TextArray | select -Skip 6 | Out-String
$Text = $Text -replace '-------------------------------------------------------------------------------------'
$Lines = $Text -split '\r\n' | where {-not [string]::IsNullOrWhiteSpace($_)}
$Lines

DEVICE             :VEND   :PROD         :REV  :SER NUM    :CAP(kb)  :WIN vol
\\.\PHYSICALDRIVE0 :VMware :Virtual disk :1.0  :           :83884032 : C:
\\.\PHYSICALDRIVE1 :EMC    :SYMMETRIX    :5876 :8802182000 :0        : S:
\\.\PHYSICALDRIVE3 :EMC    :SYMMETRIX    :5876 :8804965000 :0        : T:

OK, that’s the crucial first stage. From here on we’re golden, especially since the output is very helpfully separated consistently with a colon. THANKS, EMC! Brocade, take note.

In a problem like this, the next stage is always always going to be nested loops. You have tabular data, so you have to iterate over the rows and then iterate over each field. If this is new to you, then I would like you to pretend briefly that you are manipulating some kind of awful CSV file.

The inner loop is going to be a for loop, because we are going to index into both the array of the row we’re working on, and also the header row:

foreach ($Row in $Rows) {
    $HT = @{}
    for ($i=0; $i -lt $ColumnHeaders.Count; $i++) {

    }
    $Output.Add($HT)
}

I’ve skipped a few steps there. To explain: we have to split each row into substrings, commonly called ‘tokens’. The header row tokens will be our column names, the row tokens will be our values. We also know that we’re going to be adding values to something in the inner loop and then adding the result to some kind of array, so I’ve defined a hashtable, $HT, and added it to an $Output array.

If I add the definitions for those variables:

#    ...code we've already seen, above...

$Output = @()
$HeaderRow = $Lines[0]
$Rows = $Lines[1..($Lines.Count)]
$ColumnHeaders = $HeaderRow -split ':'

foreach ($Row in $Rows) {
    $HT = @{}
    $Values = $Row -split ':'
    for ($i=0; $i -lt $ColumnHeaders.Count; $i++) {
        $HT += @{
            $ColumnHeaders[$i] = $Values[$i]
        }
    }
    $Output += $HT
}

As I’m going along, I keep looking at $Output to see if I’m getting closer to the desired result. Trial and error all the way! This is basically complete now. One thing that’s important is to call Trim() on the tokens (because it will really throw later tasks out of whack if you have spaces that aren’t easy to see):

$ColumnHeaders[$i].Trim() = $Values[$i].Trim()

If you don’t do the above, you’ll try something like this:

$Obj.REV

and wonder why you don’t get any output when you can see the output in Format-Table. You will be forced to do this:

$Obj.'REV         '

Not helpful to your colleagues!

The only difference between this code we’ve got to so far and the first codeblock I pasted is a smidgen of refactoring. “Refactoring” is the process of changing the internals of code without changing the end result. It’s very important to keep doing this as you go along. Examples: rename variables and functions so that they continue to make sense once the code has evolved a bit; tidy up code constructs when you find better ways of doing things. If you never refactor, your code is going to be janky ;-)

Before refactor:

$Text = & .\inq.exe -winvol
$Text = $TextArray | select -Skip 6 | Out-String
$Text = $Text -replace '-------------------------------------------------------------------------------------'

$Lines = $Text -split '\r\n' | where {-not [string]::IsNullOrWhiteSpace($_)}
$Output = @()
$HeaderRow = $Lines[0]; $Rows = $Lines[1..($Lines.Count)]
$ColumnHeaders = $HeaderRow -split ':'

foreach ($Row in $Rows) {
    $HT = @{}
    $Values = $Row -split ':'
    for ($i=0; $i -lt $ColumnHeaders.Count; $i++) {
        $HT += @{
            $ColumnHeaders[$i].Trim() = $Values[$i].Trim()
        }
    }
    $Output += (New-Object psobject -Property $HT)
}
$Output | ft

After refactor:

$Output = @()

$Lines = & .\inq.exe -winvol
$Lines = $Lines |
    select -Skip 6 |
    where {$_ -notmatch '^-*$'}
$HeaderRow = $Lines[0]; $Rows = $Lines[1..($Lines.Count)]
$ColumnHeaders = $HeaderRow -split ':'

foreach ($Row in $Rows) {
    $HT = @{}
    $Values = $Row -split ':'
    for ($i=0; $i -lt $ColumnHeaders.Count; $i++) {
        $HT += @{
            $ColumnHeaders[$i].Trim() = $Values[$i].Trim()
        }
    }
    $Output += (New-Object psobject -Property $HT)
}
$Output | ft

Finished (more or less) code:

(https://gist.github.com/fsackur/afb34f3f93310fea2f60393abae8da98)

I hope this helps when you have to map a CLI utility into a powershell function.

Using delegates in Powershell, part 1

Part 2 is here

I debug code and think that software engineering principles would be nice. I don’t understand them but I think they are nice.

Really, there are a lot of concepts from software engineering that should be applied once your powershell applications, for that is what they are, start getting larger.

I’m currently working on a project that has multiple layers of wrapping. Our API has a client module already built. I’m writing a module to transform that output so it’s easier to consume in the logic part of that code, and also so I can slot in another API later. This leads to the problem of how to handle error conditions.

Say you have:

process {
    foreach ($Item in $InputStream) {

        # ...27 lines of logic about processing the input...

        Write-Output $Item.RelevantProperty
    }
}

Well, if I already have a lot of branching, the last thing I want to do is clutter it even further with exception and null handling.

A concept that’s been around since donkey’s years in .NET is delegates, which is a way to make a type out of the concept of a callback function. (It doesn’t have to be a callback, but that’s how I’m going to use it here.)

Side note: I am not a C# developer. I don’t want to give the idea that this article covers much of the topic of delegates. I’m always open to having any misconceptions cleared up.

In Powershell terms, think of a delegate as passing a scriptblock that has a strongly-typed param block and return type. So, conceptually:

#In the function that gets called:
param(
    [string]$MainParam,
    [string]$OtherParams,
    [System.Delegate]$ErrorCallback
)

#Could also use try/catch
trap [System.Web.HttpException] {$ErrorCallback.Invoke($_.Exception)}

#  ...main processing...


#In the calling function:
$Scriptblock = {
    param([Exception]$Exception)
    if ($Exception.Message -match '401') {Get-Credential}
}

Invoke-LowerLayer -MainParam $blah -OtherParams $blahblah -ErrorCallback $Scriptblock

In this code, any exceptions that happen get passed upwards through a side channel. Your main output stream contains only the objects you care about, and you don’t have to clog up your logic with error-handling. Your error-handling code is separate to your main logic. And if that doesn’t make you smile, you haven’t inspected much code ;-)

Now, you don’t actually pass a System.Delegate. You’ll pass a System.Action. We can use that as a generic type, so we’ll specify some other type or types at the same time, which map to the types of the parameters that your scriptblock accepts. Because we’ve got param([Exception]) above, we’ll have param(Action[Exception]) below. Thus:

#On the function you call:
param(
    [string]$MainParam,
    [string]$OtherParams,
    [Action[Exception]]$ErrorCallback
)

#In the calling function:
$Scriptblock = {
    param(
        [Exception]$Exception
    )
    switch -Regex ($Exception.Message) {
        '401|Token' {
            Update-StoredCredential
        }
        '403' {
            Write-PopupMessage "ACCESS DENIED"
        }
        default {
            Write-PopupMessage $_
        }
    }
}

Invoke-LowerLayer -MainParam $blah -OtherParams $blahblah -ErrorCallback $Scriptblock

Which we can refine further:

#On the function you call:
param(
    [string]$MainParam,
    [string]$OtherParams,
    [Action[Exception, System.Management.Automation.CommandInfo, hashtable, bool]]$ErrorCallback
)

#In our try/catch or trap:
 $IsRetryable = $(some logic)
 $ErrorCallback.Invoke($_, $MyInvocation.MyCommand, [hashtable]$PSBoundParameters, $IsRetryable)


#In the calling function:
$Scriptblock = {
    [CmdletBinding()]
    [OutputType([void])]
    param(
        [Exception]$Exception,
        [System.Management.Automation.CommandInfo]$Caller,
        [hashtable]$CallerParameters,
        [bool]$Retryable
    )
    switch -Regex ($Exception.Message) {
        '401|Token' {
            Update-StoredCredential
        }
        '403' {
            Write-PopupMessage "ACCESS DENIED"
        }
        default {
            if ($Retryable) {
                Start-Sleep 5;
                & $Caller @CallerParameters
            }
        }
    }
}

By passing the extra parameters, we can retry the original call.

Note that I’m casting $PSBoundParameters to hashtable because that object doesn’t contain Clone() and I don’t want to pass a reference to an object that may still be being used. (‘Be being’?)

Finally, can we send in a function definition? Is cheese tasty..?

function Handle-Error {
    [CmdletBinding()]
    [OutputType([void])]
    param(
        [Exception]$Exception,
        [System.Management.Automation.CommandInfo]$Caller,
        [hashtable]$CallerParameters,
        [bool]$Retryable
    )
    switch -Regex ($Exception.Message) {
        '401|Token' {
            Update-StoredCredential
        }
        '403' {
            Write-PopupMessage "ACCESS DENIED"
        }
        default {
            if ($Retryable) {
                Start-Sleep 5;
                & $Caller @CallerParameters
            }
        }
    }
}

Invoke-LowerLayer -MainParam $blah -OtherParams $blahblah -ErrorCallback (Get-Item Function:\Handle-Error).ScriptBlock

Note that this code doesn’t function as intended - the solution is in part 2

The benefits of this approach are:

The type system will help your colleagues to use it correctly
It’s easy to wrap layer upon layer upon layer
Error handling is kept separate from business-as-usual processing
Other benefits that I’ll think of after I click ‘Publish’

Now, about that code snippet not working. I wanted to show, conceptually, that you can pass CommandInfo through and invoke it (CommandInfo is the type you get from Get-Command). Two things:

You wouldn’t pass an Exception and a CommandInfo in Powershell, you’d just pass the ErrorRecord. ErrorRecord already contains the Exception and the invocation details of whence it was thrown. It’s a nice little wrapper class.
Whatever scriptblock you cast to Action will never return any output. Action returns void.

There’s more in Part 2, but I won’t blame you if that’s enough for now.

Happy callbacking!

Part 2

Dynamic modules in Powershell

¡Hola chicos!

I’ve been futzing recently with a script library. In this library, we have a bunch of .ps1 files, each of which contain exactly on function of the same name as the file. (Some of the functions contain child functions, but that doesn’t bear upon today’s post.)

I’m thinking about better ways to serve the library to our users. One aspect that I quite like is to put all of the functions in one PS module, such that you could set $PSDefaultParameterValues += @{‘Get-Command:Module’=’Library’}. Then there’s an easy way for users to get a clean look at what functions are available to them.

I found out about New-Module a while back. It’s a way of taking invokable code, namely, a scriptblock, and promoting it into a PS module. You don’t have to create a .psm1 file and run it through Import-Module. Obviously there are lots of ways to abuse that

$DefinitionString = @"
function Write-SternLetter {
    Write-Output $(
        Read-Host "Give them a piece of your mind"
    )
}
"@

$SB = [scriptblock]::Create($DefinitionString)

New-Module -Name Correspondence -ScriptBlock $SB

Get-Command 'Write-SternLetter'

Now, there are a lot of reasons not to dynamically generate code from strings - you lose all the language checking, and you open a door to injection attacks. This is just for demo purposes. But the function does get exported into your scope, and you can see that it’s defined in our dynamic module called ‘Correspondence’:

Get-Command 'Write-SternLetter'
(Get-Command 'Write-SternLetter').Module

However, this alone doesn’t make it easy to work with the module, as Get-Module doesn’t find it. For that, you have to pipe New-Module to Import-Module:

New-Module -Name Correspondence -ScriptBlock $SB | Import-Module
Get-Module

It’s still dynamic, but now you can find it and remove it with Remove-Module, just as if it were defined in a .psm1.

Side note: I use this in scripts in the library to import modules, as $Employer doesn’t have the facility to copy .psm1 dependencies when we execute script from the library. I embed the text from the .psm1 file in a here-string instead, and can use the module as normal. This isn’t perfect, but it does make the code work the same way as it did in dev (think: module scopes.)

So if we have a dynamic module, can we dynamically update a single function in it without having to regenerate and reload the whole module?

Let’s first go on a tangent. If everything in Powershell is an object, what kind of object is a module?

PS C:\dev> (Get-Module Correspondence).GetType()

IsPublic    IsSerial    Name            BaseType
--------    --------    ----            --------
True        False       PSModuleInfo    System.Object

This is conceptually related to (although not in the same inheritance tree as) CommandInfo, which is the base class of FunctionInfo, ScriptInfo, CmdletInfo, et al. All those latter types are what you get from Get-Command; the former is what you get from Get-Module. If you explore these objects, you find interesting properties such as ScriptBlock and Definition. But next we’re going to work with PSModuleInfo’s Invoke() method.

Can anyone tell me what the call operator, &, does?

It calls the Invoke() method on whatever object is passed as the first argument. Subsequent arguments are passed to the Invoke() method.

In pseudocode:

& ($Module) {scriptblock}

causes the scriptblock to be invoked in the scope of the module. So, for example, it can access all the module members that you specifically didn’t export in Export-ModuleMember.

Say your module defines $PrivateVariable but doesn’t export it:

$PrivateVariable

& (Get-Module MyModule) {$PrivateVariable}
foo

…you see where this is leading?

Side note: don’t do this. You can get into debugging hell when you start getting tricky with scopes. What I’m explaining in this post is a very limited use of manipulating the scope of code execution.

So, the answer should be simple: call Invoke() on the module and pass it a scriptblock that redefines the function. Let’s try:

Next, let’s mention the Function drive. An alternative to Get-Command, if you want to find commands that are available (output truncated):

PS C:\dev> Get-ChildItem Function:\ | ft -AutoSize

CommandType Name Version Source
----------- ---- ------- ------
...
Function Get-FileHash 3.1.0.0 Microsoft.PowerShell.Utility
Function Get-IseSnippet 1.0.0.0 ISE
Function Import-IseSnippet 1.0.0.0 ISE
Function Import-PowerShellDataFile 3.1.0.0 Microsoft.PowerShell.Utility
...

However, you’ll see that I enumerated these functions with Get-ChildItem. Because Function:\ is a PSDrive, it allows you to use the same cmdlets that you use to manipulate files on disk, such as Get-ChildItem, Get-Item, Get-Content and… Set-Content!

To dynamically update a function in a module, then, here are all the pieces:

* Import the module
* Get the module into a variable
* Define the updated version of the function
* Get the updated version into a variable
* Executing in the scope of the module, redefine the function

$ModuleSB = {
    function Get-AWitness {
        "YEAH!"
    }
    Export-ModuleMember Get-AWitness
}

$Module = New-Module -Name 'FeelIt' -ScriptBlock $ModuleSB | Import-Module -PassThru

Get-AWitness


YEAH!

& $Module {Set-Content Function:\Get-AWitness {"Hell naw."}}

$Module | Import-Module

Get-AWitness


Hell naw.

UNFORTUNATELY I can’t get this to work for disk-based modules. It would be lovely to be able to shim code without having to fork it, but it looks to be impossible. Please let me know if you have found a way.

Regex trick - named capture groups

I just wanted to share something that I find really cool. _(https://kevinmarquette.github.io/” target=”_blank” rel=”noopener noreferrer”>Kevin Marquette replied to my comment on a forum _with this trick! </starstruck>

Named captures in PS

$Text =
'ERROR: Exception occurred in application FruitComparison.
12 Apples
16 Oranges
Resulted in Divide By Citrus at line 666'

$Text -match '(?<Severity>\w*).*'
$Matches
True

Name                           Value
----                           -----
Severity                       ERROR
0                              ERROR: Exception occurred in application FruitComparison....</span>

Reminder, this is what the $Matches automatic variable looks like when you use a subgroup that isn’t named:

$Text -match '(\w*).*'
$Matches
True

Name                           Value
----                           -----
1                              ERROR
0                              ERROR: Exception occurred in application FruitComparison....</span>

‘\w’ captures all ‘word’ characters, of course.

The problem sometimes with unnamed captures is that you might want to pick out captures 3, 7 and 9, but you have a nagging concern that one day someone will feed it input that has an extra capture between 5 and 6. That will jank up the rest of your code.

Here’s the full pattern:

$Text =
'ERROR: Exception occurred in application FruitComparison.
12 Apples
16 Oranges
Resulted in Divide By Citrus at line 666'

$Pattern = '(?<Severity>\w*): Exception .*? application (?<Application>\w*)\.\r\n(?<Input1>.*)\r\n(?<Input2>.*)\r\nResulted in (?<ExceptionType>.*?) at line (?<Line>\d*)$'

if ($Text -match $Pattern) {
    $Matches
}

Name                           Value
----                           -----
Input1                         12 Apples
ExceptionType                  Divide By Citrus
Severity                       ERROR
Line                           666
Input2                         16 Oranges
Application                    FruitComparison
0                              ERROR: Exception occurred in application FruitComparison....</span>

$Matches is a kind of custom dictionary / hashtable, which isn’t always that useful. (quick note: don’t forget that if a match fails but you then look in $Matches, you’ll get the results from the previous successful match, which is the reason for the if statement.) But guess what? You can get from input text to psobject in very little code:

if ($Text -match $Pattern) {
    $Matches.Remove(0)
    New-Object psobject -Property $Matches

} else {
    throw (New-Object System.Management.Automation.ParseException ("Error parsing input"))
}

Input1        : 12 Apples
ExceptionType : Divide By Citrus
Severity      : ERROR
Line          : 666
Input2        : 16 Oranges
Application   : FruitComparison</span>

I found that ParseException class by googling “msdn parse exception” and picking the second hit. I could have gone with InvalidArgument or similar.

MSDN reference for .net implementations of regex

Hope you enjoyed.

Unit-testing PS help

It’s really useful to have someone to keep you honest if you don’t write help for your PS functions as you go. I find it hard to update help continually, so a red test makes sure I don’t check in code with no help.

Pester tests for PS function-based help

#Help.Tests.ps1

Describe 'Function help' {

    $ExportedCommands = (
                            Get-Module $ModuleName |
                            select -ExpandProperty ExportedCommands
                        ).Keys

    Context 'Correctly-formatted help' {
        It 'Provides examples for every exported function' {
            foreach ($Command in $ExportedCommands)
            {
                (Get-Help $Command -Examples |
                Out-String) |
                Should Match '-------------------------- EXAMPLE 1 --------------------------'
            }
        }
    }
}

It literally does text matching on the output of Get-Help piped to Out-String. Not very powershelly! Perhaps it should use grep.

It would be tough to write a unit test to ensure the help is actually useful. But this will validate that all the examples are syntactically correct:

#Help.Tests.ps1

$ModuleName = 'FunkyModule'
Import-Module "$PSScriptRoot\$ModuleName.psm1" -Force


Describe 'Function help' {
    Context 'Correctly-formatted help' {

        foreach (
            $Command in (
                Get-Module $ModuleName |
                select -ExpandProperty ExportedCommands
                ).Keys
            )
            {
                $Help = Get-Help $Command

                It "$Command has one or more help examples" {
                    $Help.examples.example | Should Not Be $null
                }

                #Test only the parameters? Mock it and see if it throws
                Mock $Command -MockWith {}

                It "$Command examples are syntactically correct" {
                    foreach ($Example in $Help.examples.example) {
                        [Scriptblock]::Create($Example.code) |
                            Should Not Throw
                    }
                }
            } #end foreach
    }
}

Get-Help returns an object that parses the code example out of the rest of the example text. Unfortunately, it only recognises a single line of code. The following will not properly test the code, because the .code property will be only the $MyVar line:

<#
    .Example
    PS C:\> $MyVar = 'Djibouti', 'Fiji'

    PS C:\> Get-CapitalCity -Country $MyVar

    Gets the capital city
#>

Testing the syntax

Pester mocking transparently recreates the param block of the command you are mocking, so, to save handling the running of the command, we just run an empty mock of it:

    #Test only the parameters? Mock it and see if it throws
    Mock $Command -MockWith {}

If we run the tests on a module like this:

**#FunkyModule.psm1**

function Funk {
    <#
        .Example
        PS C:\> Funk -Disco

        .Example
        PS C:\> Funk -On 1, 2, 3, 4

        some helpful text
     #>
     [CmdletBinding()]
     param(
         [switch]$Disco,
         [int[]]$On
     )

     Write-Host "Lorem ipsum"
}

The object form the Get-Help pulls out the example commands from the function help:

PS C:\> foreach ($Example in $Help.examples.example) {
           [Scriptblock]::Create($Example.code)
}
Funk -Disco
Funk -On 1, 2, 3, 4

This matches the param block:

     param(
         [switch]$Disco,
         [int[]]$On
     )

Therefore this test passes. But if we change or delete one of the parameter declarations so the examples are no longer correct, the test fails:

PS C:\> .\dev\FunkyModule\Help.Tests.ps1
Describing Function help
   Context Correctly-formatted help
    [+] Funk has one or more help examples 45ms
    [-] Funk examples are syntactically correct 46ms
      Expected: the expression not to throw an exception. Message was {A parameter cannot be found that matches parameter name 'Disco'.}
          from line:1 char:6
          + Funk -Disco
          +      ~~~~~~
      32:                         [Scriptblock]::Create($Invocation) | Should Not Throw
      at , C:\dev\FunkyModule\Help.Tests.ps1: line 32

The message it gives you tells you how you borked it:

Message was {A parameter cannot be found that matches parameter name 'Disco'.}

So there you have it - unit testing for Powershell function help.

How does PowerShell work?

This is a piece about the components of Powershell. There are many blogposts that explain how to achieve a particular result with some of these classes, but not many that give you an…

Overview of PowerShell

When you open powershell, you are running an executable called (in Windows) powershell.exe. It could be called anything but it is not the PowerShell class. An excerpt from the (https://msdn.microsoft.com/en-us/library/system.management.automation.powershell)MSDN page for the class:

Provides methods that are used to create a pipeline of commands and invoke those commands either synchronously or asynchronously within a runspace.

powershell.exe VS PowerShell (the .net class)

powershell.exe is a console host, like cmd.exe or ConEmu. In fact, you can run PowerShell (the commands) in cmd or ConEmu. Powershell.exe exists only to accept text input, pass it to a command interpreter, and display the text output. It’s just an executable, like the meatbags we inhabit in this forlorn physical world. The animating spirit is within the PowerShell class.

Powershell.exe, when it loads, creates an instance of PowerShell (the class). When you type into powershell.exe, your input is read as a string and passed to the instance of PowerShell using the instance’s AddScript() method. Then powershell.exe calls the instance’s Invoke() method, reads back the results, and formats them to the window.

powershell.exe | Stop-Topic

You can spin up your own instance of PowerShell within PowerShell…

...Powersheption!

PS C:\dev> $PS = [System.Management.Automation.PowerShell]::Create()
PS C:\dev> $PS
Commands            : System.Management.Automation.PSCommand
Streams             : System.Management.Automation.PSDataStreams
InstanceId          : 21cf70ad-878d-47f3-9264-3f201c36ca5f
InvocationStateInfo : System.Management.Automation.PSInvocationStateInfo
IsNested            : False
HadErrors           : False
Runspace            : System.Management.Automation.Runspaces.LocalRunspace
RunspacePool        :
IsRunspaceOwner     : True
HistoryString       :

Note that I’m calling the Create() method to get an instance of PowerShell; New-Object won’t work.

So what does this thing give us?

PS C:\dev> $PS.Commands
Commands
--------
{}

PS C:\dev> $PS.Runspace
 Id Name      ComputerName Type  State  Availability
 -- ----      ------------ ----  -----  ------------
 2  Runspace2 localhost    Local Opened Available

PS C:\dev> $PS.Streams
Error       : {}
Progress    : {}
Verbose     : {}
Debug       : {}
Warning     : {}
Information : {}

It makes sense that Commands is empty - we’ve only just created it. We have the following methods to safely and correctly add commands to our PowerShell:

* AddArgument
* AddCommand
* AddParameter
* AddParameters
* AddScript
* AddStatement

If you don’t want the work of laboriously specifying each detail, use AddScript():

PS C:\dev> $PS.AddScript("Get-Content C:\Windows\System32\drivers\etc\hosts | Select-String 'SqlCluster'")
Commands            : System.Management.Automation.PSCommand
Streams             : System.Management.Automation.PSDataStreams
InstanceId          : 21cf70ad-878d-47f3-9264-3f201c36ca5f
InvocationStateInfo : System.Management.Automation.PSInvocationStateInfo
IsNested            : False
HadErrors           : True
Runspace            : System.Management.Automation.Runspaces.LocalRunspace
RunspacePool        :
IsRunspaceOwner     : True
HistoryString       :

PS C:\dev> $PS.Invoke()

10.120.0.115 SqlCluster.weylandyutani.local

Note that the AddScript method returns the modified object. That’s great for chaining methods, but I’ll edit it out of the code for the rest of the post.

Streams

PS C:\dev> $PS.Commands.Clear()
PS C:\dev> $PS.AddScript('$VerbosePreference = "Continue"')
PS C:\dev> $PS.AddScript("Write-Verbose (Get-Content C:\Windows\System32\drivers\etc\hosts | Select-String 'SqlCluster')")
PS C:\dev> $PS.Invoke()

PS C:\dev> $PS.Streams
Error       : {}
Progress    : {}
Verbose     : {10.120.0.115 SqlCluster.weylandyutani.local}
Debug       : {}
Warning     : {}
Information : {}

As expected, Invoke() returns nothing. However, Our Streams object now has info in the Verbose stream. That’s awesome because it lets us easily capture and output the alternate streams wherever we want - and if you’ve ever had to work with the (https://ss64.com/ps/syntax-redirection.html)redirect operators, you’ll be happy.

Please note that Invoke() does not clear the commands when it completes - you have to manually call Clear(), as in the top line above. Likewise, you have to clear the streams with ClearStreams().

In the next part of this topic, I’ll discuss what the runspace does.

← Newer Page 2 of 3 Older →