When PowerShell hash table magic backfires

by Klaus Graefensteiner 1/4/2009 2:30:48 PM

Introduction

We just made it through the holidays, and I finally found time to write about the most annoying thing in PowerShell: the semi-automatic handling of hash tables, which can easily lead to subtle bugs. All of that just because hash tables don't behave the way I would expect. I assumed that the Sort-Object cmdlet would implicitly sort a hash table by its keys, and that a hash table would always stay a hash table rather than being converted into an array of DictionaryEntry objects, or even a single DictionaryEntry, when piped into the Where-Object cmdlet. To work around these strange defects you need to apply some unnatural constructs and roll your own select statements. It just doesn't feel fair to make your children call GetEnumerator() before they try to find today's window on their Advent calendar.

Figure 1: An Advent Calendar is a hash table. The day of December maps to a chocolate.

Hash table sorting needs GetEnumerator()

    $a = @{}

    $a[1] = "one"
    $a[11] = "eleven"
    $a[2] = "two"

    # I expected this to sort by the hash table key automatically
    $a | Sort-Object -Descending

    Name                           Value
    ----                           -----
    2                              two
    1                              one
    11                             eleven


    # Here is the workaround:
    $a.GetEnumerator() | Sort-Object Key -Descending

    Name                           Value
    ----                           -----
    11                             eleven
    2                              two
    1                              one

In this case it took me a while to get over it. Why do I need to call GetEnumerator() on a hash table, but not on an array in PowerShell? This is just ugly.
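
The difference is easier to see by counting what actually travels down the pipeline. The following is a minimal sketch, not an authoritative explanation; the variable names $whole, $entries and $sortedKeys are made up for illustration. It assumes the usual PowerShell behavior that arrays are unrolled by the pipeline while a hash table is passed along as one object.

```powershell
# Build the same table as in the listing above.
$a = @{}
$a[1]  = "one"
$a[11] = "eleven"
$a[2]  = "two"

# Piping the hash table itself sends ONE object down the pipeline.
$whole = @($a | ForEach-Object { $_ })
$whole.Count                    # 1
$whole[0].GetType().FullName    # System.Collections.Hashtable

# GetEnumerator() streams one DictionaryEntry per key instead.
$entries = @($a.GetEnumerator() | ForEach-Object { $_ })
$entries.Count                  # 3
$entries[0].GetType().FullName  # System.Collections.DictionaryEntry

# With individual entries in the pipeline, sorting by key works as expected.
$sortedKeys = @($a.GetEnumerator() | Sort-Object Key -Descending | ForEach-Object { $_.Key })
$sortedKeys -join ","           # 11,2,1
```

So Sort-Object in the first attempt had exactly one item to sort, which is why the output order did not change.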

Hash table filtering needs a hand-rolled script

    $h = @{}
    $h["one"] = 1
    $h["two"] = 2
    $h.GetType().Fullname
    $h

    System.Collections.Hashtable
    Name                           Value
    ----                           -----
    two                            2
    one                            1

    $g = @{}
    $g["one"] = 1
    $g.GetType().Fullname
    $g

    System.Collections.Hashtable
    Name                           Value
    ----                           -----
    one                            1

    # Where-Object returns an array of dictionary entries if more than one entry matches
    $c = $h.GetEnumerator() | Where-Object { $_.Key -eq "one" -or $_.Key -eq "two" }
    $c.GetType().Fullname
    $c

    System.Object[]
    two                            2
    one                            1

    # Where-Object returns a single DictionaryEntry object if exactly one entry matches
    $c = $h.GetEnumerator() | Where-Object { $_.Key -eq "one" }
    $c.GetType().Fullname
    $c

    System.Collections.DictionaryEntry
    one                            1

    # Better: use ForEach-Object and fill the resulting hash table explicitly
    $r = @{}
    $h.GetEnumerator() | ForEach-Object { if ( $_.Key -eq "one") { $r[$_.Key] = $_.Value }}

    $r.GetType().Fullname
    $r

    System.Collections.Hashtable
    Name                           Value
    ----                           -----
    one                            1

    $h = $r
    $h.GetType().Fullname
    $h

    System.Collections.Hashtable
    Name                           Value
    ----                           -----
    one                            1

This is another nice one. If more than one entry passes the Where-Object filter, the result is an array of DictionaryEntry objects, but if only one entry passes, the result is a single object of type DictionaryEntry. I expected Where-Object to filter out some entries of a hash table and still return a hash table, even if zero entries remained. I avoided Where-Object and did my own comparisons while iterating over the hash table. This way I had full control over what gets emitted by the pipeline.
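
The hand-rolled filter can be wrapped up once and reused. What follows is only a sketch under stated assumptions: Select-HashtableEntry is a hypothetical helper name, not a built-in cmdlet, and the $Table and $Filter parameters are my own choice. It returns a new hash table regardless of whether zero, one, or many entries match. The last lines show the lighter alternative of wrapping the pipeline in @( ), which at least pins the result to an array.

```powershell
# Hypothetical helper (not part of PowerShell): returns a NEW hash table that
# holds only the entries for which $Filter evaluates to true.
function Select-HashtableEntry {
    param(
        [System.Collections.Hashtable] $Table,
        [scriptblock] $Filter
    )

    $result = @{}
    foreach ($entry in $Table.GetEnumerator()) {
        # Where-Object binds the entry to $_ and applies the filter to it.
        if (@($entry | Where-Object $Filter).Count -gt 0) {
            $result[$entry.Key] = $entry.Value
        }
    }
    # A hash table is emitted as one object, so the caller gets a Hashtable back.
    $result
}

$h = @{ one = 1; two = 2 }

$one  = Select-HashtableEntry $h { $_.Key -eq "one" }
$none = Select-HashtableEntry $h { $_.Key -eq "three" }

$one.GetType().FullName     # System.Collections.Hashtable
$one.Count                  # 1
$none.GetType().FullName    # System.Collections.Hashtable
$none.Count                 # 0

# If an array of entries is good enough, @( ) keeps the shape stable:
$c = @($h.GetEnumerator() | Where-Object { $_.Key -eq "one" })
$c.GetType().FullName       # System.Object[]
$c.Count                    # 1
```

The helper trades a little speed for predictability: callers never have to check whether they received a single entry, an array, or nothing at all.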

Outlook

The bad news is that the hash table shortcomings make PowerShell a little bit ugly to write and chew up a lot of debugging time. The good news is that once you know about the quirks, you have several fallback solutions to avoid getting completely combusted.

Comments

1/5/2009 10:50:47 AM

Bruce Payette

Hi Klaus,

Hashtables *are* passed as a single object through the pipeline. It's calling GetEnumerator() that sends the stream of dictionary entries. In fact this is why you have to call GetEnumerator(). Otherwise you just pass the hashtable intact. If you want to just get the keys, then you can use the Keys property; for just the values, use the Values property, etc. Hashtable keys are not stored in sorted order, so when you display a hashtable, the display appears random because all we're doing to display the hashtable is calling GetEnumerator() and then displaying the Key/Value pairs. (I suppose we could display the object sorting the keys first. The end-user can also do this by adding a clause in a formats.ps1xml file for hashtables.)

BTW - if you want to copy or merge hashtables, the "+" operator will do this. $new = @{} + $old will make a copy of the hashtable in $new and, if $first and $second are both hashtables, doing $merged = $first + $second will create a new hashtable with the merged keys from both (assuming there are no collisions, which are considered an error).

Bruce Payette
Microsoft PowerShell Team
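
A minimal sketch of the copy and merge behavior described in the comment above. The sample keys and the $collided flag are made up for illustration; only the "+" operator on hashtables comes from the comment itself.

```powershell
$old = @{ one = 1; two = 2 }

# Copy: adding to an empty hashtable yields a new, independent table.
$new = @{} + $old
$new["three"] = 3
$old.Count      # 2 (the original is unchanged)
$new.Count      # 3

# Merge: the keys of both operands end up in one new table.
$first  = @{ a = 1 }
$second = @{ b = 2 }
$merged = $first + $second
$merged.Count   # 2

# Collision: the same key on both sides is reported as an error.
$collided = $false
try {
    $null = @{ a = 1 } + @{ a = 2 }
} catch {
    $collided = $true
}
$collided       # True
```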

1/6/2009 8:08:29 AM

KG

Hi Bruce,

I guess it just boils down to how you see a hashtable from a high-level conceptual point of view. As the ancient Greeks would ask: Does a hashtable burn like an array or like an object?
One way to help users would be to provide some kind of warning when, for example, a hashtable variable gets passed into the Sort-Object cmdlet: "Warning: Do you really intend to sort one object, or do you want to sort the keys collection of this object? Use GetEnumerator() in the latter case."

By the way, I really enjoyed reading your book, and now that I have done some experimenting with PowerShell I am going to read it again to extract the pieces that I didn't completely inhale the first time.

I hope there will be a second edition of your book that covers the upcoming release of PowerShell.

Thanks,

Klaus
