A Software Architect Living in a Networking World

Joe Pruitt

Unix To PowerShell – Cut

PowerShell is definitely gaining momentum in the Windows scripting world, but I still hear folks wanting to rely on Unix-based tools to get their job done.  In this series of posts I’m going to look at converting some of the more popular Unix-based tools to PowerShell.

cut

The Unix “cut” command is used to extract sections from each line of input.  Extraction of line segments can be done by bytes, characters, or fields separated by a delimiter.  A range must be provided in each case, consisting of one of N, N-M, N- (N to the end of the line), or -M (beginning of the line to M), where N and M are counted from 1 (there is no zeroth value).  Several ranges can be combined in a comma-separated list.

For PowerShell, I’ve omitted support for bytes, but the rest of the features are included.  The Parse-Range function is used to parse the above range specification.  It takes as input a range specifier and returns an array of the indices that the range contains.  Then, the In-Range function is used to determine if a given index is included in the parsed range.
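
To make the range notation concrete, here is a minimal sketch of how a specifier could expand into 1-based indices.  Expand-CutRange and its -Max parameter are hypothetical names used only for illustration; they are not part of Cut.ps1, which uses Parse-Range for this job, and the sketch does no validation of reversed ranges.

```powershell
# Hypothetical helper (not part of Cut.ps1): expand a cut-style range
# specifier such as "1-3,5" into an array of 1-based indices.
function Expand-CutRange
{
  param([string]$Spec, [int]$Max = 10)   # $Max caps open-ended "N-" ranges

  $indices = @()
  foreach ($token in $Spec.Split(','))
  {
    if ($token -match '^(\d*)-(\d*)$' -and $token -ne '-')
    {
      # A missing start means "from 1"; a missing end means "up to $Max"
      $first = if ($Matches[1]) { [int]$Matches[1] } else { 1 }
      $last  = if ($Matches[2]) { [int]$Matches[2] } else { $Max }
      $indices += $first..$last
    }
    else
    {
      # A single index; a non-numeric token fails the [int] cast
      $indices += [int]$token
    }
  }
  $indices
}

(Expand-CutRange "1-3,5") -join ' '      # 1 2 3 5
(Expand-CutRange "-2") -join ' '         # 1 2
(Expand-CutRange "8-") -join ' '         # 8 9 10
(Expand-CutRange "1-3,5") -contains 4    # False
```

Once the range is a plain array, the membership test that In-Range performs amounts to a -contains check against it.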

The real work is done in the Do-Cut function.  In there, input error conditions are checked.  Then, for each file supplied, lines are extracted and processed with the given input specifiers.  For character ranges, each character is processed and, if its index in the line is in the given range, it is appended to the output line.  For field ranges, the line is split into tokens using the delimiter specifier (default is a TAB).  Each field is processed and, if its index is in the included range, the field is appended to the output with the given output_delimiter specifier (which defaults to the input delimiter).
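
The per-line processing described above can be sketched on a single string.  The sample line, the variable names, and the chosen delimiters below are made up for illustration; this is a simplified outline of the idea, not the Do-Cut function itself.

```powershell
# Illustrative only: select fields 1 and 3 from one colon-delimited line.
$line     = "root:x:0:0:root:/root:/bin/bash"
$wanted   = 1, 3      # 1-based field numbers, as in "cut -f 1,3"
$inDelim  = ':'
$outDelim = ','

$parts    = $line.Split($inDelim)
$selected = @()
for ($i = 1; $i -le $parts.Length; $i++)
{
  # Field numbers start at 1, array indices at 0
  if ($wanted -contains $i) { $selected += $parts[$i - 1] }
}
$fieldsOut = $selected -join $outDelim    # root,0

# Character mode works the same way, one character at a time.
$chars = @()
for ($i = 1; $i -le $line.Length; $i++)
{
  if ((1..4) -contains $i) { $chars += $line[$i - 1] }
}
$charsOut = -join $chars                  # root
```

Collecting the selected fields first and joining them afterwards keeps empty fields in place, which is harder to get right when the output string is built up piece by piece.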

The options to the Unix cut command are implemented with the following PowerShell arguments:

Unix                 PowerShell           Description
FILE                 -filespec            The files to process.
-c                   -characters          Output only this range of characters.
-f                   -fields              Output only the fields specified by the given range.
-d                   -delimiter           Use DELIM instead of TAB for the input field delimiter.
-s                   -only_delimited      Do not print lines not containing delimiters.
--output-delimiter   -output_delimiter    Use STRING as the output delimiter.
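
As a rough guide to how these arguments map onto familiar invocations, the sketch below builds a small sample file and runs a few commands against it.  The file name cut-sample.txt and its contents are made up for illustration, and the calls assume the script has been saved as Cut.ps1 in the current directory, so they are skipped if it is not there.

```powershell
# Made-up sample data: two colon-delimited lines and one without a delimiter.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) "cut-sample.txt"
Set-Content -Path $sample -Value "alice:1001:/home/alice", "bob:1002:/home/bob", "no delimiter here"

if (Test-Path .\Cut.ps1)
{
  # cut -c 1-3 FILE
  .\Cut.ps1 -filespec $sample -characters "1-3"

  # cut -d : -f 1,3 FILE
  .\Cut.ps1 -filespec $sample -fields "1,3" -delimiter ":"

  # cut -s -d : -f 2- --output-delimiter=, FILE
  .\Cut.ps1 -filespec $sample -fields "2-" -delimiter ":" -only_delimited $true -output_delimiter ","
}
```

Note that -only_delimited is declared as a [bool] parameter in the script, so it takes an explicit $true rather than acting as a bare switch.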

 

#----------------------------------------------------------------
# Cut.ps1
#----------------------------------------------------------------
param
(
  [string]$filespec = $null,         # The files to process
  [string]$characters = $null,       # Output only this range of characters
  [string]$fields = $null,           # Output only this field range (e.g. 1-2 or 3,4)
  [string]$delimiter = $null,        # Use DELIM instead of TAB for field delimiter
  [bool]$only_delimited = $false,    # Do not print lines not containing delimiter
  [string]$output_delimiter = $null  # Use STRING as output delimiter
);

$script:IN_DELIMITER = "`t";
if ( $delimiter ) { $script:IN_DELIMITER = $delimiter; }
# The output delimiter defaults to the input delimiter (TAB if none was given)
$script:OUT_DELIMITER = $script:IN_DELIMITER;
if ( $output_delimiter ) { $script:OUT_DELIMITER = $output_delimiter; }

# Upper bound used to expand open-ended ranges such as "N-"
$script:MAX_COLUMNS = 1000;

#----------------------------------------------------------------
# function Show-Error
# Prints the message and terminates the script.
#----------------------------------------------------------------
function Show-Error()
{
  param([string]$msg = $null);
  if ( $msg )
  {
    Write-Host $msg;
    exit;
  }
}

#----------------------------------------------------------------
# function Parse-Range
# Expands a range specifier (N, N-M, N-, -M, comma separated)
# into an array of 1-based indices.
#----------------------------------------------------------------
function Parse-Range()
{
  param([string]$range);

  [int[]]$indices = @();
  if ( $range )
  {
    $tokens = $range.Split(',');
    foreach ($token in $tokens)
    {
      if ( $token.Contains("-") )
      {
        $subtokens = $token.Split('-');
        if ( $subtokens.Length -ne 2 )
        {
          Show-Error "Cut: Invalid character or field list.";
        }
        elseif ( ($subtokens[0].Length -eq 0) -and ($subtokens[1].Length -eq 0) )
        {
          # A bare "-" has neither a start nor an end
          Show-Error "Cut: Invalid character or field list.";
        }
        elseif ( $subtokens[0].Length -eq 0 )
        {
          # -M
          $last = [int]$subtokens[1];
          $indices += @(1 .. $last);
        }
        elseif ( $subtokens[1].Length -eq 0 )
        {
          # N-
          $first = [int]$subtokens[0];
          $indices += @($first .. $script:MAX_COLUMNS);
        }
        else
        {
          # N-M: compare as integers so that a range like 9-10 is accepted
          $first = [int]$subtokens[0];
          $last = [int]$subtokens[1];
          if ( $last -lt $first )
          {
            Show-Error "Cut: Invalid character or field list.";
          }
          else
          {
            $indices += @($first .. $last);
          }
        }
      }
      else
      {
        $indices += @([int]$token);
      }
    }
  }
  $indices;
}

#----------------------------------------------------------------
# function In-Range
# Returns $true if the index is in the range, or if no range
# was supplied.
#----------------------------------------------------------------
function In-Range()
{
  param
  (
    [int]$index = 0,
    [int[]]$range = $null
  );

  if ( ($null -eq $range) -or ($range.Length -eq 0) )
  {
    return $true;
  }
  return ($range -contains $index);
}

#----------------------------------------------------------------
# function Do-Cut
#----------------------------------------------------------------
function Do-Cut()
{
  param
  (
    [string]$filespec = $null,
    [string]$characters = $null,
    [string]$fields = $null,
    [string]$delimiter = $null,
    [bool]$only_delimited = $false,
    [string]$output_delimiter = $null
  );

  if ( $filespec )
  {
    # Check parameters
    if ( $characters -and $fields )
    {
      Show-Error "Cut: only one type of list may be specified.";
    }
    elseif ( !$characters -and !$fields )
    {
      Show-Error "Cut: You must specify a list of characters or fields.";
    }
    else
    {
      if ( $characters )
      {
        $range = Parse-Range -range $characters;
      }
      elseif ( $fields )
      {
        $range = Parse-Range -range $fields;
      }

      # Skip directories that the file specification may have matched
      $files = @(Get-ChildItem -Path $filespec | Where-Object { -not $_.PSIsContainer });
      foreach ($file in $files)
      {
        # Use the full path so files outside the current directory are found
        $lines = Get-Content -Path $file.FullName;
        foreach ($line in $lines)
        {
          $line_out = $line;
          if ( $characters )
          {
            $line_out = "";
            for($i = 1; $i -le $line.Length; $i++)
            {
              if ( In-Range $i $range )
              {
                $line_out += $line[$i-1];
              }
            }
          }
          elseif ( $fields )
          {
            $line_fields = $line.Split($script:IN_DELIMITER);
            if ( $line_fields.Length -eq 1 )
            {
              # The line didn't contain the delimiter: skip it when
              # -only_delimited is set, otherwise print it unchanged
              # as Unix cut does
              if ( $only_delimited ) { continue; }
            }
            else
            {
              # Collect the selected fields, then join them so that
              # empty fields keep their place in the output
              $selected = @();
              for($i = 1; $i -le $line_fields.Length; $i++)
              {
                if ( In-Range $i $range )
                {
                  $selected += $line_fields[$i-1];
                }
              }
              $line_out = $selected -join $script:OUT_DELIMITER;
            }
          }
          $line_out;
        }
      }
    }
  }
}

Do-Cut -filespec $filespec -characters $characters -fields $fields `
  -delimiter $delimiter -only_delimited $only_delimited `
  -output_delimiter $output_delimiter;

You can download the script here: Cut.ps1

Joe Pruitt is a Principal Strategic Architect at F5 Networks working with Network and Software Architects to allow them to build network intelligence into their applications.