What is awk?

- A utility designed for text processing, typically used as a data extraction and reporting tool.

Where is awk? The location of awk can be found using the which command:

username@computername ~ % which awk

/usr/bin/awk

How to get the computer’s Model Identifier? The Model Identifier is part of the SPHardwareDataType data type reported by system_profiler. Below is an example of the Hardware Overview, which includes the Model Identifier.

username@computername ~ % system_profiler SPHardwareDataType

Hardware:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro16,2
Processor Name: Quad-Core Intel Core i5
Processor Speed: 2 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 512 KB
L3 Cache: 6 MB
Hyper-Threading Technology: Enabled
Memory: 16 GB
System Firmware Version: 1715.40.15.0.0 (iBridge: 19.16.10548.0.0,0)
OS Loader Version: 540.40.4~45
Serial Number (system): C02DP4YSML7H
Hardware UUID: 189F76C2-E9F6-533E-A4A6-9D87DBF5CBAC
Provisioning UDID: 189F76C2-E9F6-533E-A4A6-9D87DBF5CBAC
Activation Lock Status: Disabled

The end goal is to grab the Model Identifier from this list. Giving awk a /pattern/ with no action prints every line that matches the pattern:

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/'

Model Identifier: MacBookPro16,2

awk treats this result as 3 separate fields (columns): (1) Model, (2) Identifier:, and (3) MacBookPro16,2. This is because awk's default field separator (FS) is whitespace: runs of spaces or tabs mark the boundaries between fields, and leading whitespace is ignored. The splitting is done by awk itself, not by the shell. With awk we can call out specific pieces of this line using the print statement and the field variables $1, $2, and so on:

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $1}'

Model


username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $2}'

Identifier:


username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $3}'

MacBookPro16,2
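To make the field splitting more concrete, here is a minimal sketch that prints the field count and every field of a single line. The `line` variable is an assumption: it is a hard-coded copy of the Model Identifier line (with leading spaces, as system_profiler typically indents its output) so the example runs without system_profiler.

```shell
# Sample line standing in for real system_profiler output (assumption).
line='      Model Identifier: MacBookPro16,2'

# Print the number of fields (NF), then each field with its index.
fields=$(printf '%s\n' "$line" | awk '{
    print "NF=" NF
    for (i = 1; i <= NF; i++) print i ": " $i
}')
printf '%s\n' "$fields"
# NF=3
# 1: Model
# 2: Identifier:
# 3: MacBookPro16,2
```

The leading spaces do not produce an empty first field, because awk's default whitespace splitting skips them.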

In this specific case, the useful information is in $3. Because the value is expected to be the last field on the line, the command can also be written with $NF. NF holds the number of fields in the current line, so $NF always refers to the last field:

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $NF}'

MacBookPro16,2

Going a step further, this line can be used with command substitution to store this value into a variable that could be used in a script. For this, the command will look like:

username@computername ~ % modelIdentifier=$(system_profiler SPHardwareDataType | awk '/Model Identifier/{print $NF}')

username@computername ~ % echo $modelIdentifier

MacBookPro16,2
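As a rough sketch of how this might look inside a script, the example below stores the value and checks that something was actually captured before using it. The `hardware_info` helper and its fallback sample text are assumptions added for illustration, so the script also runs on machines without system_profiler.

```shell
#!/bin/sh
# Hypothetical helper: use system_profiler when it exists, otherwise fall
# back to hard-coded sample text (assumption, for non-macOS machines).
hardware_info() {
    if command -v system_profiler >/dev/null 2>&1; then
        system_profiler SPHardwareDataType
    else
        printf '      Model Name: MacBook Pro\n      Model Identifier: MacBookPro16,2\n'
    fi
}

# Command substitution captures the awk output in a shell variable.
modelIdentifier=$(hardware_info | awk '/Model Identifier/{print $NF}')

# Stop early if nothing matched, rather than continuing with an empty value.
if [ -z "$modelIdentifier" ]; then
    echo "Could not determine the Model Identifier" >&2
    exit 1
fi

echo "Model Identifier: $modelIdentifier"
```

Checking for an empty string is a cheap safeguard: if the label ever changes or the command fails, the script reports it instead of silently using a blank value.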

If multiple fields need to be printed, they can be listed in the same print statement. Placing fields side by side with no comma concatenates them with nothing in between:

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $3 $4}'

MacBookPro

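If a space between the fields is desired, separate them with a comma, which inserts the output field separator (a space by default):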

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $3, $4}'

MacBook Pro

And if counting from the end of the line is preferred, $(NF-1) and $NF refer to the last two fields:

username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $(NF-1), $NF}'

MacBook Pro
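When the value may contain a varying number of words, another option is to change the field separator with the -F option so that everything after the label becomes a single field. The sketch below assumes each line has the form "label: value" and uses hard-coded sample text (an assumption) in place of real system_profiler output.

```shell
# Sample text standing in for system_profiler output (assumption).
sample='      Model Name: MacBook Pro
      Model Identifier: MacBookPro16,2'

# With -F': ' the line splits at the colon-space, so $2 is the whole value.
modelName=$(printf '%s\n' "$sample" | awk -F': ' '/Model Name/{print $2}')
echo "$modelName"
# MacBook Pro
```

One caveat: a value that itself contains a colon followed by a space (the System Firmware Version line above is an example) would be cut at that point, so this approach fits best for simple values.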

For more information on awk:

username@computername ~ % man awk

AWK(1)                       General Commands Manual                      AWK(1)



NAME
       awk - pattern-directed scanning and processing language

SYNOPSIS
       awk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ...  ]

DESCRIPTION
       Awk scans each input file for lines that match any of a set of patterns
       specified literally in prog or in one or more files specified as -f
       progfile.  With each pattern there can be an associated action that will
       be performed when a line of a file matches the pattern.  Each line is
       matched against the pattern portion of every pattern-action statement;
       the associated action is performed for each matched pattern.  The file
       name - means the standard input.  Any file of the form var=value is
       treated as an assignment, not a filename, and is executed at the time it
       would have been opened if it were a filename.  The option -v followed by
       var=value is an assignment to be done before prog is executed; any number
       of -v options may be present.  The -F fs option defines the input field
       separator to be the regular expression fs.

       An input line is normally made up of fields separated by white space, or
       by the regular expression FS.  The fields are denoted $1, $2, ..., while
       $0 refers to the entire line.  If FS is null, the input line is split
       into one field per character.

       A pattern-action statement has the form:

              pattern { action }

       A missing { action } means print the line; a missing pattern always
       matches.  Pattern-action statements are separated by newlines or
       semicolons.

       An action is a sequence of statements.  A statement can be one of the
       following:

              if( expression ) statement [ else statement ]
              while( expression ) statement
              for( expression ; expression ; expression ) statement
              for( var in array ) statement
              do statement while( expression )
              break
              continue
              { [ statement ... ] }
              expression              # commonly var = expression
              print [ expression-list ] [ > expression ]
              printf format [ , expression-list ] [ > expression ]
              return [ expression ]
              next                    # skip remaining patterns on this input line
              nextfile                # skip rest of this file, open next, start at top
              delete array[ expression ]# delete an array element
              delete array            # delete all elements of array
              exit [ expression ]     # exit immediately; status is expression

       Statements are terminated by semicolons, newlines or right braces.  An
       empty expression-list stands for $0.  String constants are quoted " ",
       with the usual C escapes recognized within.  Expressions take on string
       or numeric values as appropriate, and are built using the operators + - *
       / % ^ (exponentiation), and concatenation (indicated by white space).
       The operators ! ++ -- += -= *= /= %= ^= > >= < <= == != ?: are also
       available in expressions.  Variables may be scalars, array elements
       (denoted x[i]) or fields.  Variables are initialized to the null string.
       Array subscripts may be any string, not necessarily numeric; this allows
       for a form of associative memory.  Multiple subscripts such as [i,j,k]
       are permitted; the constituents are concatenated, separated by the value
       of SUBSEP.

       The print statement prints its arguments on the standard output (or on a
       file if > file  or >> file  is present or on a pipe if | cmd  is
       present), separated by the current output field separator, and terminated
       by the output record separator.  file and cmd may be literal names or
       parenthesized expressions; identical string values in different
       statements denote the same open file.  The printf statement formats its
       expression list according to the format (see printf(3)).  The built-in
       function close(expr) closes the file or pipe expr.  The built-in function
       fflush(expr) flushes any buffered output for the file or pipe expr.

       The mathematical functions atan2, cos, exp, log, sin, and sqrt are built
       in.  Other built-in functions:


       length
            the length of its argument taken as a string, number of elements in
            an array for an array argument, or length of $0 if no argument.
       rand random number on [0,1).
       srand
            sets seed for rand and returns the previous seed.
       int  truncates to an integer value.
       substr(s, m [, n])
            the n-character substring of s that begins at position m counted
            from 1.  If no n, use the rest of the string.
       index(s, t)
            the position in s where the string t occurs, or 0 if it does not.
       match(s, r)
            the position in s where the regular expression r occurs, or 0 if it
            does not.  The variables RSTART and RLENGTH are set to the position
            and length of the matched string.
       split(s, a [, fs])
            splits the string s into array elements a[1], a[2], ..., a[n], and
            returns n.  The separation is done with the regular expression fs or
            with the field separator FS if fs is not given.  An empty string as
            field separator splits the string into one array element per
            character.
       sub(r, t [, s])
            substitutes t for the first occurrence of the regular expression r
            in the string s.  If s is not given, $0 is used.
       gsub(r, t [, s])
            same as sub except that all occurrences of the regular expression
            are replaced; sub and gsub return the number of replacements.
       sprintf(fmt, expr, ...)
            the string resulting from formatting expr ...  according to the
            printf(3) format fmt.
       system(cmd)
            executes cmd and returns its exit status. This will be -1 upon
            error, cmd's exit status upon a normal exit, 256 + sig upon death-
            by-signal, where sig is the number of the murdering signal, or 512 +
            sig if there was a core dump.
       tolower(str)
            returns a copy of str with all upper-case characters translated to
            their corresponding lower-case equivalents.
       toupper(str)
            returns a copy of str with all lower-case characters translated to
            their corresponding upper-case equivalents.

       The ``function'' getline sets $0 to the next input record from the
       current input file; getline < file  sets $0 to the next record from file.
       getline x sets variable x instead.  Finally, cmd | getline  pipes the
       output of cmd into getline; each call of getline returns the next line of
       output from cmd.  In all cases, getline returns 1 for a successful input,
       0 for end of file, and -1 for an error.

       Patterns are arbitrary Boolean combinations (with ! || &&) of regular
       expressions and relational expressions.  Regular expressions are as
       defined in re_format(7).  Isolated regular expressions in a pattern apply
       to the entire line.  Regular expressions may also occur in relational
       expressions, using the operators ~ and !~.  /re/ is a constant regular
       expression; any string (constant or variable) may be used as a regular
       expression, except in the position of an isolated regular expression in a
       pattern.

       A pattern may consist of two patterns separated by a comma; in this case,
       the action is performed for all lines from an occurrence of the first
       pattern though an occurrence of the second.

       A relational expression is one of the following:

              expression matchop regular-expression
              expression relop expression
              expression in array-name
              (expr,expr,...) in array-name

       where a relop is any of the six relational operators in C, and a matchop
       is either ~ (matches) or !~ (does not match).  A conditional is an
       arithmetic expression, a relational expression, or a Boolean combination
       of these.

       The special patterns BEGIN and END may be used to capture control before
       the first input line is read and after the last.  BEGIN and END do not
       combine with other patterns.  They may appear multiple times in a program
       and execute in the order they are read by awk.

       Variable names with special meanings:


       ARGC argument count, assignable.
       ARGV argument array, assignable; non-null members are taken as filenames.
       CONVFMT
            conversion format used when converting numbers (default %.6g).
       ENVIRON
            array of environment variables; subscripts are names.
       FILENAME
            the name of the current input file.
       FNR  ordinal number of the current record in the current file.
       FS   regular expression used to separate fields; also settable by option
            -Ffs.
       NF   number of fields in the current record.
       NR   ordinal number of the current record.
       OFMT output format for numbers (default %.6g).
       OFS  output field separator (default space).
       ORS  output record separator (default newline).
       RLENGTH
            the length of a string matched by match.
       RS   input record separator (default newline).  If empty, blank lines
            separate records.  If more than one character long, RS is treated as
            a regular expression, and records are separated by text matching the
            expression.
       RSTART
            the start position of a string matched by match.
       SUBSEP
            separates multiple subscripts (default 034).

       Functions may be defined (at the position of a pattern-action statement)
       thus:

              function foo(a, b, c) { ...; return x }

       Parameters are passed by value if scalar and by reference if array name;
       functions may be called recursively.  Parameters are local to the
       function; all other variables are global.  Thus local variables may be
       created by providing excess parameters in the function definition.

ENVIRONMENT VARIABLES
       If POSIXLY_CORRECT is set in the environment, then awk follows the POSIX
       rules for sub and gsub with respect to consecutive backslashes and
       ampersands.

EXAMPLES
       length($0) > 72
              Print lines longer than 72 characters.

       { print $2, $1 }
              Print first two fields in opposite order.

       BEGIN { FS = ",[ \t]*|[ \t]+" }
             { print $2, $1 }
              Same, with input fields separated by comma and/or spaces and tabs.

            { s += $1 }
       END  { print "sum is", s, " average is", s/NR }
              Add up first column, print sum and average.

       /start/, /stop/
              Print all lines between start/stop pairs.

       BEGIN     {    # Simulate echo(1)
            for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
            printf "\n"
            exit }

SEE ALSO
       grep(1), lex(1), sed(1)
       A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming
       Language, Addison-Wesley, 1988.  ISBN 0-201-07981-X.

BUGS
       There are no explicit conversions between numbers and strings.  To force
       an expression to be treated as a number add 0 to it; to force it to be
       treated as a string concatenate "" to it.

       The scope rules for variables in functions are a botch; the syntax is
       worse.

       Only eight-bit characters sets are handled correctly.



                                   2020-11-24                             AWK(1)
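The examples in the man page are shown as bare awk programs. As a small sketch of how one of them might be run from the shell, here is the sum-and-average example wrapped in a pipeline. The three input lines are made-up sample data (an assumption), and the leading space in " average is" has been dropped so the output has single spaces.

```shell
# Made-up sample input: a number in the first column of each line.
result=$(printf '10 a\n20 b\n30 c\n' |
    awk '{ s += $1 } END { print "sum is", s, "average is", s/NR }')
echo "$result"
# sum is 60 average is 20
```

The first block runs once per input line and accumulates the first field; the END block runs after the last line, where NR holds the total number of lines read.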
