
What is awk?
awk is a utility designed for text processing, typically used as a data extraction and reporting tool.
Where is awk?
The location of awk can be found using the which command:
username@computername ~ % which awk
/usr/bin/awk
How to get the computer’s Model Identifier?
The Model Identifier is reported by system_profiler under the SPHardwareDataType data type. Below is an example of the Hardware Overview it prints; the Model Identifier appears in this list.
username@computername ~ % system_profiler SPHardwareDataType
Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro16,2
      Processor Name: Quad-Core Intel Core i5
      Processor Speed: 2 GHz
      Number of Processors: 1
      Total Number of Cores: 4
      L2 Cache (per Core): 512 KB
      L3 Cache: 6 MB
      Hyper-Threading Technology: Enabled
      Memory: 16 GB
      System Firmware Version: 1715.40.15.0.0 (iBridge: 19.16.10548.0.0,0)
      OS Loader Version: 540.40.4~45
      Serial Number (system): C02DP4YSML7H
      Hardware UUID: 189F76C2-E9F6-533E-A4A6-9D87DBF5CBAC
      Provisioning UDID: 189F76C2-E9F6-533E-A4A6-9D87DBF5CBAC
      Activation Lock Status: Disabled
The end goal is to grab the Model Identifier from this list. The first step is to have awk print only the line that matches the pattern /Model Identifier/:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/'
      Model Identifier: MacBookPro16,2
awk treats this result as 3 separate fields (columns): (1) Model, (2) Identifier:, and (3) MacBookPro16,2. This is because awk uses whitespace (spaces and tabs) as its default field separator, so that is where awk recognizes boundaries when splitting a line into fields; leading whitespace is ignored. With awk we can call out specific pieces of this line by field number using the print statement.
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $1}'
Model
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $2}'
Identifier:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $3}'
MacBookPro16,2
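To make the field splitting concrete without needing a Mac at hand, the sketch below feeds awk a hard-coded copy of the matching line and prints every field next to its number. The `line` and `fields` variable names are assumptions made for illustration; `line` simply stands in for real system_profiler output.

```shell
#!/bin/sh
# Hypothetical sample: one line copied from the Hardware Overview above,
# including the leading indentation that system_profiler prints.
line='      Model Identifier: MacBookPro16,2'

# Loop over every field awk found on the line.
# NF holds the number of fields; $i is the content of field number i.
fields=$(printf '%s\n' "$line" | awk '{ for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

Run as written, this should list Model, Identifier:, and MacBookPro16,2 as fields 1 through 3, with the leading spaces not counted as a field.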
In this specific case, the useful information exists in $3. Since the useful information is expected to be the last field of the line, the command can also be written using NF (Number of Fields). NF holds the number of fields on the line, so $NF always refers to the last field:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Identifier/{print $NF}'
MacBookPro16,2
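The difference between NF (a count) and $NF (the content of the last field) is easy to mix up. The following is a minimal sketch, assuming a few sample lines in a hypothetical `sample` variable in place of real system_profiler output, that prints both values for each line.

```shell
#!/bin/sh
# Hypothetical sample standing in for `system_profiler SPHardwareDataType`.
sample='      Model Name: MacBook Pro
      Model Identifier: MacBookPro16,2
      Memory: 16 GB'

# For every line, print the field count (NF) followed by the last field ($NF).
report=$(printf '%s\n' "$sample" | awk '{ print NF, $NF }')

printf '%s\n' "$report"
```

With this sample the lines have 4, 3, and 3 fields, and $NF yields Pro, MacBookPro16,2, and GB respectively. Note that $NF returns only the last whitespace-separated word, which is why it works well for Model Identifier but captures only part of a value such as "MacBook Pro".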
Going a step further, this line can be used with command substitution to store this value into a variable that could be used in a script. For this, the command will look like:
username@computername ~ % modelIdentifier=$(system_profiler SPHardwareDataType | awk '/Model Identifier/{print $NF}')
username@computername ~ % echo $modelIdentifier
MacBookPro16,2
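As one possible illustration of using the stored value in a script, the sketch below branches on the model family. The `hardware_overview` function and the `family` variable are hypothetical: the function prints sample text in place of system_profiler so the example behaves the same on any machine, and on a Mac the real command could be swapped back in.

```shell
#!/bin/sh
# Hypothetical helper: prints sample text instead of calling system_profiler.
hardware_overview() {
    printf '%s\n' \
        '      Model Name: MacBook Pro' \
        '      Model Identifier: MacBookPro16,2'
}

# Same pipeline as above, with the helper standing in for system_profiler.
modelIdentifier=$(hardware_overview | awk '/Model Identifier/{print $NF}')

# Branch on the stored value. Quoting keeps an empty result from breaking the test.
case "$modelIdentifier" in
    MacBookPro*) family="MacBook Pro" ;;
    MacBookAir*) family="MacBook Air" ;;
    "")          family="unknown (no Model Identifier line found)" ;;
    *)           family="other" ;;
esac

echo "Model Identifier: $modelIdentifier ($family)"
```

The empty-string branch is there because awk prints nothing when no line matches the pattern, which leaves the variable empty rather than raising an error.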
If multiple fields need to be printed, they can be listed within the same print statement. With nothing between them, the fields are concatenated:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $3 $4}'
MacBookPro
If a space between the fields is desired, separate them with a comma:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $3, $4}'
MacBook Pro
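The space produced by the comma is awk's output field separator, OFS, which defaults to a single space. The sketch below compares the two print forms and then overrides OFS with the -v option; the `line`, `joined`, `spaced`, and `dashed` names are assumptions chosen for illustration, with `line` standing in for real system_profiler output.

```shell
#!/bin/sh
# Hypothetical sample line standing in for real system_profiler output.
line='      Model Name: MacBook Pro'

# No comma: the two fields are concatenated with nothing between them.
joined=$(printf '%s\n' "$line" | awk '{ print $3 $4 }')

# Comma: awk inserts the output field separator OFS (a space by default).
spaced=$(printf '%s\n' "$line" | awk '{ print $3, $4 }')

# Assigning OFS with -v changes what the comma inserts.
dashed=$(printf '%s\n' "$line" | awk -v OFS='-' '{ print $3, $4 }')

printf '%s\n' "$joined" "$spaced" "$dashed"
```

This should print MacBookPro, MacBook Pro, and MacBook-Pro on three lines.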
And if counting from the end with NF is the way to go for multiple values, $(NF-1) refers to the second-to-last field:
username@computername ~ % system_profiler SPHardwareDataType | awk '/Model Name/{print $(NF-1), $NF}'
MacBook Pro
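When the value itself contains spaces, listing fields one by one requires knowing how many words it has. One alternative, offered here only as a sketch and not as the approach used above, is to change the input field separator with awk's -F option so the line is split at the colon and space that follow the label. The `sample` and `modelName` names are assumptions, and `sample` stands in for real system_profiler output.

```shell
#!/bin/sh
# Hypothetical sample standing in for `system_profiler SPHardwareDataType`.
sample='      Model Name: MacBook Pro
      Model Identifier: MacBookPro16,2'

# -F ': ' splits each line at "colon space", so $1 is the label
# (with its leading indentation) and $2 is the whole value.
modelName=$(printf '%s\n' "$sample" | awk -F': ' '/Model Name/{ print $2 }')

printf '%s\n' "$modelName"
```

This should print MacBook Pro regardless of how many words the value has. A value that itself contains a colon followed by a space, such as the System Firmware Version line, would be cut at that point, so this form fits best where the value is known to be free of that sequence.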
For more information on awk:
username@computername ~ % man awk
AWK(1)                      General Commands Manual                     AWK(1)

NAME
     awk - pattern-directed scanning and processing language

SYNOPSIS
     awk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ... ]

DESCRIPTION
     Awk scans each input file for lines that match any of a set of patterns specified literally in prog or in one or more files specified as -f progfile. With each pattern there can be an associated action that will be performed when a line of a file matches the pattern. Each line is matched against the pattern portion of every pattern-action statement; the associated action is performed for each matched pattern. The file name - means the standard input. Any file of the form var=value is treated as an assignment, not a filename, and is executed at the time it would have been opened if it were a filename. The option -v followed by var=value is an assignment to be done before prog is executed; any number of -v options may be present. The -F fs option defines the input field separator to be the regular expression fs.

     An input line is normally made up of fields separated by white space, or by the regular expression FS. The fields are denoted $1, $2, ..., while $0 refers to the entire line. If FS is null, the input line is split into one field per character.

     A pattern-action statement has the form:

           pattern { action }

     A missing { action } means print the line; a missing pattern always matches. Pattern-action statements are separated by newlines or semicolons.

     An action is a sequence of statements. A statement can be one of the following:

           if( expression ) statement [ else statement ]
           while( expression ) statement
           for( expression ; expression ; expression ) statement
           for( var in array ) statement
           do statement while( expression )
           break
           continue
           { [ statement ... ] }
           expression              # commonly var = expression
           print [ expression-list ] [ > expression ]
           printf format [ , expression-list ] [ > expression ]
           return [ expression ]
           next                    # skip remaining patterns on this input line
           nextfile                # skip rest of this file, open next, start at top
           delete array[ expression ]# delete an array element
           delete array            # delete all elements of array
           exit [ expression ]     # exit immediately; status is expression

     Statements are terminated by semicolons, newlines or right braces. An empty expression-list stands for $0. String constants are quoted " ", with the usual C escapes recognized within. Expressions take on string or numeric values as appropriate, and are built using the operators + - * / % ^ (exponentiation), and concatenation (indicated by white space). The operators ! ++ -- += -= *= /= %= ^= > >= < <= == != ?: are also available in expressions. Variables may be scalars, array elements (denoted x[i]) or fields. Variables are initialized to the null string. Array subscripts may be any string, not necessarily numeric; this allows for a form of associative memory. Multiple subscripts such as [i,j,k] are permitted; the constituents are concatenated, separated by the value of SUBSEP.

     The print statement prints its arguments on the standard output (or on a file if > file or >> file is present or on a pipe if | cmd is present), separated by the current output field separator, and terminated by the output record separator. file and cmd may be literal names or parenthesized expressions; identical string values in different statements denote the same open file. The printf statement formats its expression list according to the format (see printf(3)). The built-in function close(expr) closes the file or pipe expr. The built-in function fflush(expr) flushes any buffered output for the file or pipe expr.

     The mathematical functions atan2, cos, exp, log, sin, and sqrt are built in. Other built-in functions:

     length  the length of its argument taken as a string, number of elements in an array for an array argument, or length of $0 if no argument.

     rand    random number on [0,1).

     srand   sets seed for rand and returns the previous seed.

     int     truncates to an integer value.

     substr(s, m [, n])
             the n-character substring of s that begins at position m counted from 1. If no n, use the rest of the string.

     index(s, t)
             the position in s where the string t occurs, or 0 if it does not.

     match(s, r)
             the position in s where the regular expression r occurs, or 0 if it does not. The variables RSTART and RLENGTH are set to the position and length of the matched string.

     split(s, a [, fs])
             splits the string s into array elements a[1], a[2], ..., a[n], and returns n. The separation is done with the regular expression fs or with the field separator FS if fs is not given. An empty string as field separator splits the string into one array element per character.

     sub(r, t [, s])
             substitutes t for the first occurrence of the regular expression r in the string s. If s is not given, $0 is used.

     gsub(r, t [, s])
             same as sub except that all occurrences of the regular expression are replaced; sub and gsub return the number of replacements.

     sprintf(fmt, expr, ...)
             the string resulting from formatting expr ... according to the printf(3) format fmt.

     system(cmd)
             executes cmd and returns its exit status. This will be -1 upon error, cmd's exit status upon a normal exit, 256 + sig upon death-by-signal, where sig is the number of the murdering signal, or 512 + sig if there was a core dump.

     tolower(str)
             returns a copy of str with all upper-case characters translated to their corresponding lower-case equivalents.

     toupper(str)
             returns a copy of str with all lower-case characters translated to their corresponding upper-case equivalents.

     The ``function'' getline sets $0 to the next input record from the current input file; getline < file sets $0 to the next record from file. getline x sets variable x instead. Finally, cmd | getline pipes the output of cmd into getline; each call of getline returns the next line of output from cmd. In all cases, getline returns 1 for a successful input, 0 for end of file, and -1 for an error.

     Patterns are arbitrary Boolean combinations (with ! || &&) of regular expressions and relational expressions. Regular expressions are as defined in re_format(7). Isolated regular expressions in a pattern apply to the entire line. Regular expressions may also occur in relational expressions, using the operators ~ and !~. /re/ is a constant regular expression; any string (constant or variable) may be used as a regular expression, except in the position of an isolated regular expression in a pattern.

     A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern though an occurrence of the second.

     A relational expression is one of the following:

           expression matchop regular-expression
           expression relop expression
           expression in array-name
           (expr,expr,...) in array-name

     where a relop is any of the six relational operators in C, and a matchop is either ~ (matches) or !~ (does not match). A conditional is an arithmetic expression, a relational expression, or a Boolean combination of these.

     The special patterns BEGIN and END may be used to capture control before the first input line is read and after the last. BEGIN and END do not combine with other patterns. They may appear multiple times in a program and execute in the order they are read by awk.

     Variable names with special meanings:

     ARGC      argument count, assignable.
     ARGV      argument array, assignable; non-null members are taken as filenames.
     CONVFMT   conversion format used when converting numbers (default %.6g).
     ENVIRON   array of environment variables; subscripts are names.
     FILENAME  the name of the current input file.
     FNR       ordinal number of the current record in the current file.
     FS        regular expression used to separate fields; also settable by option -Ffs.
     NF        number of fields in the current record.
     NR        ordinal number of the current record.
     OFMT      output format for numbers (default %.6g).
     OFS       output field separator (default space).
     ORS       output record separator (default newline).
     RLENGTH   the length of a string matched by match.
     RS        input record separator (default newline). If empty, blank lines separate records. If more than one character long, RS is treated as a regular expression, and records are separated by text matching the expression.
     RSTART    the start position of a string matched by match.
     SUBSEP    separates multiple subscripts (default 034).

     Functions may be defined (at the position of a pattern-action statement) thus:

           function foo(a, b, c) { ...; return x }

     Parameters are passed by value if scalar and by reference if array name; functions may be called recursively. Parameters are local to the function; all other variables are global. Thus local variables may be created by providing excess parameters in the function definition.

ENVIRONMENT VARIABLES
     If POSIXLY_CORRECT is set in the environment, then awk follows the POSIX rules for sub and gsub with respect to consecutive backslashes and ampersands.

EXAMPLES
     length($0) > 72
             Print lines longer than 72 characters.

     { print $2, $1 }
             Print first two fields in opposite order.

     BEGIN { FS = ",[ \t]*|[ \t]+" }
           { print $2, $1 }
             Same, with input fields separated by comma and/or spaces and tabs.

          { s += $1 }
     END  { print "sum is", s, " average is", s/NR }
             Add up first column, print sum and average.

     /start/, /stop/
             Print all lines between start/stop pairs.

     BEGIN {     # Simulate echo(1)
          for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
          printf "\n"
          exit }

SEE ALSO
     grep(1), lex(1), sed(1)
     A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming Language, Addison-Wesley, 1988. ISBN 0-201-07981-X.

BUGS
     There are no explicit conversions between numbers and strings. To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate "" to it.

     The scope rules for variables in functions are a botch; the syntax is worse.

     Only eight-bit characters sets are handled correctly.

                                  2020-11-24                            AWK(1)