awk(1)                MPE/iX Shell and Utilities                awk(1)
  ______________________________________________________________________

  NAME
       awk -- data transformation, report generation language

  SYNOPSIS
       awk [-F ere] [-f prog] [-v var=value ...]  [program] [var=value
       ...]  [file...]

  DESCRIPTION
       awk is a file-processing language which is well-suited to data
       manipulation and retrieval of information from text files.  This
       reference page provides a full technical description of awk.  If
       you are unfamiliar with the language, you may find it helpful to
       read the awk Tutorial in the User's Guide before reading the fol-
       lowing material.

       An awk program consists of any number of user-defined functions
       and rules of the form:

          pattern {action}

       There are two ways to specify the awk program:

       (a)
          Directly on the command line.  In this case, the program is a
          single command line argument, usually enclosed in apostrophes
          (') to prevent the shell from attempting to expand it.

       (b)
          By using the -f prog option.

       You can only specify program directly on the command line if you
       do not use any -f prog arguments.

       When you specify files on the command line, those files provide
       the input data for awk to manipulate.  If you specify no such
       files or you specify - as a file, awk reads data from the stan-
       dard input.

       You can initialize variables on the command line using

          var=value

       You can intersperse such initializations with the names of input
       files on the command line.  awk processes initializations and
       input files in the order they appear on the command line.  For
       example, the command

          awk -f progfile a=1 f1 f2 a=2 f3

       sets a to 1 before reading input from f1 and sets a to 2 before
       reading input from f3.
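
       The ordering can be checked with a small sketch.  It assumes a
       POSIX-style shell, an awk on the search path, and the mktemp
       utility; the files f1, f2, and f3 are hypothetical scratch files
       created only for the demonstration.

```shell
# Sketch: the a=1 and a=2 assignments take effect between input files.
tmp=$(mktemp -d)                  # scratch directory for hypothetical files
printf 'x\n' > "$tmp/f1"
printf 'y\n' > "$tmp/f2"
printf 'z\n' > "$tmp/f3"
out=$(awk '{ print a, $0 }' a=1 "$tmp/f1" "$tmp/f2" a=2 "$tmp/f3")
rm -rf "$tmp"
printf '%s\n' "$out"
# 1 x
# 1 y
# 2 z
```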

       Variable initializations that appear before the first file on the
       command line are performed immediately after the BEGIN action.
       Initializations appearing after the last file are performed imme-
       diately before the END action.  For more information on BEGIN and
       END, see Patterns.

       The -v option lets you assign a value to a variable before the
       awk program begins running (that is, before the BEGIN action).
       For example, in

          awk -v v1=10 -f prog datafile

       awk assigns the variable v1 its value before the BEGIN action of
       the program (but after default assignments made to built-in
       variables such as FS and OFMT; these built-in variables have
       special meaning to awk, as described in later sections).
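
       The difference between -v and an ordinary var=value argument can
       be seen in a short sketch.  The variable names v1 and v2 are
       arbitrary, and the example assumes a POSIX-style shell and awk.

```shell
# Sketch: v1 (set with -v) is visible in BEGIN; v2 (an operand) is not.
out=$(printf 'record\n' | awk -v v1=10 '
BEGIN { print "BEGIN: v1=[" v1 "] v2=[" v2 "]" }
      { print "main:  v1=[" v1 "] v2=[" v2 "]" }
' v2=20 -)
printf '%s\n' "$out"
# BEGIN: v1=[10] v2=[]
# main:  v1=[10] v2=[20]
```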

       awk divides input into records. By default, newline characters
       separate records; however, you may specify a different record
       separator if you want.

       One at a time, and in order, awk compares each input record with
       the pattern of every rule in the program.  When a pattern
       matches, awk performs the action part of the rule on that input
       record.  Patterns and actions often refer to separate fields
       within a record.  By default, white space (usually blanks, new-
       lines, or horizontal tab characters) separates fields; however,
       you can specify a different field separator string using the -F
       ere option (see Input).

       You can omit the pattern or action part of an awk rule (but not
       both).  If you omit pattern, awk performs the action on every
       input record (that is, every record matches).  If you omit
       action, awk writes every record matching the pattern to the stan-
       dard output.
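
       As an illustration, the following sketch runs one rule that has
       only a pattern and one that has only an action.  The sample
       input is invented for the example.

```shell
# Sketch: a rule may omit the action or the pattern, but not both.
data=$(printf 'alpha\nbeta\ngamma\n')
only_pattern=$(printf '%s\n' "$data" | awk '/et/')                 # prints matching records
only_action=$(printf '%s\n' "$data" | awk '{ print length($0) }')  # runs on every record
printf '%s\n' "$only_pattern" "$only_action"
# beta
# 5
# 4
# 5
```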

       awk considers everything after a # in a program line to be a com-
       ment.  For example:

          # This is a comment

       To continue program lines on the next line, add a backslash (\)
       to the end of the line.  Statement lines ending with a comma (,),
       double or-bars (||), or double ampersands (&&) continue automati-
       cally on the next line.
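
       A sketch of comments and both kinds of continuation follows.
       The input values are invented, and the example assumes a
       POSIX-style shell and awk.

```shell
# Sketch: comments, automatic continuation after &&, and backslash continuation.
out=$(printf '5 a\n15 b\n25 c\n' | awk '
# Select records whose first field lies strictly between 10 and 20.
# The pattern continues on the next line after the trailing &&.
$1 > 10 &&
$1 < 20 {
    msg = $2 \
          " is in range"
    print msg
}')
printf '%s\n' "$out"
# b is in range
```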

  Options
       awk accepts the following options:

       -F ere
            specifies an extended regular expression to use as the field
            separator.

       -f prog
            runs the awk program contained in the file prog. When more
            than one -f option appears on the command line, the result-
            ing program is a concatenation of all programs you specify.

       -v var=value
            assigns value to var before running the program.  You can
            specify this option a number of times.

  Variables and Expressions
       There are three types of variables in awk:  identifiers, fields,
       and array elements.

       An identifier is a sequence of letters, digits, and underscores
       beginning with a letter or an underscore.

       For a description of fields, see the Input subsection.

       Arrays are associative collections of values called the elements
       of the array.  Constructs of the form,

          identifier[subscript]

       where subscript has the form expr or expr,expr,..., reference
       array elements.  Each such expr can have any string value.  For
       multiple expr subscripts, awk concatenates the string values of
       all exprs with the separator character SUBSEP between each.  The
       initial value of SUBSEP is \034 (the ASCII file separator).
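
       The following sketch shows that a multiple subscript is the
       same as a single string built with SUBSEP.  The array name cell
       and the subscript values are invented for the example.

```shell
# Sketch: cell["row1", "col2"] and cell["row1" SUBSEP "col2"] are one element.
out=$(awk 'BEGIN {
    cell["row1", "col2"] = 42            # two-part subscript
    key = "row1" SUBSEP "col2"           # the same subscript built by hand
    print ((key in cell) ? "same element" : "different element")
    print cell[key]
    print length(SUBSEP)                 # SUBSEP is a single character
}')
printf '%s\n' "$out"
# same element
# 42
# 1
```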

       We sometimes refer to fields and identifiers as scalar variables
       to distinguish them from arrays.

       You do not declare awk variables and you do not need to initial-
       ize them.  The value of an uninitialized variable is the empty
       string in a string context and the number 0 in a numeric context.

       Expressions consist of constants, variables, functions, regular
       expressions, and subscript-in-array conditions (described later)
       combined with operators.  Each variable and expression has a
       string value and a corresponding numeric value; awk uses the
       value appropriate to the context.

       When converting a numeric value to its corresponding string
       value, awk performs the equivalent of a call to the sprintf func-
       tion (see Built-in String Functions) where the one and only expr
       argument is the numeric value and the fmt argument is either %d
       (if the numeric value is an integer) or the value of the variable
       CONVFMT (if the numeric value is not an integer).  The default
       value of CONVFMT is %.6g.  If you use a string in a numeric con-
       text, and awk cannot interpret the contents of the string as a
       number, it treats the value of the string as zero.
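
       The conversion rules can be sketched as follows.  Concatenating
       with the empty string is used here only as a convenient way to
       force a string context; the variable names are arbitrary.

```shell
# Sketch: number-to-string conversion uses %d for integers, CONVFMT otherwise.
out=$(awk 'BEGIN {
    x = 3.14159265
    s = x ""             # not an integer: converted with CONVFMT (default %.6g)
    n = 17 ""            # integer: converted as if with %d
    z = "abc" + 0        # string that is not a number counts as zero
    CONVFMT = "%.2f"
    t = x ""             # same value, new CONVFMT
    print s; print n; print z; print t
}')
printf '%s\n' "$out"
# 3.14159
# 17
# 0
# 3.14
```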

       Numeric constants are sequences of decimal digits.

       String constants are quoted, as in "a literal string".  Literal
       strings can contain the escape sequences shown in Table 1, Escape
       Sequences in awk Literal Strings.

                       _________________________________
                       | Escape | Character             |
                       |________|_______________________|
                       |   \a   | audible bell          |
                       |   \b   | backspace             |
                       |   \f   | formfeed              |
                       |   \n   | newline               |
                       |   \r   | carriage return       |
                       |   \t   | horizontal tab        |
                       |   \v   | vertical tab          |
                       |  \ooo  | octal value ooo       |
                       |  \xdd  | hexadecimal value dd  |
                       |   \/   | slash                 |
                       |   \"   | quote                 |
                       |   \c   | any other character c |
                       |________|_______________________|

               Table 1: Escape Sequences in awk Literal Strings

       awk supports full regular expressions (see regexp(3)).  When awk
       reads a program, it compiles characters enclosed in slash charac-
       ters (/) as regular expressions. In addition, when literal
       strings and variables appear on the right side of a ~ or !~ oper-
       ator, or as certain arguments to built-in matching and substitu-
       tion functions, awk interprets them as dynamic regular expres-
       sions.
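
       A minimal sketch of the two forms follows.  The variable name re
       and the input lines are invented.  Because the string form
       passes through string escape processing first, each backslash
       intended for the regular expression is written twice.

```shell
# Sketch: a regular expression between slashes versus a dynamic one in a string.
out=$(printf 'e.g. this\negg that\n' | awk '
BEGIN { re = "e\\.g\\." }               # string form: each backslash doubled
/e\.g\./   { print "literal: " $0 }     # regular expression between slashes
$0 ~ re    { print "dynamic: " $0 }     # variable used as a dynamic regular expression
')
printf '%s\n' "$out"
# literal: e.g. this
# dynamic: e.g. this
```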

       Note: When you use literal strings as regular expressions, you
       need extra backslashes to escape regular expression metacharac-
       ters, since the backslash is also the literal string escape char-
       acter.  For example the regular expression,

          /e\.g\./

       when written as a string is:

          "e\\.g\\."

       awk defines the subscript in array condition as:

          index in array

       where index looks like expr or (expr,...,expr). This condition
       evaluates to 1 if the string value of index is a subscript of
       array, and to 0 otherwise.  This is a way to determine if an
       array element exists.  When the element does not exist, this con-
       dition does not create it.
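
       The sketch below contrasts the in condition with a plain
       reference.  That a plain reference to a missing element creates
       it is standard awk behavior and is assumed, not confirmed, for
       this implementation; the array name seen is arbitrary.

```shell
# Sketch: testing with "in" leaves the array alone; a reference adds an element.
out=$(awk 'BEGIN {
    seen["x"] = 1
    print (("y" in seen) ? "y present" : "y absent")
    n = 0; for (k in seen) n++
    print n " element(s) after the in test"
    unused = seen["y"]        # referencing an element creates it (empty value)
    n = 0; for (k in seen) n++
    print n " element(s) after a plain reference"
}')
printf '%s\n' "$out"
# y absent
# 1 element(s) after the in test
# 2 element(s) after a plain reference
```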

  Symbol Table
       You can access the symbol table through the built-in array
       SYMTAB.

          SYMTAB[expr]

       is equivalent to the variable named by the evaluation of expr.
       For example,

          SYMTAB["var"]

       is a synonym for the variable var.

  Environment
       An awk program can determine its initial environment by examining
       the ENVIRON array.  If the environment consists of entries of the
       form:

          name=value

       then

          ENVIRON[name]

       has string value

          "value"

       For example, the following program is equivalent to the default
       output of env(1):

          BEGIN {
               for (i in ENVIRON)
                    printf("%s=%s\n", i, ENVIRON[i])
               exit
          }
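
       A single entry can also be looked up by name.  In this sketch,
       DEMO_GREETING is a hypothetical environment variable set only
       for the one command.

```shell
# Sketch: read one environment entry through ENVIRON.
out=$(DEMO_GREETING=hello awk 'BEGIN { print ENVIRON["DEMO_GREETING"] }')
printf '%s\n' "$out"
# hello
```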

  Operators
       awk follows the usual precedence order of arithmetic operations,
       unless overridden with parentheses; a table giving the order of
       operations appears later in this section.

       The unary operators are +, -, ++, and --, where you can use the
       ++ and -- operators as either postfix or prefix operators, as in
       C.  The binary arithmetic operators are +, -, *, /, %, and ^.

       The conditional operator

          expr ? expr1 : expr2

       evaluates to expr1 if the value of expr is non-zero, and to
       expr2 otherwise.

       If two expressions are not separated by an operator, awk con-
       catenates their string values.

       The operator ~ yields 1 (true) if the regular expression on the
       right side matches the string on the left side.  The operator !~
       yields 1 when the right side has no match on the left.  To illus-
       trate:

          $2 ~ /[0-9]/

       selects any line where the second field contains at least one
       digit.  awk interprets any string or variable on the right side
       of ~ or !~ as a dynamic regular expression.

       The relational operators are <, <=, >, >=, ==, and !=.  When both
       operands in a comparison are numeric, awk compares their values
       numerically; otherwise, it compares them as strings.  An operand
       is numeric if it is an integer or floating point number, if it is
       a field or ARGV element that looks like a number, or if it is a
       variable created by a command line assignment that looks like a
       number.
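
       The effect of these rules can be sketched with the values 10
       and 9, which order differently as numbers and as strings.  The
       input record is invented for the example.

```shell
# Sketch: fields that look like numbers compare numerically; strings do not.
out=$(echo '10 9' | awk '{
    print "fields:", ($1 < $2)          # both numeric: 10 < 9 is false
    print "strings:", ("10" < "9")      # both strings: "10" sorts before "9"
    print "mixed:", ($1 < "9")          # one string operand: string comparison
}')
printf '%s\n' "$out"
# fields: 0
# strings: 1
# mixed: 1
```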

       The Boolean operators are || (or), && (and), and ! (not).  awk
       uses short-circuit evaluation for these expressions.  With an
       && expression, if the first operand is false, the entire
       expression is false and awk does not evaluate the second
       operand.  With an || expression, if the first operand is true,
       the entire expression is true and awk likewise skips the second
       operand.
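
       One common use of this behavior is guarding a division.  In the
       sketch below (variable names invented), the right side of && is
       never evaluated because the left side is false.

```shell
# Sketch: the left operand of && protects the division on the right.
out=$(awk 'BEGIN {
    n = 0; total = 10
    if (n != 0 && total / n > 1)
        print "large average"
    else
        print "division skipped"
}')
printf '%s\n' "$out"
# division skipped
```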

       You can assign values to a variable with

          var = expr

       If op is a binary arithmetic operator,

          var op= expr

       is equivalent to

          var = var op expr

       except that var is evaluated only once.

       See Table 2, awk Order of Operations, for the precedence rules of
       the operators.  The table lists operators from highest
       precedence to lowest.

           _________________________________________________________
           |                  Order of Operations                   |
           |________________________________________________________|
           | (A)             | grouping                             |
           |_________________|______________________________________|
           | $i  V[a]        | field, array element                 |
           |_________________|______________________________________|
           | V++  V--        | increment, decrement                 |
           | ++V  --V        |                                      |
           |_________________|______________________________________|
           | A^B             | exponentiation                       |
           |_________________|______________________________________|
           | +A  -A  !A      | unary plus, unary minus, logical NOT |
           |_________________|______________________________________|
           | A*B  A/B  A%B   | multiplication, division, remainder  |
           |_________________|______________________________________|
           | A+B  A-B        | addition, subtraction                |
           |_________________|______________________________________|
           | A B             | string concatenation                 |
           |_________________|______________________________________|
           | A<B  A>B  A<=B  | comparisons                          |
           | A>=B A!=B A==B  |                                      |
           |_________________|______________________________________|
           | A~B  A!~B       | regular expression matching          |
           |_________________|______________________________________|
           | A in V          | array membership                     |
           |_________________|______________________________________|
           | A && B          | logical AND                          |
           |_________________|______________________________________|
           | A || B          | logical OR                           |
           |_________________|______________________________________|
           | A ? B : C       | conditional expression               |
           |_________________|______________________________________|
           | V=B  V+=B  V-=B | assignment                           |
           | V*=B V/=B  V%=B |                                      |
           | V^=B            |                                      |
           |_________________|______________________________________|
           | A, B and C are any expression.                         |
           | i is any expression yielding an integer.               |
           | V is any variable.                                     |
           |________________________________________________________|

                       Table 2: awk Order of Operations
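
       A few expressions make the table concrete.  This is a sketch
       with arbitrary values, assuming the table is read from highest
       precedence at the top to lowest at the bottom.

```shell
# Sketch: operators higher in the table bind more tightly.
out=$(awk 'BEGIN {
    print 2 + 3 * 4            # multiplication before addition
    print 2 * 3 ^ 2            # exponentiation before multiplication
    print 1 " " 2 + 3          # addition before concatenation
}')
printf '%s\n' "$out"
# 14
# 18
# 1 5
```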

  Command Line Arguments
       awk sets the built-in variable ARGC to the number of command line
       arguments.  The built-in array ARGV has elements subscripted with
       integers from zero to ARGC-1, giving command line arguments in the
       order they appeared on the command line.

       The ARGC count and the ARGV vector do not include command line
       options (beginning with -) or the program file (following -f).
       They do include the name of the command itself, the names of
       input data files, and initialization statements of the form

          var=value

       awk actually creates ARGC and ARGV before doing anything else.
       It then walks through ARGV processing the arguments.  If an ele-
       ment of ARGV is an empty string, awk skips it.  If it contains an
       equals sign (=), awk interprets it as a variable assignment.  If
       it is a minus sign (-), awk immediately reads input from the
       standard input until it encounters the end-of-file; otherwise,
       awk treats the argument as a file name and reads input from that
       file until it reaches end-of-file.

       Note: awk runs the program by walking through ARGV in this way;
       thus if the program changes ARGV, awk can read different files
       and make different assignments.
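
       The sketch below lists ARGV and then shows a program changing
       it.  The names data.txt and no-such-file are hypothetical and
       are never opened: the first command has only a BEGIN action,
       and the second blanks the argument before awk reaches it.

```shell
# Sketch: options are not counted in ARGC/ARGV; operands are.
listed=$(awk -v opt=1 'BEGIN {
    print "ARGC=" ARGC
    for (i = 1; i < ARGC; i++)
        print i ": " ARGV[i]
}' x=1 data.txt)
printf '%s\n' "$listed"
# ARGC=3
# 1: x=1
# 2: data.txt

# Setting an element to the empty string makes awk skip it.
skipped=$(printf 'from stdin\n' | awk 'BEGIN { ARGV[1] = "" } { print }' no-such-file -)
printf '%s\n' "$skipped"
# from stdin
```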

  Input
       awk divides input into records.  A record separator character
       separates each record from the next.  The value of the built-in
       variable RS gives the current record separator character; by
       default, it begins as the newline (\n).  If you assign a differ-
       ent character to RS, awk uses that as the record separator char-
       acter from that point on.

       awk divides records into fields.  A field separator string, given
       by the value of the built-in variable FS, separates each field
       from the next.  You can set a specific separator string by
       assigning a value to FS, or by specifying the -F ere option on
       the command line.  You can assign a regular expression to FS.
       For example,

          FS = "[,:$]"

       says that commas, colons, or dollar signs can separate fields.
       As a special case, assigning FS a string containing only a blank
       character sets the field separator to white space.  In this case,
       awk considers any sequence of contiguous space and/or tab charac-
       ters a single field separator.  This is the default for FS; how-
       ever, if you assign FS a string containing any other character,
       that character designates the start of a new field.  For example,
       if we set FS="\t" (the tab character),

          texta \t textb \t  \t  \t textc

       contains five fields, two of which only contain blanks.  With the
       default setting, this record only contains three fields, since
       awk considers the sequence of multiple blanks and tabs a single
       separator.
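
       The field counts described above can be reproduced with a short
       sketch.  It assumes a POSIX-style shell whose printf expands \t
       to a tab character.

```shell
# Sketch: the same record split with FS="\t", with the default FS,
# and a second record split with the bracket expression [,:$].
record=$(printf 'texta \t textb \t  \t  \t textc')
with_tab=$(printf '%s\n' "$record" | awk 'BEGIN { FS = "\t" } { print NF }')
with_default=$(printf '%s\n' "$record" | awk '{ print NF }')
with_class=$(printf 'a,b:c$d\n' | awk 'BEGIN { FS = "[,:$]" } { print NF }')
printf '%s %s %s\n' "$with_tab" "$with_default" "$with_class"
# 5 3 4
```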

       The following list of built-in variables provides various pieces
       of information about input.

          NF       number of fields in the current record
          NR       number of records read so far
          FILENAME name of file containing current record
          FNR      number of records read from current file

       Field specifiers have the form $n where n runs from 1 through NF.
       Such a field specifier refers to the nth field of the current
       input record.  $0 (zero) refers to the entire current input
       record.
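
       A brief sketch of NR, NF, and field specifiers, using invented
       input:

```shell
# Sketch: NR counts records, NF counts fields, $NF is the last field.
out=$(printf 'a b c\nd e\n' | awk '{ print NR ": " NF " fields, first=" $1 ", last=" $NF }')
printf '%s\n' "$out"
# 1: 3 fields, first=a, last=c
# 2: 2 fields, first=d, last=e
```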

       The getline function can read a value for a variable or $0 from
       the current input, from a file, or from a pipe.  The result of
       getline is an integer indicating whether the read operation was
       successful.  A value of 1 indicates success; 0 indicates
       end-of-file encountered; and -1 indicates that an error occurred.
       Possible forms for getline are:

       getline
            reads the next input record into $0 and splits the record into
            fields.  NF, NR, and FNR are set appropriately.

       getline var
            reads the next input record into the variable var. awk does not
            split the record into fields (which means that the current
            $n values do not change), but sets NR and FNR appropriately.

       getline <expr
            interprets the string value of expr to be a file name.  awk
            reads the next record from that file into $0, splits it into
            fields, and sets NF appropriately.  If the file is not open,
            awk opens it.  The file remains open until you close it with
            a close function.

       getline var <expr
            interprets the string value of expr to be a file name, and
            reads the next record from that file into the variable var,
            but does not split it into fields.

       expr | getline
            interprets the string value of expr as a command line to be
            run.  awk pipes output from this command into getline, and
            reads it into $0 in a manner similar to getline <expr.  See
            the System Function section for additional details.

       expr | getline var
            runs the string value of expr as a command and pipes the
            output of the command into getline. The result is similar to
            getline var <expr.
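
       Two of these forms are sketched below: reading a file record by
       record and reading the output of a command.  The scratch file,
       the variable names, and the echo command are assumptions made
       for the example; mktemp is assumed to be available.

```shell
# Sketch: loop on the getline result; 1 means a record was read.
tmp=$(mktemp)                              # hypothetical scratch file
printf 'one\ntwo\nthree\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    while ((getline line < file) > 0)      # stops on 0 (end-of-file) or -1 (error)
        n++
    print n " lines; last was " line
    "echo hello" | getline greeting        # read one record from a command
    print greeting
}')
rm -f "$tmp"
printf '%s\n' "$out"
# 3 lines; last was three
# hello
```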

       You can only have a limited number of files and pipes open at one
       time.  You can close files and pipes during execution using the

          close(expr)

       function.  The expr must be one that came before | or after < in
       getline, or after > or >> in print or printf. For a description
       of print and printf, see the Output section.  If the function
       successfully closes the pipe, it returns zero.  By closing files
       and pipes that you no longer need, you can use any number of
       files and pipes in the course of running an awk program.
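
       Closing a file also resets its read position, because the next
       getline reopens it.  The sketch below assumes mktemp and uses a
       hypothetical scratch file.

```shell
# Sketch: after close(file), the next getline starts at the first record again.
tmp=$(mktemp)                     # hypothetical scratch file
printf 'one\ntwo\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    getline first < file
    status = close(file)          # zero when the close succeeds
    getline again < file          # file is reopened from the beginning
    print first, again, status
}')
rm -f "$tmp"
printf '%s\n' "$out"
# one one 0
```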

  Built-In Arithmetic Functions

       atan2(expr1, expr2)
            returns the arctangent of expr1/expr2 in the range of -pi
            through pi.

       exp(expr), log(expr), sqrt(expr)
            return the exponential, natural logarithm, and square root,
            respectively, of the numeric value of expr. If you omit
            (expr), these functions use $0 instead.

       int(expr)
            returns the integer part of the numeric value of expr. If
            you omit (expr), the function returns the integer part of
            $0.

       rand()
            returns a random floating-point number n in the range
            0 <= n < 1.

       sin(expr), cos(expr)
            return the sine and cosine, respectively, of the numeric
            value of expr (interpreted as an angle in radians).

       srand(expr)
            sets the seed of the rand function to the integer value of
            expr. If you omit (expr), awk uses the time of day as a
            default seed.
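
       The arithmetic functions can be exercised with a short sketch.
       The seed value 1 is arbitrary; the example relies on the common
       behavior that reseeding with the same value repeats the same
       random sequence.

```shell
# Sketch: int truncates toward zero; atan2(0, -1) yields pi; srand repeats rand.
out=$(awk 'BEGIN {
    print int(7.9), int(-7.9)
    print sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(0, -1)
    srand(1); a = rand()
    srand(1); b = rand()
    same = (a == b)
    in_range = (a >= 0 && a < 1)
    print same, in_range
}')
printf '%s\n' "$out"
# 7 -7
# 4 1 0
# 3.1416
# 1 1
```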

  Built-In String Functions

       n = gsub(regexp, repl, string)
            works the same way as sub, except that gsub replaces all
            matching substrings (global substitution).

       pos = index(string, str)
            returns the position of the first occurrence of str in
            string. If index does not find str in string, it returns
            zero.

       len = length(expr)
            returns the number of characters in the string value of
            expr. If you omit (expr), the function uses $0 instead.  The
            parentheses around expr are optional.

       pos = match(string, regexp)
            searches string for the first substring matching the regular
            expression regexp, and returns an integer giving the posi-
            tion of this substring counting from one.  If it finds no
            such substring, match returns zero.  This function also sets
            the built-in variable RSTART to pos and the built-in vari-
            able RLENGTH to the length of the matched string.  If it
            does not find a match, match sets RSTART to zero and RLENGTH
            to -1. You can enclose regexp in slashes or specify it as a
            string.

       n = ord(expr)
            returns the integer value of the first character in the
            string value of expr. This is useful in conjunction with %c
            in sprintf.

       n = split(string, array, regexp)
            splits the string into fields.  regexp is a regular expres-
            sion giving the field separator string for the purposes of
            this operation.  This function assigns the separate fields,
            in order, to the elements of array; subscripts for array
            begin at 1.  awk discards all other elements of array. split
            returns the number of fields into which it divided string
            (which is also the maximum subscript for array). regexp
            divides the record in the same way that the FS field separa-
            tor string does.  If you omit regexp in the call to split,
            it uses the current value of FS.

       str = sprintf(fmt, expr, expr...)
            formats the expression list expr, expr, ...  using specifi-
            cations from the string fmt, then returns the formatted
            string.  The fmt string consists of conversion specifica-
            tions which convert and add the next expr to the string, and
            ordinary characters which sprintf simply adds to the string.
            These conversion specifications are similar to those used by
            the ANSI C standard.
            Conversion specifications have the form

               %[-][0][x][.y]c

            where

               -    left justifies the field; default is right justification
               0    pads numbers with leading zeros instead of blanks
               x    is the minimum field width
               y    is the precision
               c    is the conversion character

            In a string, the precision is the maximum number of charac-
            ters to be printed from the string; in a number, the preci-
            sion is the number of digits to be printed to the right of
            the decimal point in a floating point value.  If x or y is *
            (asterisk), the minimum field width or precision is the
            value of the next expr in the call to sprintf.
            The conversion character c is one of the following:

               d    decimal integer
               i    decimal integer
               o    unsigned octal integer
               x,X  unsigned hexadecimal integer
               u    unsigned decimal integer
               f,F  floating point
               e,E  floating point (scientific notation)
               g,G  the shorter of e and f (suppresses non-significant zeros)
               c    single character of an integer value; first character of string
               s    string

            The lowercase x prints alphabetic hex digits in lowercase
            while the uppercase X prints alphabetic hex digits in upper-
            case.  The other upper/lowercase pairs work similarly.

       n = sub(regexp, repl, string)
            searches string for the first substring matching the
            extended regular expression regexp, and replaces the sub-
            string with the string repl. awk replaces any ampersand (&)
            in repl with the substring of string which matches regexp.
            You can suppress this special behavior by preceding the
            ampersand with a backslash.  If you omit string, sub uses
            the current record instead.  sub returns the number of sub-
            strings replaced (which is one if it found a match, and zero
            otherwise).

       str = substr(string, offset, len)
            returns the substring of string that begins in position off-
            set and is at most len characters long.  The first character
            of the string has an offset equal to one.  If you omit len,
            substr returns the rest of string.

       str = tolower(expr)
            converts all letters in the string value of expr into lower-
            case, and returns the result.  If you omit expr, tolower
            uses $0 instead.

       str = toupper(expr)
            converts all letters in the string value of expr into upper-
            case, and returns the result.  If you omit expr, toupper
            uses $0 instead.
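
       Several of the string functions are combined in the sketch
       below.  The sample strings and variable names are invented, and
       only functions common to POSIX awk implementations are used, so
       ord is not shown.

```shell
# Sketch: searching, splitting, substituting, and formatting strings.
out=$(awk 'BEGIN {
    s = "error at line 42"
    print index(s, "line"), length(s)
    print match(s, /[0-9]+/), RSTART, RLENGTH
    print substr(s, RSTART, RLENGTH), substr(s, 10)
    n = split("2024-01-15", part, "-")
    print n, part[1], part[2], part[3]
    t = s; n = sub(/r/, "R", t);    print n, t    # first match only
    t = s; n = gsub(/r/, "[&]", t); print n, t    # every match; & is the matched text
    print toupper(substr(s, 1, 5)), tolower("ABC")
    print sprintf("[%5d] [%-5d] [%05d] [%.2f] [%.3s] [%x]", 42, 42, 42, 3.14159, "abcdef", 255)
}')
printf '%s\n' "$out"
# 10 16
# 15 15 2
# 42 line 42
# 3 2024 01 15
# 1 eRror at line 42
# 3 e[r][r]o[r] at line 42
# ERROR abc
# [   42] [42   ] [00042] [3.14] [abc] [ff]
```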

  System Function

       status = system(expr)
            runs the string value of expr as a command.  For example,

               system("tail " $1)

            calls the tail command, using the string value of $1 as the
            file that tail examines.  The MPE/iX Shell runs the command
            as discussed in the PORTABILITY section, and the exit status
            returned depends on that command interpreter.
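
       A minimal sketch of system follows.  It assumes that the command
       interpreter reports zero for a successful command, which is the
       usual shell convention; the echo command is only a stand-in.

```shell
# Sketch: the command writes to standard output; awk receives its exit status.
out=$(awk 'BEGIN {
    status = system("echo from the command")
    print "status=" status
}')
printf '%s\n' "$out"
# from the command
# status=0
```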

  User-Defined Functions
       You can define your own functions using the form

          function name(parameter-list) {
               statements
          }

       A function definition can appear in the place of a pattern
       {action} rule.  The parameter-list contains any number of normal
       (scalar) and array variables separated by commas.  When you call
       a function, awk passes scalar arguments by value, and array argu-
       ments by reference.  The names specified in the parameter-list
       are local to the function; all other names used in the function
       are global.  You can define local variables by adding them to the
       end of the parameter list as long as no call to the function uses
       these extra parameters.

       A function returns to its caller either when it performs the
       final statement in the function, or when it reaches an explicit
       return statement.  The return value, if any, is specified in the
       return statement (see the Actions section).
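
       The parameter rules are sketched below.  The function names
       scale and bump are invented; the extra spaces before i in the
       parameter list are only a common convention for marking a local
       variable.

```shell
# Sketch: arrays are passed by reference, scalars by value, extra parameters are locals.
out=$(awk '
function scale(arr, n, factor,    i) {     # i is local to scale
    for (i = 1; i <= n; i++)
        arr[i] *= factor                   # changes the array in the caller
}
function bump(x) {
    x++                                    # changes only the local copy
    return x
}
BEGIN {
    v[1] = 10; v[2] = 20
    i = "global i"
    scale(v, 2, 3)
    k = 1; r = bump(k)
    print v[1], v[2], k, r, i
}')
printf '%s\n' "$out"
# 30 60 1 2 global i
```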

  Patterns
       A pattern is a regular expression, a special pattern, a pattern
       range, or any arithmetic expression.

       BEGIN is a special pattern used to label actions that awk per-
       forms before reading any input records.  END is a special pattern
       used to label actions that awk performs after reading all input
       records.

       You can give a pattern range as

          pattern1,pattern2

       This matches every line from one that matches pattern1 through
       the next line that matches pattern2, inclusive.

       If you omit a pattern, or if the numeric value of the pattern is
       non-zero (true), awk performs the associated action for the line.
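
       As a small illustration of a pattern range, the following com-
       mand line prints only the records from the first one matching
       start through the next one matching stop.  The input text is made
       up for the example.

```shell
out=$(printf 'a\nstart\nb\nstop\nc\n' |
    awk '/start/,/stop/ { print NR ": " $0 }')   # range is inclusive
echo "$out"
```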

  Actions
       An action is a series of statements terminated by semicolons,
       newlines, or closing braces.  A condition is any expression; awk
       considers a non-zero value true, and a zero value false.  A
       statement is one of the following or any series of statements
       enclosed in braces.

          # expression statement, e.g. assignment
          expression

          # if statement
          if (condition)
               statement
          [else
               statement]

          # while loop
          while (condition)
               statement

          # do-while loop
          do
               statement
          while (condition)

          # for loop
          for (expression1; condition; expression2)
               statement

       The for statement is equivalent to:

          expression1
          while (condition) {
               statement
               expression2
          }
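
       The equivalence can be checked with a short sketch such as the
       following, in which both loops build the same string.  The vari-
       able names are arbitrary.

```shell
out=$(awk 'BEGIN {
    for (i = 1; i <= 3; i++)      # for form
        a = a i
    j = 1                         # equivalent while form
    while (j <= 3) {
        b = b j
        j++
    }
    print a, b
}')
echo "$out"
```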

       The for statement can also have the form

          for (i in array)
               statement

       awk performs the statement once for each element in array; on
       each repetition, the variable i contains the name of a subscript
       of array, running through all the subscripts in an arbitrary
       order.  If array is multi-dimensional (has multiple subscripts),
       i is expressed as a single string with the SUBSEP character
       separating the subscripts.
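
       One way to recover the individual subscripts of a multi-dimen-
       sional array inside such a loop is to split the combined sub-
       script on SUBSEP, as in this sketch.  The array name qty and its
       contents are hypothetical.

```shell
out=$(awk 'BEGIN {
    qty["widget", "jan"] = 4
    for (key in qty) {
        split(key, part, SUBSEP)    # part[1] and part[2] hold the subscripts
        print part[1], part[2], qty[key]
    }
}')
echo "$out"
```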

       The statement

          break

       exits a for, while, or do-while loop immediately.

          continue

       stops the current iteration of a for, while, or do-while loop and
       begins the next iteration (if there is one).

          next

       terminates any processing for the current input record and imme-
       diately starts processing the next input record.  Processing for
       the next record begins with the first appropriate rule.

          exit[(expr)]

       immediately goes to the END action if it exists; if there is no
       END action, or if awk is already performing the END action, the
       awk program terminates.  awk sets the exit status of the program
       to the numeric value of expr. If you omit (expr), the exit status
       is 0.

          return [expr]

       returns from the execution of a function.  If you specify an
       expr, the function returns the value of the expression as its
       result; otherwise, the function result is undefined.

          delete array[i]

       deletes element i from the given array.

          print expr, expr, ...

       is described in the Output subsection.

          printf fmt, expr, expr, ...

       is also described in the Output subsection.
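
       The following sketch combines the next and exit statements
       described above.  It skips comment records, stops at the first
       record whose first field is bad, and lets the END action run
       before awk terminates with exit status 3.  The input and the
       status value are invented for the example.

```shell
status=0
out=$(printf '# header\nok\nbad\nok\n' | awk '
/^#/         { next }       # skip comment records entirely
$1 == "bad"  { exit(3) }    # go to the END action; exit status will be 3
             { good++ }
END          { print good " good record(s) before stopping at record " NR }
') || status=$?
echo "$out (exit status $status)"
```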

  Output
       The print statement prints its arguments with only simple format-
       ting.  If it has no arguments, it prints the current input record
       in its entirety.  awk adds the output record separator ORS to the
       end of the output that each print statement produces; when commas
       separate arguments in the print statement, the output field sepa-
       rator OFS separates the corresponding output values.  ORS and OFS
       are built-in variables, the values of which you can change by
       assigning them strings.  The default output record separator is a
       newline and the default output field separator is a space.

       The variable OFMT gives the format of floating point numbers out-
       put by print. By default, the value is %.6g; you can change this
       by assigning OFMT a different string value.  OFMT only applies to
       floating point numbers (ones with fractional parts).
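
       As a brief sketch of these variables, the following command line
       changes OFS and OFMT before printing.  The values chosen are
       arbitrary.  Note that the integer 42 is not affected by OFMT.

```shell
out=$(awk 'BEGIN {
    OFS = "-"           # placed between comma-separated arguments
    OFMT = "%.2f"       # applied to numbers with fractional parts
    print "a", "b", 3.14159, 42
}')
echo "$out"
```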

       The printf statement formats its arguments using the fmt argu-
       ment.  Formatting is the same as for the built-in function
       sprintf. Unlike print, printf does not add output separators
       automatically.  This gives the program more precise control of
       the output.

       The print and printf statements write to the standard output.
       You can redirect output to a file or pipe as described later.

       If you add >expr to a print or printf statement, awk treats the
       string value of expr as a file name, and writes output to that
       file.  Similarly, if you add >>expr, awk appends output to the
       current contents of the file.  The distinction between > and >>
       is only important for the first print to the file expr. Subse-
       quent outputs to an already open file append to what is there
       already.
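
       The following sketch shows both forms of file redirection.  It
       assumes that the mktemp utility is available to create a scratch
       file; the awk variable name dest is hypothetical.

```shell
tmp=$(mktemp)
printf 'b\na\n' | awk -v dest="$tmp" '
{ print $0 > dest }      # first print empties the file, later ones append
END {
    close(dest)
    print "c" >> dest    # file is reopened, so >> is needed to append
    close(dest)
}'
out=$(cat "$tmp")
rm -f "$tmp"
echo "$out"
```

       Had the END action used > instead of >> after closing the file,
       the two lines written earlier would have been lost.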

       To eliminate ambiguities, statements such as

          print a > b c

       are syntactically illegal.  Use parentheses to resolve the ambi-
       guity.

       If you add |expr to a print or printf statement, awk treats the
       string value of expr as an executable command and runs it with
       the output from the statement piped as input into the command.

       As mentioned earlier, you can have only a limited number of files
       and pipes open at any time.  To avoid going over the limit, use
       the close function to close files and pipes when you no longer
       need them.
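
       As a sketch of piped output, the following command line sends
       every record to a single sort command and closes the pipe in the
       END action.  It assumes that the sort utility is available.
       Closing the pipe makes awk wait for sort to finish, so the sorted
       lines appear before the final message.

```shell
out=$(printf 'pear\napple\nfig\n' | awk '
{ print $0 | "sort" }     # all records go to the same sort process
END {
    close("sort")         # wait for sort to write its output
    print "done"
}')
echo "$out"
```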

       print and printf are also available as functions with the same
       calling sequence, but no redirection.

  EXAMPLES

          awk '{print NR ":" $0}' input1

       outputs the contents of the file input1 with line numbers
       prepended to each line.

       The following is an example using var=value on the command line.

          awk '{print NR SEP $0}' SEP=":" input1

       awk can also read the program script from a file as in the com-
       mand line:

          awk -f addline.awk input1

       which produces the same output when the file addline.awk contains

          {print NR ":" $0}

       The following program appends all input lines starting with
       January to the file jan (which may or may not exist already), and
       all lines starting with February or March to the file febmar:

          /^January/ {print >> "jan"}
          /^February|^March/ {print >> "febmar"}

       This program prints the total and average for the last column of
       each input line:

               {s += $NF}
          END  {print "sum is", s, "average is", s/NR}

       The next program interchanges the first and second fields of
       input lines:

          {
               tmp = $1
               $1 = $2
               $2 = tmp
               print
          }

       The following prefixes each line with its line number, left-
       justified in a six-character field:

          {printf "%-6d: %s\n", NR, $0}

       The following prints input records in reverse order (assuming
       sufficient memory):

          {
               a[NR] = $0 # index using record number
          }
          END {
               for (i = NR; i>0; --i)
                    print a[i]
          }

       The next program determines the number of lines starting with the
       same first field:

          {
               ++a[$1] # array indexed using the first field
          }
          END {     # note output will be in undefined order
               for (i in a)
                    print a[i], "lines start with", i
          }

       The following program can be used to determine the number of
       lines in each input file:

          {
               ++a[FILENAME]
          }
          END {
               for (file in a)
                    if (a[file] == 1)
                         print file, "has 1 line"
                    else
                         print file, "has", a[file], "lines"
          }

       The following program illustrates how you can use a two dimen-
       sional array in awk.  Assume the first field of each input record
       contains a product number, the second field contains a month num-
       ber, and the third field contains a quantity (bought, sold, or
       whatever).  The program generates a table of products versus
       month.

          BEGIN     {NUMPROD = 5}
          {
               array[$1,$2] += $3
          }
          END  {
               print "\t Jan\t Feb\tMarch\tApril\t May\t" \
                   "June\tJuly\t Aug\tSept\t Oct\t Nov\t Dec"
               for (prod = 1; prod <= NUMPROD; prod++) {
                    printf "%-7s", "prod#" prod
                    for (month = 1; month <= 12; month++){
                         printf "\t%5d", array[prod,month]
                    }
                    printf "\n"
               }
          }

       As the following program reads in each line of input, it reports
       whether the line matches a pre-determined value:

          function randint() {
               return (int((rand()+1)*10))
          }
          BEGIN     {
               prize[randint(),randint()] = "$100";
               prize[randint(),randint()] = "$10";
               prize[1,1] = "the booby prize"
               }
          {
               if (($1,$2) in prize)
                    printf "You have won %s!\n", prize[$1,$2]
          }

       The following example prints lines, the first and last fields of
       which are the same, reversing the order of the fields:

          $1==$NF {
               for (i = NF; i > 0; --i)
                    printf "%s", $i (i>1 ? OFS : ORS)
          }

       The following program prints the names of the input files given
       on the command line.  The infiles function first empties the
       passed array, and then fills it with every argument that is not a
       var=value assignment.  Notice that the extra parameter i of
       infiles is a local variable.

          function infiles(f,i) {
               for (i in f)
                    delete f[i]
               for (i = 1; i < ARGC; i++)
                    if (index(ARGV[i],"=") == 0)
                         f[i] = ARGV[i]
          }
          BEGIN     {
               infiles(a)
               for (i in a)
                    print a[i]
               exit
          }

       Here is the standard recursive factorial function:

          function fact(num) {
               if (num <= 1)
                    return 1
               else
                    return num * fact(num - 1)
          }
          { print $0 " factorial is " fact($0) }

       The following program illustrates the use of getline with a pipe.
       Here, getline sets the current record from the output of the wc
       command.  The program prints the number of words in each input
       file.

          function words(file,   string) {
               string = "wc " file   # command to run
               string | getline      # sets $0 to the output of wc
               close(string)
               return ($2)           # second field is the word count
          }
          BEGIN     {
               for (i=1; i<ARGC; i++) {
                    fn = ARGV[i]
                    printf "There are %d words in %s.\n",
                        words(fn), fn
               }
          }

  ENVIRONMENT VARIABLES

       PATH contains a list of directories that awk searches when look-
            ing for commands run by system(expr), or input and output
            pipes.

       Any other environment variable may be accessed by the awk program
       itself.
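
       A minimal sketch of reading an environment variable through the
       ENVIRON array (see the PORTABILITY section) follows.  The vari-
       able name GREETING is hypothetical and is set only for the dura-
       tion of the command.

```shell
out=$(GREETING=hello awk 'BEGIN { print ENVIRON["GREETING"] }')
echo "$out"
```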

  DIAGNOSTICS
       Possible exit status values are:

       0  Successful completion.

       1  An error occurred.

       When an awk program terminates because of a call to exit(), the
       exit status is the value passed to exit().

  Messages

       Message:  array "name" cannot be used as a scalar
       Cause:    You attempted to use name as a scalar when it has
                 already been used as an array. A variable can be used
                 as an array or a scalar but not as both.
       Action:   Make sure that you use name as either a scalar or an
                 array but not as both.

       Message:  attempt to redefine builtin function
       Cause:    You attempted to redefine one of the built-in awk func-
                 tions.
       Action:   Choose a name for the function you are defining that is
                 not the same as any of the built-in functions.  See the
                 DESCRIPTION section of this man page for lists of
                 built-in arithmetic and string functions.

       Message:  cannot assign to function "funcname"
       Cause:    "funcname" is defined to be a function in your script
                 and cannot be used as a variable.
       Action:   Use a different name for the variable.

       Message:  cannot open input file "filename"
       Cause:    awk was unable to open one of the files named on the
                 command line.
       Action:   Check that the file exists, was named properly and that
                 you have the appropriate permissions.

       Message:  cannot open script file "filename"
       Cause:    awk was unable to open one of the script files speci-
                 fied with the -f option.
       Action:   Check that the file exists, was named properly and that
                 you have the appropriate permissions.

       Message:  division (/ or %) by zero
       Cause:    An arithmetic operation using / or % resulted in an
                 attempt to divide by zero.
       Action:   Modify your program so that division by zero does not
                 occur.

       Message:  EOF in regular expression
       Cause:    awk encountered the end-of-file character while reading
                 a regular expression from the script file.
       Action:   Check for missing / delimiters at the end of regular
                 expressions.

       Message:  EOF in string
       Cause:    awk encountered the end-of-file character while reading
                 a string constant from the script file.
       Action:   Check for missing " delimiters at the end of string
                 constants.

       Message:  error in function funcname(arg) at NR=num
       Cause:    A math error occurred while performing the function
                 funcname on argument arg.
       Action:   Make sure that you are passing a proper argument to the
                 function funcname.

       Message:  function "funcname" nesting level > number
       Cause:    There have been too many nested or recursive function
                 calls. awk allows a maximum of number levels.
       Action:   Make sure that nested and recursive function calls do
                 not exceed number levels of nesting.

       Message:  function "funcname" redefined
       Cause:    You attempted to redefine an existing function.
       Action:   Choose a new name for your function that does not con-
                 flict with any other function name.

       Message:  inadmissible use of reserved keyword
       Cause:    You attempted to use a reserved word in an unacceptable
                 way such as a function or variable name.
       Action:   Choose a different name for your function or variable.

       Message:  insufficient arguments to printf or sprintf
       Cause:    You did not specify enough arguments to match the num-
                 ber required by the specified format string.
       Action:   Check your format string and number of arguments.

       Message:  insufficient memory for string storage
       Cause:    There were not enough free system resources for awk to
                 use for string storage.
       Action:   Free up more system resources, or modify your awk pro-
                 gram to require less string storage.

       Message:  invalid character "char" (hex hexnum)
       Cause:    awk encountered the invalid character char while pro-
                 cessing the input file.
       Action:   Check the input file for invalid characters.

       Message:  lvalue required in assignment
       Cause:    You did not specify a variable or array element as the
                 left-hand side of an assignment expression.
       Action:   Make sure that you specify a valid variable or array
                 index on the left side of an assignment operator.

       Message:  may delete only array element or array
       Cause:    You attempted to use the delete statement to delete a
                 scalar variable.
       Action:   Only use delete to delete arrays and array elements.

       Message:  Missing field separator
       Cause:    You specified the -F option but did not follow it with
                 a field separator.
       Action:   Provide a field separator following the -F option.

       Message:  Missing script file
       Cause:    You specified the -f option but did not follow it with
                 the name of a script file.
       Action:   Provide the name of a script file following the -f
                 option.

       Message:  Missing variable assignment
       Cause:    You specified the -v option but did not follow it with
                 a variable assignment.
       Action:   Provide a variable assignment following the -v option.

       Message:  Newline in regular expression
       Cause:    awk encountered a newline while reading a regular
                 expression.
       Action:   Check for a missing / delimiter.

       Message:  Newline in string
       Cause:    awk encountered a newline while reading a string con-
                 stant.
       Action:   Check for a missing " delimiter.

       Message:  panic: sprintf() string longer than number characters
       Cause:    The maximum length of a string created by sprintf() is
                 limited to number characters.
       Action:   Try processing the string in a different way.

       Message:  Record too long (LIMIT: number bytes)
       Cause:    awk read a record that was longer than the maximum
                 record size it can handle.  On UNIX and POSIX-compliant
                 systems, the maximum record length is 20000 characters.
       Action:   Edit the offending record so that it does not exceed
                 the limit.

       Message:  regular expression error
       Cause:    An error occurred while processing a regular expres-
                 sion.
       Action:   Check the regular expression.

       Message:  return outside of a function
       Cause:    awk encountered a return statement that is not part of
                 a function.
       Action:   Only use the return statement inside a function defini-
                 tion.

       Message:  scalar "name" cannot be used as array
       Cause:    You attempted to use name as an array variable when it
                 has already been used as a scalar.
       Action:   Make sure that you use a variable as either an array
                 or a scalar, but not as both.

       Message:  second parameter to "split" must be an array
       Cause:    You invoked the split function but the second parameter
                 was not an array.
       Action:   Ensure that split is invoked with an array as the sec-
                 ond parameter.

       Message:  strcoll error, cannot malloc space.
       Cause:    There are not enough free system resources to allocate
                 string space.
       Action:   Free up more resources.

       Message:  SYMTAB must have exactly one index
       Cause:    You tried to reference the SYMTAB array using more than
                 one index.
       Action:   Always reference SYMTAB with exactly one index.

       Message:  syntax error "regular expression error" in /line/
       Cause:    See regerror(3).
       Action:   See regerror(3).

       Message:  too deeply nested for in loop (LIMIT: number)
       Cause:    for (i in array) loops can only be nested number levels
                 deep.
       Action:   Re-write the script to use fewer levels.

       Message:  Too many fields (LIMIT: number)
       Cause:    awk read a record with more fields than it was able to
                 handle.
       Action:   Edit the input file to decrease the number of fields in
                 the offending record.

       Message:  too many open streams to funcname onto "filename"
       Cause:    awk can only have a limited number of files open at one
                 time.  There were too many open files.
       Action:   Make sure that unused files are being closed properly.
                 If this doesn't fix the problem, restructure your pro-
                 gram.

       Message:  unbalanced char
       Cause:    An unbalanced number of parentheses or braces was
                 encountered.
       Action:   Make sure that all braces and parentheses are matched
                 up.

       Message:  Unknown option "-option"
       Cause:    You specified an option that is not valid for awk.
       Action:   Check the DESCRIPTION section of this man page for a
                 list of valid awk options.

       Message:  unredirected getline in END action
       Cause:    The default input stream has already been closed by the
                 time that the END action is performed so a getline
                 which has not been redirected will fail.
       Action:   Redirect getline to read from a named file.

       Message:  variable "name" cannot be used as a function
       Cause:    You attempted to use the variable name as a function
                 when it has not explicitly been defined as one, or when
                 it has not been defined at all.
       Action:   Replace the offending variable name with the name of a
                 function or define a function with that name.

       Message:  wrong number of arguments to function "funcname"
       Cause:    You attempted to invoke the function funcname with the
                 wrong number of arguments.
       Action:   Specify the correct number of arguments for funcname.

  LIMITS
       Most constructions in this implementation of awk are dynamic,
       limited only by memory restrictions of the target machine.  The
       parser stack depth is limited to 150 levels.  Attempting to pro-
       cess extremely complicated programs may result in an overflow of
       this stack, causing an error.

  PORTABILITY
       POSIX.2.  X/Open Portability Guide 4.0.  All UNIX systems.

       The ord function is an extension to traditional implementations
       of awk.  The toupper and tolower functions and the ENVIRON array
       are in POSIX and the UNIX System V Release 4 version of awk.
       This version is a superset of New AWK as described in The AWK
       Programming Language by Aho, Weinberger, and Kernighan.

       The shell that the system function uses and that awk uses to run
       pipelines for getline, print and printf is system dependent.  On
       the MPE/iX system, this is always the MPE/iX Shell.

  MPE/iX NOTES
       For information on how the current MPE/iX implementation may
       affect the operation of this utility, see Appendix A, MPE/iX
       Implementation Considerations.

  SEE ALSO
       ed(1), egrep(1), sed(1), vi(1), ascii(3), regexp(3)
