Awk Tutorial
AWK
Audience
This tutorial will be useful for software developers, system administrators, or any
enthusiastic reader who wants to learn text processing and data extraction in a
Unix-like environment.
Prerequisites
You must have a basic understanding of the GNU/Linux operating system and shell
scripting.
Table of Contents
About the Tutorial
Audience
Prerequisites
Copyright & Disclaimer
Table of Contents
1. OVERVIEW
  Types of AWK
  Typical Uses of AWK
2. AWK ENVIRONMENT
  Installation Using Package Manager
  Installation from Source Code
7. OPERATORS
  Arithmetic Operators
  Increment and Decrement Operators
  Assignment Operators
  Relational Operators
  Logical Operators
  Ternary Operator
  Unary Operators
  Exponential Operators
  String Concatenation Operator
  Array Membership Operator
  Regular Expression Operators
9. ARRAYS
  Creating Array
  Deleting Array Elements
  Multi-Dimensional Arrays
1. OVERVIEW
Types of AWK
The following are the variants of AWK:
AWK - Original AWK from AT&T Laboratory.
NAWK - Newer and improved version of AWK from AT&T Laboratory.
GAWK - It is GNU AWK. All GNU/Linux distributions ship GAWK. It is fully compatible
with AWK and NAWK.
Typical Uses of AWK
Text processing
2. AWK ENVIRONMENT
This chapter describes how to set up the AWK environment on your GNU/Linux
system.
3. AWK WORKFLOW
To become an expert AWK programmer, you need to know its internals. AWK follows
a simple workflow: Read, Execute, and Repeat. The three steps are described
below:
Read
AWK reads a line from the input stream (file, pipe, or stdin) and stores it in memory.
Execute
All AWK commands are applied sequentially on the input. By default, AWK executes
commands on every line. We can restrict this by providing patterns.
Repeat
This process repeats until the file reaches its end.
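The Read, Execute, and Repeat cycle can be observed with a small sketch. The three input words are made up for illustration; NR is AWK's built-in record counter, so each output line shows one pass through the cycle.

```shell
# Feed three lines to AWK on stdin; the body runs once for each line read.
printf 'alpha\nbeta\ngamma\n' | awk '{ print "record " NR ": " $0 }'
```

Each input line should produce exactly one output line, and the program stops when the input ends.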
Program Structure
Let us now understand the program structure of AWK.
BEGIN Block
The syntax of the BEGIN block is as follows:
BEGIN {awk-commands}
The BEGIN block gets executed at program start-up. It executes only once. This is
a good place to initialize variables. BEGIN is an AWK keyword and hence it must be in
upper-case. Please note that this block is optional.
Body Block
The syntax of the body block is as follows:
/pattern/ {awk-commands}
The body block applies AWK commands to every input line. We can restrict this to
selected lines by providing patterns. Note that there are no keywords for the
Body block.
END Block
The syntax of the END block is as follows:
END {awk-commands}
The END block executes at the end of the program. END is an AWK keyword and hence
it must be in upper-case. Please note that this block is optional.
Example
Let us create a file marks.txt which contains the serial number, name of the student,
subject name, and number of marks obtained.
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
Let us now display the file contents with a header by using an AWK script.
               Sub        Marks
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
At the start, AWK prints the header from the BEGIN block. Then in the body block, it
reads a line from the file and executes AWK's print command, which just prints the
contents on the standard output stream. This process repeats until the file reaches
the end.
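A script of the kind described above can be sketched as follows. The tab-separated layout of marks.txt and the header labels "Sr No" and "Name" are assumptions made for this sketch; only "Sub" and "Marks" appear in the output shown here.

```shell
# Recreate marks.txt with tab-separated fields (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# BEGIN prints the header once; the pattern-less body prints every record.
awk 'BEGIN { printf "Sr No\tName\tSub\tMarks\n" } { print }' marks.txt
```

The BEGIN block runs before any input is read, so the header appears exactly once, above the five records.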
4. BASIC SYNTAX
AWK is simple to use. We can provide AWK commands either directly from the
command line or in the form of a text file containing AWK commands.
Example
Consider a text file marks.txt with the following content:
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
Let us display the complete content of the file using AWK as follows:
[jerry]$ awk '{print}' marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
First, create a text file command.awk containing the AWK command as shown below:
{print}
Now we can instruct AWK to read commands from the text file and perform the
action. Here, we achieve the same result as shown in the above example.
[jerry]$ awk -f command.awk marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
The -v Option
This option assigns a value to a variable. It allows assignment before the program
execution. The following example describes the usage of the -v option.
[jerry]$ awk -v name=Jerry 'BEGIN{printf "Name = %s\n", name}'
On executing this code, you get the following result:
Name = Jerry
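The -v option is also handy for passing a value into a rule's pattern. The sketch below assumes a tab-separated marks.txt and uses a hypothetical variable name, min, as a threshold on the fourth field.

```shell
# Recreate marks.txt with tab-separated fields (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# min is set before the program starts, so it is usable in the pattern.
awk -v min=88 '$4 >= min { print $2, $4 }' marks.txt
```

Only the students whose marks are at least 88 should be listed.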
-f progfile                --file=progfile
-F fs                      --field-separator=fs
-v var=val                 --assign=var=val
Short options:
-b                         --characters-as-bytes
-c                         --traditional
-C                         --copyright
-d[file]                   --dump-variables[=file]
-e 'program-text'          --source='program-text'
-E file                    --exec=file
-g                         --gen-pot
-h                         --help
-L [fatal]                 --lint[=fatal]
-n                         --non-decimal-data
-N                         --use-lc-numeric
-O                         --optimize
-p[file]                   --profile[=file]
-P                         --posix
-r                         --re-interval
-S                         --sandbox
-t                         --lint-old
-V                         --version
# BEGIN block(s)
BEGIN {
    printf "---|Header|--\n"
}
# Rule(s)
{
    print $0
}
# END block(s)
END {
    printf "---|Footer|---\n"
}
5. BASIC EXAMPLES
This chapter describes several useful AWK commands and their appropriate examples.
Consider a text file marks.txt to be processed with the following content:
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
Physics    80
Maths      90
Biology    87
English    85
History    89
In the file marks.txt, the third column contains the subject name and the
fourth column contains the marks obtained in a particular subject. Let us print these
two columns using AWK print command. In the above example, $3 and $4 represent
the third and the fourth fields respectively from the input record.
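Printing the third and fourth fields can be sketched as below, assuming marks.txt is tab-separated. The same print statement appears later in this tutorial in the description of $n.

```shell
# Recreate marks.txt with tab-separated fields (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# $3 is the subject and $4 the marks; "\t" puts a tab between them.
awk '{ print $3 "\t" $4 }' marks.txt
```

Swapping $3 and $4 in the print statement would print the marks first.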
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
In the above example, we are searching for the pattern a. When a pattern match
succeeds, AWK executes the command from the body block. In the absence of a body
block, the default action is taken, which is to print the record. Hence, the
following command produces the same result:
[jerry]$ awk '/a/' marks.txt
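The equivalence of the explicit and the default action can be checked with a short sketch. It assumes a tab-separated marks.txt; the two .out file names are hypothetical scratch files.

```shell
# Recreate marks.txt with tab-separated fields (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# Same pattern, once with an explicit print and once with the default action.
awk '/a/ { print $0 }' marks.txt > explicit.out
awk '/a/' marks.txt > implicit.out

# cmp -s is silent and succeeds only when both files are identical.
cmp -s explicit.out implicit.out && echo "same output"
```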
90
Biology    87
English    85
History    89
Maths
87    Biology
85    English
89    History
3)    Shyam    Biology    87
4)    Kedar    English    85
AWK provides a built-in length function that returns the length of a
string. The $0 variable stores the entire line and, in the absence of a body block,
the default action is taken, i.e., the print action. Hence, if a line has more than
18 characters, the comparison evaluates to true and the line gets printed.
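A minimal sketch of this length test follows. It assumes that the fields of marks.txt are separated by single tabs; with a different separator the line lengths, and therefore the selected lines, may differ.

```shell
# Recreate marks.txt with single-tab separators (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# No body block: records longer than 18 characters are printed by default.
awk 'length($0) > 18' marks.txt
```

Under that assumption only the Shyam and Kedar records are 19 characters long, so only those two lines are printed.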
6. BUILT-IN VARIABLES
AWK provides several built-in variables. They play an important role while writing AWK
scripts. This chapter demonstrates the usage of built-in variables.
ARGC
It holds the number of arguments provided at the command line.
[jerry]$ awk 'BEGIN {print "Arguments =", ARGC}' One Two Three Four
On executing this code, you get the following result:
Arguments = 5
But why does AWK show 5 when you passed only 4 arguments? The following
example clears this up.
ARGV
It is an array that stores the command-line arguments. The array's valid index ranges
from 0 to ARGC-1.
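[jerry]$ awk 'BEGIN { for (i = 0; i < ARGC; ++i) { printf "ARGV[%d] = %s\n", i, ARGV[i] } }' one two three four
On executing this code, you get the following result:
ARGV[0] = awk
ARGV[1] = one
ARGV[2] = two
ARGV[3] = three
ARGV[4] = four
ARGV[0] holds the program name, which is why ARGC is one more than the number of
arguments you passed.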
CONVFMT
It represents the conversion format for numbers. Its default value is %.6g.
[jerry]$ awk 'BEGIN { print "Conversion Format =", CONVFMT }'
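The effect of CONVFMT shows up when a non-integer number is converted to a string, for example by concatenating it with an empty string. The variable names below are hypothetical; this is a sketch rather than a complete account of AWK's conversion rules.

```shell
# CONVFMT applies to non-integer values only; integers convert as integers.
awk 'BEGIN {
    CONVFMT = "%.2f"
    x = 3.14159
    s = x ""        # number-to-string conversion uses CONVFMT
    n = 42 ""       # integral value, unaffected by CONVFMT
    print s, n
}'
```

With CONVFMT set to %.2f, the string s should hold 3.14 while n stays 42.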
ENVIRON
It is an associative array of environment variables.
[jerry]$ awk 'BEGIN { print ENVIRON["USER"] }'
On executing this code, you get the following result:
jerry
To find the names of other environment variables, use the env command.
FILENAME
It represents the current file name.
[jerry]$ awk 'END {print FILENAME}' marks.txt
On executing this code, you get the following result:
marks.txt
Please note that FILENAME is undefined in the BEGIN block.
FS
It represents the (input) field separator and its default value is a single space,
which makes AWK split records on runs of spaces and tabs. You can also change this
by using the -F command line option.
[jerry]$ awk 'BEGIN {print "FS = " FS}' | cat -vte
On executing this code, you get the following result:
FS =  $
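Changing the separator can be sketched as follows. The colon-separated input lines are invented for illustration; FS is set in a BEGIN block and, equivalently, through the -F option.

```shell
# Set FS in BEGIN so it is in effect before the first record is split.
printf 'jerry:x:1000\nroot:x:0\n' | awk 'BEGIN { FS = ":" } { print $1, $3 }'

# The -F option has the same effect for this input.
printf 'jerry:x:1000\nroot:x:0\n' | awk -F: '{ print $1, $3 }'
```

Both commands should print the first and third colon-separated fields of each line.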
NF
It represents the number of fields in the current record. For instance, the following
example prints only those lines that contain more than two fields.
[jerry]$ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NF > 2'
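NF can also be used as a field index: $NF refers to the last field of the record. A small sketch with the same three input lines, using printf instead of echo -e for portability:

```shell
# Print the field count followed by the last field of each record.
printf 'One Two\nOne Two Three\nOne Two Three Four\n' | awk '{ print NF, $NF }'
```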
NR
It represents the number of the current record. For instance, the following example
prints a record only if its record number is less than three, i.e., the first two records.
[jerry]$ echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NR < 3'
FNR
It is similar to NR, but relative to the current file. It is useful when AWK is operating
on multiple files. Value of FNR resets with new file.
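The difference between NR and FNR becomes visible with two input files. The file names first.txt and second.txt and their contents are hypothetical.

```shell
# Create two small input files.
printf 'a\nb\n' > first.txt
printf 'c\nd\ne\n' > second.txt

# NR keeps counting across files; FNR restarts at 1 for each file.
awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' first.txt second.txt
```

For second.txt, NR should continue at 3 while FNR starts again at 1.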
OFMT
It represents the output format for numbers and its default value is %.6g.
[jerry]$ awk 'BEGIN {print "OFMT = " OFMT}'
On executing this code, you get the following result:
OFMT = %.6g
OFS
It represents the output field separator and its default value is a single space.
[jerry]$ awk 'BEGIN {print "OFS = " OFS}' | cat -vte
On executing this code, you get the following result:
OFS =  $
ORS
It represents the output record separator and its default value is newline.
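OFS and ORS can be changed together to reshape the output. In the sketch below the input and both separator values are chosen purely for illustration.

```shell
# OFS is inserted between the comma-separated arguments of print;
# ORS is appended after every print statement.
printf 'a b\nc d\n' | awk 'BEGIN { OFS = "-"; ORS = ";\n" } { print $1, $2 }'
```

Each record should come out with a hyphen between its fields and a semicolon before the line break.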
RLENGTH
It represents the length of the string matched by the match function. AWK's match
function searches for a given string in the input string.
[jerry]$ awk 'BEGIN { if (match("One Two Three", "re")) { print RLENGTH } }'
RS
It represents (input) record separator and its default value is newline.
[jerry]$ awk 'BEGIN {print "RS = " RS}' | cat -vte
On executing this code, you get the following result:
RS = $
$
RSTART
It represents the first position in the string matched by the match function.
[jerry]$ awk 'BEGIN { if (match("One Two Three", "Thre")) { print RSTART } }'
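RSTART and RLENGTH are typically used together to cut the matched text out of a string with substr. A minimal sketch, using a hypothetical variable s:

```shell
# match sets RSTART (1-based position) and RLENGTH (length of the match).
awk 'BEGIN {
    s = "One Two Three"
    if (match(s, /Thre/))
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
}'
```

The match begins at the ninth character and is four characters long, so the command should print 9 4 Thre.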
SUBSEP
It represents the separator character for array subscripts and its default value
is \034.
[jerry]$ awk 'BEGIN { print "SUBSEP = " SUBSEP }' | cat -vte
$0
It represents the entire input record.
[jerry]$ awk '{print $0}' marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
2)    Rahul    Maths      90
3)    Shyam    Biology    87
4)    Kedar    English    85
5)    Hari     History    89
$n
It represents the nth field in the current record where the fields are separated by FS.
[jerry]$ awk '{print $3 "\t" $4}' marks.txt
On executing this code, you get the following result:
Physics    80
Maths      90
Biology    87
English    85
History    89
ARGIND
It represents the index in ARGV of the current file being processed.
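[jerry]$ awk '{ print "ARGIND   = ", ARGIND; print "Filename = ", ARGV[ARGIND] }' junk1 junk2 junk3
On executing this code (with a single line in each of the three files), you get the
following result:
ARGIND   =  1
Filename =  junk1
ARGIND   =  2
Filename =  junk2
ARGIND   =  3
Filename =  junk3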
BINMODE
It is used to specify binary mode for all file I/O on non-POSIX systems. Numeric values
of 1, 2, or 3 specify that input files, output files, or all files, respectively, should use
binary I/O. String values of r or w specify that input files or output files, respectively,
should use binary I/O. String values of rw or wr specify that all files should use binary
I/O.
ERRNO
It is a string that describes the error when a redirection fails for getline or when a close call fails.
[jerry]$ awk 'BEGIN { ret = getline < "junk.txt"; if (ret == -1) print "Error:", ERRNO }'
On executing this code, you get the following result:
Error: No such file or directory
FIELDWIDTHS
It holds a space-separated list of field widths. When this variable is set, GAWK parses
the input into fields of fixed width instead of using the value of the FS variable as the
field separator.
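Fixed-width parsing can be sketched as follows. This assumes GAWK is installed under the name gawk, since FIELDWIDTHS is a GAWK extension; the eight-digit date string is invented for illustration.

```shell
# Split an 8-character record into fields of width 4, 2, and 2.
printf '20240115\n' | gawk 'BEGIN { FIELDWIDTHS = "4 2 2" } { print $1 "-" $2 "-" $3 }'
```

Under GAWK the record should be split into 2024, 01, and 15 without any separator character in the input.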
IGNORECASE
When this variable is set, GAWK becomes case-insensitive. The following example
demonstrates this:
[jerry]$ awk 'BEGIN{IGNORECASE=1} /amit/' marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
LINT
It provides dynamic control of the --lint option from the GAWK program. When this
variable is set, GAWK prints lint warnings. When assigned the string value fatal, lint
warnings become fatal errors, exactly like --lint=fatal.
[jerry]$ awk 'BEGIN {LINT=1; a}'
On executing this code, you get the following result:
awk: cmd. line:1: warning: reference to uninitialized variable `a'
awk: cmd. line:1: warning: statement has no effect
PROCINFO
This is an associative array containing information about the process, such as real and
effective UID numbers, process ID number, and so on.
[jerry]$ awk 'BEGIN { print PROCINFO["pid"] }'
On executing this code, you get the following result:
4316
TEXTDOMAIN
It represents the text domain of the AWK program. It is used to find the localized
translations for the program's strings.
[jerry]$ awk 'BEGIN { print TEXTDOMAIN }'
On executing this code, you get the following result:
messages
The above output shows English text due to en_IN locale.
7. OPERATORS
Like other programming languages, AWK also provides a large set of operators. This
chapter explains AWK operators with suitable examples.
Arithmetic Operators
AWK supports the following arithmetic operators:
Addition
It is represented by the plus (+) symbol and adds two numbers. The following
example demonstrates this:
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a + b) = ", (a + b) }'
On executing this code, you get the following result:
(a + b) =  70
Subtraction
It is represented by the minus (-) symbol and subtracts the second number from the
first. The following example demonstrates this:
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a - b) = ", (a - b) }'
On executing this code, you get the following result:
(a - b) =  30
Multiplication
It is represented by the asterisk (*) symbol and multiplies two numbers. The
following example demonstrates this:
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a * b) = ", (a * b) }'
On executing this code, you get the following result:
(a * b) =  1000
Division
It is represented by the slash (/) symbol and divides the first number by the second.
The following example illustrates this:
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a / b) = ", (a / b) }'
On executing this code, you get the following result:
(a / b) =  2.5
Modulus
It is represented by the percent (%) symbol and returns the remainder of dividing the
first number by the second. The following example illustrates this:
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a % b) = ", (a % b) }'
On executing this code, you get the following result:
(a % b) =  10
Pre-Increment
It is represented by ++. It increments the value of an operand by 1. This operator
first increments the value of the operand, then returns the incremented value. For
instance, in the following example, this operator sets the value of both the operands,
a and b, to 11.
[jerry]$ awk 'BEGIN { a = 10; b = ++a; printf "a = %d, b = %d\n", a, b }'
On executing this code, you get the following result:
a = 11, b = 11
Pre-Decrement
It is represented by --. It decrements the value of an operand by 1. This operator first
decrements the value of the operand, then returns the decremented value. For
instance, in the following example, this operator sets the value of both the operands,
a and b, to 9.
[jerry]$ awk 'BEGIN { a = 10; b = --a; printf "a = %d, b = %d\n", a, b }'
On executing the above code, you get the following result:
a = 9, b = 9
Post-Increment
It is represented by ++. It increments the value of an operand by 1. This operator
first returns the value of the operand, then it increments its value. For instance, the
following code sets the value of operand a to 11 and b to 10.
[jerry]$ awk 'BEGIN { a = 10; b = a++; printf "a = %d, b = %d\n", a, b }'
On executing this code, you get the following result:
a = 11, b = 10
Post-Decrement
It is represented by --. It decrements the value of an operand by 1. This operator first
returns the value of the operand, then it decrements its value. For instance, the
following code sets the value of the operand a to 9 and b to 10.
[jerry]$ awk 'BEGIN { a = 10; b = a--; printf "a = %d, b = %d\n", a, b }'
On executing this code, you get the following result:
a = 9, b = 10
Assignment Operators
AWK supports the following assignment operators:
Simple Assignment
It is represented by =. The following example demonstrates this:
[jerry]$ awk 'BEGIN { name = "Jerry"; print "My name is", name }'
On executing this code, you get the following result:
My name is Jerry
Shorthand Addition
It is represented by +=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=10; cnt += 10; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 20
In the above example, the first statement assigns value 10 to the variable cnt. In the
next statement, the shorthand operator increments its value by 10.
Shorthand Subtraction
It is represented by -=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=100; cnt -= 10; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 90
In the above example, the first statement assigns value 100 to the variable cnt. In
the next statement, the shorthand operator decrements its value by 10.
Shorthand Multiplication
It is represented by *=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=10; cnt *= 10; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 100
In the above example, the first statement assigns value 10 to the variable cnt. In the
next statement, the shorthand operator multiplies its value by 10.
Shorthand Division
It is represented by /=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=100; cnt /= 5; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 20
In the above example, the first statement assigns value 100 to the variable cnt. In
the next statement, the shorthand operator divides it by 5.
Shorthand Modulo
It is represented by %=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=100; cnt %= 8; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 4
Shorthand Exponential
It is represented by ^=. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=2; cnt ^= 4; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 16
The above example raises the value of cnt to the power of 4.
Shorthand Exponential
It is represented by **=. This form is an extension supported by GAWK rather than
part of the POSIX AWK specification. The following example demonstrates this:
[jerry]$ awk 'BEGIN { cnt=2; cnt **= 4; print "Counter =", cnt }'
On executing this code, you get the following result:
Counter = 16
This example also raises the value of cnt to the power of 4.
Relational Operators
AWK supports the following relational operators:
Equal to
It is represented by ==. It returns true if both operands are equal, otherwise it returns
false. The following example demonstrates this:
[jerry]$ awk 'BEGIN { a = 10; b = 10; if (a == b) print "a == b" }'
Not Equal to
It is represented by !=. It returns true if both operands are unequal, otherwise it
returns false.
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a != b) print "a != b" }'
On executing this code, you get the following result:
a != b
Less Than
It is represented by <. It returns true if the left-side operand is less than the right-side operand; otherwise it returns false.
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a < b) print "a < b" }'
On executing this code, you get the following result:
a < b
Greater Than
It is represented by >. It returns true if the left-side operand is greater than the right-side operand; otherwise it returns false.
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (b > a ) print "b > a" }'
Logical Operators
AWK supports the following logical operators:
Logical AND
It is represented by &&. Its syntax is as follows:
expr1 && expr2
It evaluates to true if both expr1 and expr2 evaluate to true; otherwise it returns
false. expr2 is evaluated if and only if expr1 evaluates to true. For instance, the
following example checks whether the given single digit number is in octal format or
not.
[jerry]$ awk 'BEGIN {num = 5; if (num >= 0 && num <= 7) printf "%d is in octal format\n", num }'
On executing this code, you get the following result:
5 is in octal format
Logical OR
It is represented by ||. The syntax of Logical OR is:
expr1 || expr2
It evaluates to true if either expr1 or expr2 evaluates to true; otherwise it returns
false. expr2 is evaluated if and only if expr1 evaluates to false. The following example
demonstrates this:
[jerry]$ awk 'BEGIN {ch = "\n"; if (ch == " " || ch == "\t" || ch == "\n")
print "Current character is whitespace." }'
On executing this code, you get the following result:
Current character is whitespace.
Logical NOT
It is represented by the exclamation mark (!). Its syntax is as follows:
! expr1
It returns the logical complement of expr1. If expr1 evaluates to true, it returns 0;
otherwise it returns 1. For instance, the following example checks whether a string is
empty or not.
[jerry]$ awk 'BEGIN { name = ""; if (! length(name)) print "name is empty string." }'
On executing this code, you get the following result:
name is empty string.
Ternary Operator
We can easily implement a conditional expression using the ternary operator. Its
syntax is as follows:
condition expression ? statement1 : statement2
When the condition expression returns true, statement1 gets executed; otherwise
statement2 is executed. For instance, the following example finds the larger of
two given numbers.
[jerry]$ awk 'BEGIN { a = 10; b = 20; max = (a > b) ? a : b; print "Max =", max }'
On executing this code, you get the following result:
Max = 20
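The ternary operator yields a value, so it can also select between two strings. A small sketch with hypothetical variable names:

```shell
# Pick a label based on the remainder of n divided by 2.
awk 'BEGIN { n = 7; kind = (n % 2 == 0) ? "even" : "odd"; print n, "is", kind }'
```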
Unary Operators
AWK supports the following unary operators:
Unary Plus
It is represented by +. It multiplies a single operand by +1.
[jerry]$ awk 'BEGIN { a = -10; a = +a; print "a =", a }'
On executing this code, you get the following result:
a = -10
Unary Minus
It is represented by -. It multiplies a single operand by -1.
[jerry]$ awk 'BEGIN { a = -10; a = -a; print "a =", a }'
On executing this code, you get the following result:
a = 10
Exponential Operators
There are two formats of exponential operators:
Exponential Format 1
It is an exponential operator that raises its left operand to the power of its right
operand. For instance, the following example raises 10 to the power of 2.
[jerry]$ awk 'BEGIN { a = 10; a = a ^ 2; print "a =", a }'
On executing this code, you get the following result:
a = 100
Exponential Format 2
It is an exponential operator that raises its left operand to the power of its right
operand. For instance, the following example raises 10 to the power of 2.
[jerry]$ awk 'BEGIN { a = 10; a = a ** 2; print "a =", a }'
Match
It is represented as ~. It tests whether the left operand matches the regular expression
on the right. For instance, the following example prints the lines that contain the pattern 9.
[jerry]$ awk '$0 ~ 9' marks.txt
On executing this code, you get the following result:
2)    Rahul    Maths      90
5)    Hari     History    89
Not Match
It is represented as !~. It tests whether the left operand does not match the regular
expression on the right. For instance, the following example prints the lines that do not
contain the pattern 9.
[jerry]$ awk '$0 !~ 9' marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
3)    Shyam    Biology    87
4)    Kedar    English    85
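The match operators can also be applied to a single field instead of the whole record. The sketch below assumes a tab-separated marks.txt and selects records whose fourth field starts with the digit 8.

```shell
# Recreate marks.txt with tab-separated fields (assumed layout).
printf '1)\tAmit\tPhysics\t80\n2)\tRahul\tMaths\t90\n3)\tShyam\tBiology\t87\n4)\tKedar\tEnglish\t85\n5)\tHari\tHistory\t89\n' > marks.txt

# Only the marks field is tested, not the whole line.
awk '$4 ~ /^8/ { print $2, $4 }' marks.txt
```

Restricting the match to $4 avoids accidental hits in other fields, such as a digit in the serial number.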
8. REGULAR EXPRESSIONS
Dot
It matches any single character except the end of line character. For instance, the
following example matches fin, fun, fan, etc.
[jerry]$ echo -e "cat\nbat\nfun\nfin\nfan" | awk '/f.n/'
On executing the above code, you get the following result:
fun
fin
fan
Start of Line
It matches the start of line. For instance, the following example prints all the lines
that start with pattern The.
[jerry]$ echo -e "This\nThat\nThere\nTheir\nthese" | awk '/^The/'
On executing this code, you get the following result:
There
Their
End of Line
It matches the end of line. For instance, the following example prints the lines that
end with the letter n.
[jerry]$ echo -e "knife\nknow\nfun\nfin\nfan\nnine" | awk '/n$/'
Exclusive Set
In an exclusive set, the caret negates the set of characters in the square brackets. For
instance, the following example prints only Ball.
[jerry]$ echo -e "Call\nTall\nBall" | awk '/[^CT]all/'
On executing this code, you get the following result:
Ball
Alternation
A vertical bar allows regular expressions to be logically ORed. For instance, the
following example prints Ball and Call.
[jerry]$ echo -e "Call\nTall\nBall\nSmall\nShall" | awk '/Call|Ball/'
On executing this code, you get the following result:
Call
Ball
| awk '/2+/'
Grouping
Parentheses () are used for grouping and the character | is used for alternatives.
For instance, the following regular expression matches the lines containing
either Apple Juice or Apple Cake.
[jerry]$ echo -e "Apple Juice\nApple Pie\nApple Tart\nApple Cake" | awk '/Apple (Juice|Cake)/'
On executing this code, you get the following result:
Apple Juice
Apple Cake
9. ARRAYS
AWK has associative arrays, and one of the best things about them is that the indexes
need not be a continuous set of numbers; you can use either a string or a number as an
array index. Also, there is no need to declare the size of an array in advance; arrays
can expand or shrink at runtime.
Its syntax is as follows:
array_name[index]=value
Where array_name is the name of the array, index is the array index, and value is the
value assigned to that element of the array.
Creating Array
To gain more insight on array, let us create and access the elements of an array.
[jerry]$ awk 'BEGIN {
    fruits["mango"]="yellow";
    fruits["orange"]="orange"
    print fruits["orange"] "\n" fruits["mango"]
}'
On executing this code, you get the following result:
orange
yellow
In the above example, we declare the array as fruits whose index is fruit name and
the value is the color of the fruit. To access array elements, we use
array_name[index] format.
Deleting Array Elements
The following example deletes the element orange. Hence the print statement outputs
only an empty line.
[jerry]$ awk 'BEGIN {
    fruits["mango"]="yellow";
    fruits["orange"]="orange";
    delete fruits["orange"];
    print fruits["orange"]
}'
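Whether an element still exists after a delete can be tested with the in operator, and the remaining elements can be walked with a for-in loop. A minimal sketch using the same fruits array:

```shell
awk 'BEGIN {
    fruits["mango"] = "yellow"
    fruits["orange"] = "orange"
    delete fruits["orange"]
    # The in operator tests membership without creating the element.
    if (!("orange" in fruits)) print "orange is gone"
    # Visit every index that is still present in the array.
    for (name in fruits) print name, "->", fruits[name]
}'
```

Unlike referencing fruits["orange"] directly, the in test does not create an empty element as a side effect.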
Multi-Dimensional Arrays
AWK only supports one-dimensional arrays. But you can easily simulate a multidimensional array using the one-dimensional array itself.
For instance, given below is a 3x3 two-dimensional array:
100 200 300
400 500 600
700 800 900
In the above example, array[0][0] stores 100, array[0][1] stores 200, and so on. To
store 100 at array location [0][0], we can use the following syntax:
array["0,0"] = 100
Though we gave 0,0 as index, these are not two indexes. In reality, it is just one index
with the string 0,0.
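AWK also accepts a comma-separated subscript list such as grid[0, 1]. This is a different spelling from the quoted string "0,0" used above: the subscripts are joined with the value of SUBSEP to form the single index. The array name grid and the helper array idx in this sketch are hypothetical.

```shell
awk 'BEGIN {
    grid[0, 1] = 200          # stored under the key "0" SUBSEP "1"
    # A parenthesised subscript list tests membership of the joined key.
    if ((0, 1) in grid) print "found", grid[0, 1]
    # Split the joined key back into its parts.
    for (key in grid) {
        split(key, idx, SUBSEP)
        print "row " idx[1] ", column " idx[2]
    }
}'
```

Note that grid[0, 1] and grid["0,1"] are different elements, because SUBSEP is not a comma by default.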
The following example simulates a 2-D array:
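[jerry]$ awk 'BEGIN {
    array["0,0"] = 100;
    array["0,1"] = 200;
    array["0,2"] = 300;
    array["1,0"] = 400;
    array["1,1"] = 500;
    array["1,2"] = 600;
    # print array elements
    print "array[0,0] = " array["0,0"];
    print "array[0,1] = " array["0,1"];
    print "array[0,2] = " array["0,2"];
    print "array[1,0] = " array["1,0"];
    print "array[1,1] = " array["1,1"];
    print "array[1,2] = " array["1,2"];
}'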
If Statement
It simply tests the condition and performs certain actions depending upon the
condition. Given below is the syntax of if statement:
if (condition)
    action
We can also use a pair of curly braces as given below to execute multiple actions:
if (condition)
{
    action-1
    action-2
    .
    .
    action-n
}
For instance, the following example checks whether a number is even or not:
[jerry]$ awk 'BEGIN {num = 10; if (num % 2 == 0) printf "%d is even number.\n", num }'
On executing this code, you get the following result:
10 is even number.
If-Else Statement
In the if-else syntax, we can provide a list of actions to be performed when the
condition evaluates to false.
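A minimal sketch of an if-else statement, mirroring the even-number check from the If Statement section with a different, arbitrarily chosen value:

```shell
# The else branch runs because 11 % 2 is not 0.
awk 'BEGIN {
    num = 11
    if (num % 2 == 0)
        printf "%d is even number.\n", num
    else
        printf "%d is odd number.\n", num
}'
```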
If-Else-If Ladder
We can easily create an if-else-if ladder by using multiple if-else statements. The
following example demonstrates this:
[jerry]$ awk 'BEGIN {
    a=30;
    if (a==10)
        print "a = 10";
    else if (a == 20)
        print "a = 20";
    else if (a == 30)
        print "a = 30";
}'
On executing this code, you get the following result:
a = 30
11. LOOPS
This chapter explains AWK's loops with suitable examples. Loops are used to execute
a set of actions repeatedly. The loop execution continues as long as the
loop condition is true.
For Loop
The syntax of for loop is:
for (initialization; condition; increment/decrement)
    action
The for statement first performs the initialization action, and then it checks the condition.
If the condition is true, it executes the action, and thereafter it performs the increment or
decrement operation. The loop execution continues as long as the condition is true.
For instance, the following example prints 1 to 5 using for loop:
[jerry]$ awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'
On executing this code, you get the following result:
1
2
3
4
5
While Loop
The while loop keeps executing the action as long as a particular logical condition evaluates
to true. Here is the syntax of while loop:
while (condition)
action
AWK first checks the condition; if the condition is true, it executes the action. This
process repeats as long as the loop condition evaluates to true. For instance, the
following example prints 1 to 5 using while loop:
[jerry]$ awk 'BEGIN {i = 1; while (i < 6) { print i; ++i } }'
Do-While Loop
The do-while loop is similar to the while loop, except that the test condition is
evaluated at the end of the loop. Here is the syntax of do-while loop:
do
action
while (condition)
In a do-while loop, the action statement gets executed at least once even when the
condition statement evaluates to false. For instance, the following example prints the
numbers 1 to 5 using do-while loop:
[jerry]$ awk 'BEGIN {i = 1; do { print i; ++i } while (i < 6) }'
On executing this code, you get the following result:
1
2
3
4
5
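To see the "at least once" behaviour in isolation, here is a minimal sketch (the starting value 10 is an arbitrary choice for illustration) in which the condition is already false before the loop begins:

```shell
awk 'BEGIN {
    i = 10
    # The condition (i < 6) is false from the start,
    # yet the body still runs once before it is tested
    do {
        print i
        ++i
    } while (i < 6)
}'
```

The command prints 10 once and then stops. An equivalent while loop with the same condition would print nothing.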
Break Statement
As its name suggests, it is used to end the loop execution. Here is an example which
ends the loop when the sum becomes greater than 50.
[jerry]$ awk 'BEGIN {sum = 0; for (i = 0; i < 20; ++i) { sum += i; if (sum > 50) break; else print "Sum =", sum } }'
Continue Statement
The continue statement is used inside a loop to skip to the next iteration of the loop.
It is useful when you wish to skip the processing of some data inside the loop. For
instance, the following example uses the continue statement to print the even numbers
between 1 and 20.
[jerry]$ awk 'BEGIN {for (i = 1; i <= 20; ++i) {if (i % 2 == 0) print i; else continue} }'
On executing this code, you get the following result:
2
4
6
8
10
12
14
16
18
20
Exit Statement
It is used to stop the execution of the script. It accepts an integer argument,
which becomes the exit status code of the AWK process. If no argument is supplied, exit returns
status zero. Here is an example that stops the execution when the sum becomes
greater than 50.
[jerry]$ awk 'BEGIN {sum = 0; for (i = 0; i < 20; ++i) { sum += i; if (sum > 50) exit(10); else print "Sum =", sum } }'
On executing this code, you get the following result:
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Let us check the return status of the script.
[jerry]$ echo $?
On executing this code, you get the following result:
10
AWK has a number of functions built into it that are always available to the
programmer. This chapter describes Arithmetic, String, Time, Bit manipulation, and
other miscellaneous functions with suitable examples.
Arithmetic Functions
AWK has the following built-in arithmetic functions:
atan2(y, x)
It returns the arctangent of (y/x) in radians. The following example demonstrates
this:
[jerry]$ awk 'BEGIN {
PI = 3.14159265
x = -10
y = 10
result = atan2 (y,x) * 180 / PI;
printf "The arc tangent for (x=%f, y=%f) is %f degrees\n", x, y, result
}'
cos(expr)
This function returns the cosine of expr, which is expressed in radians. The following
example demonstrates this:
[jerry]$ awk 'BEGIN {
PI = 3.14159265
param = 60
result = cos(param * PI / 180.0);
printf "The cosine of %f degrees is %f.\n", param, result
}'
On executing this code, you get the following result:
The cosine of 60.000000 degrees is 0.500000.
exp(expr)
This function returns e raised to the power expr.
[jerry]$ awk 'BEGIN {
param = 5
result = exp(param);
printf "The exponential of %f is %f.\n", param, result
}'
int(expr)
This function truncates the expr to an integer value. The following example
demonstrates this:
[jerry]$ awk 'BEGIN {
param = 5.12345
result = int(param)
print "Truncated value =", result
}'
log(expr)
This function calculates the natural logarithm of expr.
[jerry]$ awk 'BEGIN {
param = 5.5
result = log (param)
printf "log(%f) = %f\n", param, result
}'
rand
This function returns a random number N, between 0 and 1, such that 0 <= N < 1.
For instance, the following example generates three random numbers:
[jerry]$ awk 'BEGIN {
print "Random num1 =" , rand()
print "Random num2 =" , rand()
print "Random num3 =" , rand()
}'
On executing this code, you get the following result:
Random num1 = 0.237788
Random num2 = 0.291066
Random num3 = 0.845814
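Because rand() returns a value in the range 0 <= N < 1, it can be scaled and truncated to obtain random integers. The following is a minimal sketch, assuming we want numbers from 1 to 6; the names sides and roll and the seed value 42 are illustrative choices, and the actual numbers printed depend on the AWK implementation.

```shell
awk 'BEGIN {
    srand(42)           # arbitrary fixed seed, chosen only for this sketch
    sides = 6
    for (i = 1; i <= 3; i++) {
        # rand() is in [0, 1), so int(rand() * sides) is in 0..sides-1
        roll = int(rand() * sides) + 1
        print "Roll", i, "=", roll
    }
}'
```

Each of the three lines ends with an integer between 1 and 6, inclusive.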
sin(expr)
This function returns the sine of expr, which is expressed in radians. The following
example demonstrates this:
[jerry]$ awk 'BEGIN {
PI = 3.14159265
param = 30.0
result = sin(param * PI /180)
printf "The sine of %f degrees is %f.\n", param, result
}'
sqrt(expr)
This function returns the square root of expr.
[jerry]$ awk 'BEGIN {
param = 1024.0
result = sqrt(param)
printf "sqrt(%f) = %f\n", param, result
}'
srand([expr])
This function sets the seed of the random number generator and returns the previous
seed. It uses expr as the new seed. In the absence of expr, it uses the time of
day as the seed value.
[jerry]$ awk 'BEGIN {
param = 10
printf "srand() = %d\n", srand()
printf "srand(%d) = %d\n", param, srand(param)
}'
String Functions
AWK has the following built-in String functions:
asort(arr [, d [, how] ])
This function sorts the contents of arr using GAWK's normal rules for comparing
values, and replaces the indexes of the sorted values of arr with sequential integers
starting with 1.
[jerry]$ awk 'BEGIN {
arr[0] = "Three"
arr[1] = "One"
arr[2] = "Two"
asort(arr)
print "Array elements after sort operation:"
for (i in arr) {
print arr[i]
}
}'
asorti(arr [, d [, how] ])
The behavior of this function is the same as that of asort(), except that the array
indexes are used for sorting.
[jerry]$ awk 'BEGIN {
arr["Two"] = 1
arr["One"] = 2
arr["Three"] = 3
asorti(arr)
print "Array indices after sort operation:"
for (i in arr) {
print arr[i]
}
}'
index(str, sub)
It checks whether sub is a substring of str or not. On success, it returns the position
where sub starts; otherwise it returns 0. The first character of str is at position 1.
[jerry]$ awk 'BEGIN {
str = "One Two Three"
subs = "Two"
ret = index(str, subs)
printf "Substring \"%s\" found at position %d.\n", subs, ret
}'
length(str)
It returns the length of a string.
[jerry]$ awk 'BEGIN {
str = "Hello, World !!!"
print "Length = ", length(str)
}'
match(str, regex)
It returns the index of the leftmost longest match of regex in the string str. It returns 0
if no match is found.
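A minimal sketch of match() follows. The sample string and the regular expression are chosen here for illustration. Besides returning the position, match() also sets the built-in variables RSTART and RLENGTH, which the sketch uses with substr() to extract the matched text.

```shell
awk 'BEGIN {
    str = "One Two Three"
    # match() returns the 1-based start of the match, or 0 if none;
    # it also sets the built-in variables RSTART and RLENGTH
    ret = match(str, /T[a-z]+/)
    printf "Match starts at %d, length %d: %s\n", ret, RLENGTH, substr(str, RSTART, RLENGTH)
}'
```

The leftmost match is Two, so the command prints Match starts at 5, length 3: Two.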
sprintf(format, expr-list)
This function returns a string constructed from expr-list according to format.
[jerry]$ awk 'BEGIN {
str = sprintf("%s", "Hello, World !!!")
print str
}'
On executing this code, you get the following result:
Hello, World !!!
strtonum(str)
This function examines str and returns its numeric value. If str begins with a leading
0, it is treated as an octal number. If str begins with a leading 0x or 0X, it is taken as
a hexadecimal number. Otherwise, it is assumed to be a decimal number.
[jerry]$ awk 'BEGIN {
print "Decimal num = " strtonum("123")
print "Octal num = " strtonum("0123")
print "Hexadecimal num = " strtonum("0x123")
}'
On executing this code, you get the following result:
Decimal num = 123
Octal num = 83
Hexadecimal num = 291
substr(str, start, l)
This function returns the substring of string str of length l, starting at index start. If
l is omitted, the suffix of str starting at index start is returned.
[jerry]$ awk 'BEGIN {
str = "Hello, World !!!"
subs = substr(str, 1, 5)
print "Substring = " subs
}'
tolower(str)
This function returns a copy of string str with all upper-case characters converted to
lower-case.
[jerry]$ awk 'BEGIN {
str = "HELLO, WORLD !!!"
print "Lowercase string = " tolower(str)
}'
toupper(str)
This function returns a copy of string str with all lower-case characters converted to
upper case.
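A minimal sketch of toupper(), using an illustrative sample string:

```shell
awk 'BEGIN {
    str = "hello, world !!!"
    # toupper() returns a converted copy; str itself is unchanged
    print "Uppercase string = " toupper(str)
}'
```

The command prints Uppercase string = HELLO, WORLD !!!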
Time Functions
AWK has the following built-in time functions:
systime
This function returns the current time of the day as the number of seconds since the
Epoch (1970-01-01 00:00:00 UTC on POSIX systems).
[jerry]$ awk 'BEGIN {
print "Number of seconds since the Epoch = " systime()
}'
On executing this code, you get the following result:
Number of seconds since the Epoch = 1418574432
mktime(datespec)
This function converts datespec string into a timestamp of the same form as returned
by systime(). The datespec is a string of the form YYYY MM DD HH MM SS.
[jerry]$ awk 'BEGIN {
print "Number of seconds since the Epoch = " mktime("2014 12 14 30 20 10")
}'
On executing this code, you get the following result:
Number of seconds since the Epoch = 1418604610
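Date format
specification   Description
%a      The locale's abbreviated weekday name.
%A      The locale's full weekday name.
%b      The locale's abbreviated month name.
%B      The locale's full month name.
%c      The locale's appropriate date and time representation.
%C      The century part of the current year. This is the year divided by 100
        and truncated to the next lower integer.
%d      The day of the month as a decimal number (01-31).
%D      Equivalent to specifying %m/%d/%y.
%e      The day of the month, padded with a space if it is only one digit.
%F      Equivalent to specifying %Y-%m-%d. This is the ISO 8601 date format.
%g      The year modulo 100 of the ISO 8601 week number, as a decimal
        number (00-99). For example, January 1, 1993 is in week 53 of
        1992. Thus, the year of its ISO 8601 week number is 1992, even
        though its year is 1993. Similarly, December 31, 1973 is in week 1
        of 1974. Thus, the year of its ISO week number is 1974, even
        though its year is 1973.
%G      The full year of the ISO week number, as a decimal number.
%h      Equivalent to %b.
%H      The hour (24-hour clock) as a decimal number (00-23).
%I      The hour (12-hour clock) as a decimal number (01-12).
%j      The day of the year as a decimal number (001-366).
%m      The month as a decimal number (01-12).
%M      The minute as a decimal number (00-59).
%n      A newline character.
%p      The locale's equivalent of the AM/PM designations of a 12-hour clock.
%r      The locale's 12-hour clock time.
%R      Equivalent to specifying %H:%M.
%S      The second as a decimal number (00-60).
%t      A TAB character.
%T      Equivalent to specifying %H:%M:%S.
%u      The weekday as a decimal number (1-7). Monday is day one.
%U      The week number of the year (the first Sunday as the first day of
        week one) as a decimal number (00-53).
%V      The week number of the year (the first Monday as the first day of
        week one) as a decimal number (01-53).
%w      The weekday as a decimal number (0-6). Sunday is day zero.
%W      The week number of the year (the first Monday as the first day of
        week one) as a decimal number (00-53).
%x      The locale's appropriate date representation.
%X      The locale's appropriate time representation.
%y      The year modulo 100 as a decimal number (00-99).
%Y      The full year as a decimal number.
%z      The time zone offset in +HHMM format.
%Z      The time zone name or abbreviation; no characters if no time zone
        is determinable.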
and
It performs bitwise AND operation.
[jerry]$ awk 'BEGIN {
num1 = 10
num2 = 6
printf "(%d AND %d) = %d\n", num1, num2, and(num1, num2)
}'
compl
It performs bitwise COMPLEMENT operation.
[jerry]$ awk 'BEGIN {
num1 = 10
printf "compl(%d) = %d\n", num1, compl(num1)
}'
lshift
It performs bitwise LEFT SHIFT operation.
[jerry]$ awk 'BEGIN {
num1 = 10
printf "lshift(%d) by 1 = %d\n", num1, lshift(num1, 1)
}'
On executing this code, you get the following result:
lshift(10) by 1 = 20
rshift
It performs bitwise RIGHT SHIFT operation.
[jerry]$ awk 'BEGIN {
num1 = 10
printf "rshift(%d) by 1 = %d\n", num1, rshift(num1, 1)
}'
On executing this code, you get the following result:
rshift(10) by 1 = 5
or
It performs bitwise OR operation.
[jerry]$ awk 'BEGIN {
num1 = 10
num2 = 6
printf "(%d OR %d) = %d\n", num1, num2, or(num1, num2)
}'
On executing this code, you get the following result:
(10 OR 6) = 14
xor
It performs bitwise XOR operation.
[jerry]$ awk 'BEGIN {
num1 = 10
num2 = 6
printf "(%d XOR %d) = %d\n", num1, num2, xor(num1, num2)
}'
Miscellaneous Functions
AWK has the following miscellaneous functions:
close(expr)
This function closes a file or pipe.
[jerry]$ awk 'BEGIN {
cmd = "tr [a-z] [A-Z]"
print "hello, world !!!" |& cmd
close(cmd, "to")
cmd |& getline out
print out;
close(cmd);
}'
On executing this code, you get the following result:
HELLO, WORLD !!!
Does the script look cryptic? Let us demystify it.
The first statement, cmd = "tr [a-z] [A-Z]" - is the command to which we
establish the two way communication from AWK.
The next statement, i.e., the print command, provides input to the tr
command. Here |& indicates two-way communication.
The third statement, i.e., close(cmd, "to"), closes the to end of the two-way pipe
after the input has been sent, so that tr sees the end of its input.
The next statement cmd |& getline out stores the output into out variable
with the aid of getline function.
The next print statement prints the output and finally the close function closes
the command.
delete
This statement deletes an element from an array. The following example shows the
usage of the delete statement:
delete arr[0]
delete arr[1]
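For a complete, runnable picture of these delete statements, here is a minimal sketch. The array contents are assumptions chosen for illustration, and the in operator is used to test whether an index still exists after deletion.

```shell
awk 'BEGIN {
    arr[0] = "One"; arr[1] = "Two"; arr[2] = "Three"; arr[3] = "Four"
    delete arr[0]
    delete arr[1]
    # Walk the indexes in a fixed order, because the order of
    # "for (i in arr)" is not specified
    for (i = 0; i <= 3; i++)
        if (i in arr)
            print "arr[" i "] = " arr[i]
        else
            print "arr[" i "] was deleted"
}'
```

The first two indexes are reported as deleted, while arr[2] and arr[3] still hold Three and Four.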
exit
This statement stops the execution of a script. It also accepts an optional expr which
becomes AWK's return value. The following example describes the usage of the exit
statement.
exit 10
fflush
This function flushes any buffers associated with open output file or pipe. The following
syntax demonstrates the function.
fflush([output-expr])
If no output-expr is supplied, it flushes the standard output. If output-expr is the null
string (""), then it flushes all open files and pipes.
getline
This function instructs AWK to read the next line. The following example reads and
displays the marks.txt file using getline function.
[jerry]$ awk '{getline; print $0}' marks.txt
On executing this code, you get the following result:
2)    Rahul    Maths      90
4)    Kedar    English    85
5)    Hari     History    89
The script works fine. But where is the first line? Let us find out.
At the start, AWK reads the first line from the file marks.txt and stores it
into $0 variable.
In the next statement, we instructed AWK to read the next line using getline. Hence
AWK reads the second line and stores it into $0 variable.
And finally, AWK's print statement prints the second line. This process continues until
the end of the file.
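The same behaviour can be reproduced without marks.txt. In the following minimal sketch, a five-line input generated with printf stands in for the file (an assumption made only so that the example is self-contained):

```shell
# Five numbered lines stand in for the five records of a data file
printf '1\n2\n3\n4\n5\n' | awk '{ getline; print $0 }'
```

The command prints 2, 4, and 5. Lines 1 and 3 are replaced by getline before print runs. For line 5, getline finds no further input and leaves $0 unchanged, so the last line is still printed.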
next
The next function changes the flow of the program. It causes the current processing
of the pattern space to stop. The program reads the next line, and starts executing
the commands again with the new line. For instance, the following program does not
perform any processing when a pattern match succeeds.
[jerry]$ awk '{if ($0 ~/Shyam/) next; print $0}' marks.txt
On executing this code, you get the following result:
1)    Amit     Physics    80
2)    Rahul    Maths      90
4)    Kedar    English    85
5)    Hari     History    89
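The next statement is also commonly written as the action of its own pattern, so that later rules never see the skipped record. The following is a minimal sketch with made-up input lines supplied through printf:

```shell
printf 'keep 1\nskip me\nkeep 2\n' | awk '
/skip/ { next }      # abandon this record; later rules are not tried
{ print "Processed:", $0 }'
```

Only the two keep lines reach the second rule, so the command prints Processed: keep 1 and Processed: keep 2.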
nextfile
The nextfile function changes the flow of the program. It stops processing the current
input file and starts a new cycle through pattern/procedures statements, beginning
with the first record of the next file. For instance, the following example stops
processing the first file when a pattern match succeeds.
First create two files. Let us say file1.txt contains:
file1:str1
file1:str2
file1:str3
file1:str4
And file2.txt contains:
file2:str1
file2:str2
file2:str3
file2:str4
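Now let us use the nextfile statement as follows:
[jerry]$ awk '{ if ($0 ~ /file1:str2/) nextfile; print $0 }' file1.txt file2.txt
On executing this code, you get the following result: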
file1:str1
file2:str1
file2:str2
file2:str3
file2:str4
return
This function can be used within a user-defined function to return the value. Please
note that the return value of a function is undefined if expr is not provided. The
following example describes the usage of the return function.
First, create a functions.awk file containing AWK command as shown below:
function addition(num1, num2)
{
result = num1 + num2
return result
}
BEGIN {
res = addition(10, 20)
print "10 + 20 = " res
}
On executing this code, you get the following result:
10 + 20 = 30
system
This function executes the specified command and returns its exit status. A return
status 0 indicates that a command execution has succeeded. A non-zero value
indicates a failure of command execution. For instance, the following example displays
the current date and also shows the return status of the command.
[jerry]$ awk 'BEGIN { ret = system("date"); print "Return value = " ret }'
On executing this code, you get the following result:
Sun Dec 21 23:16:07 IST 2014
Return value = 0
Functions are basic building blocks of a program. AWK allows us to define our own
functions. A large program can be divided into functions and each function can be
written/tested independently. It provides re-usability of code.
Given below is the general format of a user-defined function:
function function_name(argument1, argument2, ...)
{
function body
}
In this syntax, function_name is the name of the user-defined function. A function
name should begin with a letter, and the rest of the characters can be any combination
of numbers, alphabetic characters, or underscores. AWK's reserved words cannot be
used as function names.
Functions can accept multiple arguments separated by commas. Arguments are not
mandatory. You can also create a user-defined function without any arguments.
The function body consists of one or more AWK statements.
Let us write two functions that calculate the minimum and the maximum number and
call these functions from another function called main. The functions.awk file
contains:
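# Returns minimum number
function find_min(num1, num2)
{
    if (num1 < num2)
        return num1
    return num2
}
# Returns maximum number
function find_max(num1, num2)
{
    if (num1 > num2)
        return num1
    return num2
}
# Main function
function main(num1, num2)
{
    # Find minimum number
    result = find_min(10, 20)
    print "Minimum =", result
    # Find maximum number
    result = find_max(10, 20)
    print "Maximum =", result
}
# Script execution starts here
BEGIN {
    main(10, 20)
}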
So far, we displayed data on standard output stream. We can also redirect data to a
file. A redirection appears after the print or printf statement. Redirections in AWK are
written just like redirection in shell commands, except that they are written inside the
AWK program. This chapter explains redirection with suitable examples.
Redirection Operator
The syntax of the redirection operator is:
print DATA > output-file
It writes the data into the output-file. If the output-file does not exist, then it creates
one. When this type of redirection is used, the output-file is erased before the first
output is written to it. Subsequent write operations to the same output-file do not
erase the output-file, but append to it. For instance, the following example
writes Hello, World !!! to the file.
Let us create a file with some text data.
[jerry]$ echo "Old data" > /tmp/message.txt
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result:
Old data
Now let us redirect some contents into it using AWK's redirection operator.
[jerry]$ awk 'BEGIN { print "Hello, World !!!" > "/tmp/message.txt" }'
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result:
Hello, World !!!
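The difference between the first write and subsequent writes can be seen within a single AWK program. The following is a minimal sketch: the scratch file created with mktemp and the variable name out (passed in with -v) are assumptions made for illustration.

```shell
tmpfile=$(mktemp)    # hypothetical scratch file for this sketch
echo "Old data" > "$tmpfile"
awk -v out="$tmpfile" 'BEGIN {
    print "First line" > out     # file is truncated here, on the first write
    print "Second line" > out    # same open file: this line is appended
    close(out)
}'
cat "$tmpfile"
rm -f "$tmpfile"
```

The file ends up containing First line and Second line. Old data is gone because the first write erased it, while the second write did not erase the first.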
Append Operator
The syntax of append operator is as follows:
print DATA >> output-file
It appends the data into the output-file. If the output-file does not exist, then it
creates one. When this type of redirection is used, new contents are appended at the
end of file. For instance, the following example appends Hello, World !!! to the file.
Let us create a file with some text data.
[jerry]$ echo "Old data" > /tmp/message.txt
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result:
Old data
Now let us append some contents to it using AWK's append operator.
[jerry]$ awk 'BEGIN { print "Hello, World !!!" >> "/tmp/message.txt" }'
[jerry]$ cat /tmp/message.txt
On executing this code, you get the following result:
Old data
Hello, World !!!
Pipe
It is possible to send output to another program through a pipe instead of using a file.
This redirection opens a pipe to command and writes the values of items through this
pipe to the process that runs the command. The redirection argument command
is actually an AWK expression. Here is the syntax of pipe:
print items | command
Let us use tr command to convert lowercase letters to uppercase.
[jerry]$ awk 'BEGIN { print "hello, world !!!" | "tr [a-z] [A-Z]" }'
On executing this code, you get the following result:
HELLO, WORLD !!!
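Several print statements can feed the same pipe, as long as they use the same command string. The following minimal sketch pipes three lines to sort; the variable name cmd and the sample words are illustrative assumptions.

```shell
awk 'BEGIN {
    cmd = "sort"
    print "banana" | cmd
    print "apple"  | cmd
    print "cherry" | cmd
    # close() ends the pipe, so sort sees end-of-input and writes its output
    close(cmd)
    print "done"
}'
```

The sorted words apple, banana, and cherry appear first, followed by done. Calling close(cmd) before the final print is what makes sort finish its work at that point rather than when AWK exits.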
Two-Way Communication
AWK can communicate with an external process using |&, which provides two-way
communication. For instance, the following example uses tr command to convert
lowercase letters to uppercase. Our command.awk file contains:
BEGIN {
cmd = "tr [a-z] [A-Z]"
print "hello, world !!!" |& cmd
close(cmd, "to")
cmd |& getline out
print out;
close(cmd);
}
On executing this code, you get the following result:
HELLO, WORLD !!!
Does the script look cryptic? Let us demystify it.
The first statement, cmd = "tr [a-z] [A-Z]", is the command to which we
establish the two-way communication from AWK.
The next statement, i.e., the print command, provides input to the tr command.
Here |& indicates two-way communication.
The third statement, i.e., close(cmd, "to"), closes the to end of the two-way pipe
after the input has been sent, so that tr sees the end of its input.
The next statement cmd |& getline out stores the output into out variable
with the aid of getline function.
The next print statement prints the output and finally the close function closes
the command.
So far, we have used AWK's print and printf functions to display data on standard
output. But the printf function offers much finer control over the output. This function
has been borrowed from the C language and it is very helpful while producing formatted
output. Here is the syntax of the printf statement:
printf fmt, expr-list
In the above syntax, fmt is a string of format specifications and constants. expr-list is
a list of arguments corresponding to format specifiers.
Escape Sequences
Similar to any string, format can contain embedded escape sequences. Discussed
below are the escape sequences supported by AWK:
New Line
The following example prints Hello and World in separate lines using newline
character:
[jerry]$ awk 'BEGIN { printf "Hello\nWorld\n" }'
On executing this code, you get the following result:
Hello
World
Horizontal Tab
The following example uses horizontal tab to separate the fields:
[jerry]$ awk 'BEGIN { printf "Sr No\tName\tSub\tMarks\n" }'
On executing the above code, you get the following result:
Sr No   Name    Sub     Marks
Vertical Tab
The following example uses vertical tab after each field:
[jerry]$ awk 'BEGIN { printf "Sr No\vName\vSub\vMarks\n" }'
On executing this code, you get the following result:
Sr No
     Name
         Sub
            Marks
Backspace
The following example prints a backspace after every field except the last one. It
erases the last number from the first three fields. For instance, Field 1 is displayed
as Field, because the last character is erased with backspace. However, the last
field Field 4 is displayed as it is, as we did not have a \b after Field 4.
[jerry]$ awk 'BEGIN { printf "Field 1\bField 2\bField 3\bField 4\n" }'
On executing this code, you get the following result:
Field Field Field Field 4
Carriage Return
In the following example, after printing every field, we do a Carriage Return and
print the next value on top of the current printed value. It means, in the final output,
you can see only Field 4, as it was the last thing to be printed on top of all the
previous fields.
[jerry]$ awk 'BEGIN { printf "Field 1\rField 2\rField 3\rField 4\n" }'
On executing this code, you get the following result:
Field 4
Form Feed
The following example uses form feed after printing each field.
[jerry]$ awk 'BEGIN { printf "Sr No\fName\fSub\fMarks\n" }'
Format Specifier
As in the C language, AWK also has format specifiers. The AWK version of the printf
statement accepts the following conversion specification formats:
%c
It prints a single character. If the argument used for %c is numeric, it is treated as a
character code and the corresponding character is printed. Otherwise, the argument is
assumed to be a string, and only the first character of that string is printed.
[jerry]$ awk 'BEGIN { printf "ASCII value 65 = character %c\n", 65 }'
On executing this code, you get the following result:
ASCII value 65 = character A
%d and %i
It prints only the integer part of a decimal number.
[jerry]$ awk 'BEGIN { printf "Percentages = %d\n", 80.66 }'
On executing this code, you get the following result:
Percentages = 80
%e and %E
It prints a floating point number of the form [-]d.dddddde[+-]dd.
[jerry]$ awk 'BEGIN { printf "Percentages = %E\n", 80.66 }'
On executing this code, you get the following result:
Percentages = 8.066000E+01
%f
It prints a floating point number of the form [-]ddd.dddddd.
[jerry]$ awk 'BEGIN { printf "Percentages = %f\n", 80.66 }'
On executing this code, you get the following result:
Percentages = 80.660000
%g and %G
Uses %e or %f conversion, whichever is shorter, with non-significant zeros
suppressed.
[jerry]$ awk 'BEGIN { printf "Percentages = %g\n", 80.66 }'
On executing this code, you get the following result:
Percentages = 80.66
The %G format uses %E instead of %e.
[jerry]$ awk 'BEGIN { printf "Percentages = %G\n", 80.66 }'
On executing this code, you get the following result:
Percentages = 80.66
%o
It prints an unsigned octal number.
[jerry]$ awk 'BEGIN { printf "Octal representation of decimal number 10 = %o\n", 10}'
On executing this code, you get the following result:
Octal representation of decimal number 10 = 12
%u
It prints an unsigned decimal number.
[jerry]$ awk 'BEGIN { printf "Unsigned 10 = %u\n", 10 }'
On executing this code, you get the following result:
Unsigned 10 = 10
%s
It prints a character string.
[jerry]$ awk 'BEGIN { printf "Name = %s\n", "Sherlock Holmes" }'
On executing this code, you get the following result:
Name = Sherlock Holmes
%x and %X
It prints an unsigned hexadecimal number. The %X format uses uppercase letters
instead of lowercase.
[jerry]$ awk 'BEGIN { printf "Hexadecimal representation of decimal number 15 = %x\n", 15}'
On executing this code, you get the following result:
Hexadecimal representation of decimal number 15 = f
Now let us use %X and observe the result:
[jerry]$ awk 'BEGIN { printf "Hexadecimal representation of decimal number 15 = %X\n", 15}'
On executing this code, you get the following result:
Hexadecimal representation of decimal number 15 = F
%%
It prints a single % character and no argument is converted.
[jerry]$ awk 'BEGIN { printf "Percentages = %d%%\n", 80.66 }'
Width
The field is padded to the width. By default, the field is padded with spaces but when
0 flag is used, it is padded with zeroes.
[jerry]$ awk 'BEGIN { num1 = 10; num2 = 20; printf "Num1 = %10d\nNum2 = %10d\n", num1, num2 }'
On executing this code, you get the following result:
Num1 =         10
Num2 =         20
Leading Zeros
A leading zero acts as a flag, which indicates that the output should be padded with
zeroes instead of spaces. Please note that this flag only has an effect when the field
is wider than the value to be printed. The following example describes this:
[jerry]$ awk 'BEGIN { num1 = -10; num2 = 20; printf "Num1 = %05d\nNum2 = %05d\n", num1, num2 }'
On executing this code, you get the following result:
Num1 = -0010
Num2 = 00020
Left Justification
The expression should be left-justified within its field. When the input string is shorter
than the number of characters specified, and you want it to be left-justified, i.e., by
adding spaces to the right, use a minus symbol (-) immediately after the % and before
the number.
In the following example, output of the AWK command is piped to the cat command
to display the END OF LINE($) character.
[jerry]$ awk 'BEGIN { num = 10; printf "Num = %-5d\n", num }' | cat -vte
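The effect of the minus flag is easier to see next to the default right justification. The following minimal sketch prints the same number both ways; the vertical bars are added here only to make the field boundaries visible.

```shell
awk 'BEGIN {
    num = 10
    printf "|%5d|\n", num     # right-justified (default)
    printf "|%-5d|\n", num    # left-justified with the minus flag
}'
```

The first line is |   10| with three leading spaces, and the second is |10   | with three trailing spaces.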
Prefix Sign
It always prefixes numeric values with a sign, even if the value is positive.
[jerry]$ awk 'BEGIN { num1 = -10; num2 = 20; printf "Num1 = %+d\nNum2 = %+d\n", num1, num2 }'
On executing this code, you get the following result:
Num1 = -10
Num2 = +20
Hash
For %o, it supplies a leading zero. For %x and %X, it supplies a leading 0x or 0X
respectively, only if the result is non-zero. For %e, %E, %f, and %F, the result always
contains a decimal point. For %g and %G, trailing zeros are not removed from the
result. The following example describes this:
[jerry]$ awk 'BEGIN { printf "Octal representation = %#o\nHexadecimal representation = %#X\n", 10, 10}'
On executing this code, you get the following result:
Octal representation = 012
Hexadecimal representation = 0XA