WORKING WITH grep, sed, AND awk Pocket Primer: A Quick Guide to Mastering Powerful Command Line Tools
About this ebook
This book introduces readers to three powerful command-line utilities—grep, sed, and awk—that can create simple yet powerful shell scripts. Using the bash shell, it focuses on small text files to help readers understand these tools. Grep searches for patterns in data, sed modifies data, and awk performs tasks on pattern matches. Aimed at those new to the bash environment, the book is also valuable for those with some experience.
The journey starts with grep, teaching how to search for specific words or patterns in data. It then moves to sed, showing how to change or modify data efficiently. Finally, it delves into awk, a versatile programming language for searching and processing data files. The book also includes a chapter on using regular expressions with these tools, enhancing your scripting capabilities.
Mastering these utilities is crucial for efficient data handling and automation in a bash environment. This book transitions readers from basic to advanced command-line skills, blending theory with practical examples. It is an essential resource for anyone looking to harness the full power of bash scripting.
WORKING WITH grep, sed, AND awk Pocket Primer - Mercury Learning and Information
PREFACE
WHAT IS THE GOAL?
The goal of this book is to introduce readers to three powerful command line utilities that can be combined to create simple yet powerful shell scripts for performing a multitude of tasks. The code samples and scripts use the bash shell, and typically involve small text files, so you can focus on understanding the features of grep, sed, and awk. Aimed at a reader new to working in a bash environment, the book is comprehensive enough to be a good reference and teaches new tricks to those who already have some experience with these command line utilities.
This book takes introductory concepts and demonstrates their use in simple yet powerful shell scripts. Keep in mind that this book does not cover pure system administration functionality.
IS THIS BOOK FOR ME AND WHAT WILL I LEARN?
This book is intended for general users as well as anyone who wants to perform a variety of tasks from the command line.
You will acquire an understanding of how to use grep, sed, and awk whose functionality is discussed in the first five chapters. Specifically, Chapter 1 introduces the grep command, Chapter 2 introduces the sed command, and Chapters 3 through 5 discuss the awk command. The sixth and final chapter introduces you to regular expressions.
This book saves you the time required to search for relevant code samples and adapt them to your specific needs, which is a potentially time-consuming process.
HOW WERE THE CODE SAMPLES CREATED?
The code samples in this book were created and tested using bash on a MacBook Pro with OS X 10.15.7 (macOS Catalina). Regarding their content: the code samples are derived primarily from scripts prepared by the author, and in some cases, there are code samples that incorporate short sections of code from discussions in online forums. The key point to remember is that the code samples follow the Four Cs: they must be Clear, Concise, Complete, and Correct to the extent that it is possible to do so, given the size of this book.
WHAT YOU NEED TO KNOW FOR THIS BOOK
You need some familiarity with working from the command line in a Unix-like environment. However, there are subjective prerequisites, such as a desire to learn shell programming, along with the motivation and discipline to read and understand the code samples. In any case, if you’re not sure whether or not you can absorb the material in this book, glance through the code samples to get a feel for the level of complexity.
HOW DO I SET UP A COMMAND SHELL?
If you are a Mac user, there are three ways to do so. The first method is to use Finder to navigate to Applications > Utilities and then double click on the Terminal application. Second, if you already have a command shell available, you can launch a new command shell by typing the following command:
open /Applications/Utilities/Terminal.app
A third method for Mac users is to open a new command shell from a command shell that is already visible simply by pressing command+n in that command shell, and your Mac will launch another command shell.
If you are a PC user, you can install Cygwin (open source, https://cygwin.com/), which simulates bash commands, or use another toolkit such as MKS (a commercial product). Please read the online documentation that describes the download and installation process.
If you use RStudio, you need to launch a command shell inside of RStudio by navigating to Tools > Command Line, and then you can launch bash commands. Note that custom aliases are not automatically set if they are defined in a file other than the main start-up file (such as .bash_login).
WHAT ARE THE NEXT STEPS AFTER FINISHING THIS BOOK?
The answer to this question varies widely, mainly because the answer depends heavily on your objectives. The best answer is to try out a new tool or technique from the book on a problem or task you care about, professionally or personally. Precisely what that might be depends on who you are, as the needs of a data scientist, manager, student, or developer are all different. In addition, keep what you learned in mind as you tackle new data cleaning or manipulation challenges. Sometimes knowing a technique is possible will make finding a solution easier, even if you have to re-read the section to remember exactly how the syntax works.
If you have reached the limits of what you have learned here and want to get further technical depth on these commands, there is a wide variety of literature published and online resources describing the bash shell, Unix programming, and the grep, sed, and awk commands.
CHAPTER 1
Working with GREP
This chapter introduces you to the versatile grep command that can process an input text stream to generate a desired output text stream. This command also works well with other Unix commands. This chapter contains many short code samples that illustrate various options of the grep command.
The first part of this chapter introduces the grep command used in isolation, in conjunction with meta characters (such as ^, $, and so forth), and with code snippets that illustrate how to use some of the options of the grep command. Next, you will learn how to match ranges of lines, how to use the back references in grep, and how to escape meta characters in grep.
The second part of this chapter shows you how to use the grep command to find empty lines and common lines in datasets, as well as the use of keys to match rows in datasets. Next, you will learn how to use character classes with the grep command, as well as the backslash (\) character, and how to specify multiple matching patterns. You will learn how to combine the grep command with the find command and the xargs command, which is useful for matching a pattern in files that reside in different directories. This section contains some examples of common mistakes that people make with the grep command.
The third section briefly discusses the egrep command and the fgrep command, which are related commands that provide additional functionality that is unavailable in the standard grep utility. The fourth section contains a use case that illustrates how to use the grep command to find matching lines that are then merged to create a new dataset.
What is the grep Command?
The grep (Global Regular Expression Print) command is useful for finding strings in one or more files. Several examples are here:
grep abc *sh displays all the lines containing abc in files with suffix sh.
grep -i abc *sh is the same as the preceding query, but case-insensitive.
grep -l abc *sh displays all the filenames with suffix sh that contain abc.
grep -n abc *sh displays the lines containing the string abc in files with suffix sh, along with their line numbers.
You can perform logical AND and logical OR operations with this syntax:
grep abc *sh | grep def matches lines containing abc AND def.
grep 'abc\|def' *sh matches lines containing abc OR def (the quotes prevent the shell from removing the backslash).
You can combine switches as well: the following command displays the names of the files that contain the string abc (case insensitive):
grep -il abc *sh
In other words, the preceding command displays the names of the files whose contents include abc, Abc, ABc, ABC, abC, and so forth.
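The AND, OR, and combined-switch commands can be tried on a pair of scratch files. The following is a minimal sketch: the file names one.sh and two.sh and their contents are invented for illustration, and the OR search uses grep -E 'abc|def', which is the extended-regular-expression spelling of 'abc\|def'.

```shell
# Scratch directory with two hypothetical files (not from the book).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'abc only\nabc and def\nnothing here\n' > one.sh
printf 'ABC in capitals\ndef only\n' > two.sh

# Logical AND: the second grep filters the output of the first.
and_lines=$(grep abc *sh | grep def)

# Logical OR via an extended regular expression; -h hides the filenames.
or_count=$(grep -h -E 'abc|def' *sh | wc -l | tr -d ' ')

# Combined switches: case-insensitive match, filenames only.
files=$(grep -il abc *sh)

echo "AND: $and_lines"
echo "OR line count: $or_count"
echo "Files:"
echo "$files"
```

Only one.sh contains a line with both strings, whereas the case-insensitive filename search reports both files because two.sh contains ABC.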
Another (less efficient) way to display the lines containing abc (case insensitive) is here:
cat file1 | grep -i abc
The preceding command involves two processes, whereas passing the filename directly to grep (instead of using cat to supply the input) involves a single process. The execution time is roughly the same for small text files, but the difference can become more significant if you are working with multiple large text files.
You can combine the sort command, the pipe symbol, and the grep command. For example, the following command displays the files with a Sep date in increasing size (the fifth field of the ls -l output is the file size):
ls -l | grep " Sep " | sort -k5,5n
A sample output from the preceding command is here:
-rw-r--r-- 1 oswaldcampesato2 staff 3 Sep 27 2022 abc.txt
-rw-r--r-- 1 oswaldcampesato2 staff 6 Sep 21 2022 control1.txt
-rw-r--r-- 1 oswaldcampesato2 staff 27 Sep 28 2022 fiblist.txt
-rw-r--r-- 1 oswaldcampesato2 staff 28 Sep 14 2022 dest
-rw-r--r-- 1 oswaldcampesato2 staff 36 Sep 14 2022 source
-rw-r--r-- 1 oswaldcampesato2 staff 195 Sep 28 2022 Divisors.py
-rw-r--r-- 1 oswaldcampesato2 staff 267 Sep 28 2022 Divisors2.py
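Sorting a long listing by size is most reliable when sort is told which field holds the size. The following is a minimal sketch under stated assumptions: the three file names are hypothetical, the size is assumed to be the fifth field of the ls -l output, and awk is used only to print the last field (the filename).

```shell
# Scratch directory with three hypothetical files of known sizes.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'ab\n' > small.txt                               # 3 bytes
printf '%s\n' 'abcdefghijklmnopqrstuvwxyz' > large.txt  # 27 bytes
printf 'abcde\n' > medium.txt                           # 6 bytes

# Keep the .txt entries, sort numerically on field 5 (the size),
# and print only the last field (the filename).
by_size=$(ls -l | grep '\.txt$' | sort -k5,5n | awk '{print $NF}')
echo "$by_size"
```

The -k5,5n key restricts the numeric comparison to the size column, so the result does not depend on how the rest of each line happens to compare.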
Meta Characters and the grep Command
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any meta-character with special meaning may be quoted by preceding it with a backslash.
A regular expression may be followed by one of several repetition operators, as shown here:
. matches any single character.
? indicates that the preceding item is optional and will be matched at most once: Z? matches the empty string or Z.
* indicates that the preceding item will be matched zero or more times: Z* matches the empty string, Z, ZZ, ZZZ, and so forth.
+ indicates that the preceding item will be matched one or more times: Z+ matches Z, ZZ, ZZZ, and so forth.
{n} indicates that the preceding item is matched exactly n times: Z{3} matches ZZZ.
{n,} indicates that the preceding item is matched n or more times: Z{3,} matches ZZZ, ZZZZ, and so forth.
{,m} indicates that the preceding item is matched at most m times: Z{,3} matches the empty string, Z, ZZ, and ZZZ.
{n,m} indicates that the preceding item is matched at least n times, but not more than m times: Z{2,4} matches ZZ, ZZZ, and ZZZZ.
The -E option (or egrep) enables the ?, +, and brace operators as written here; without it, plain grep treats these characters as ordinary characters.
The empty regular expression matches the empty string (i.e., a line in the input stream with no data). Two regular expressions may be joined by the infix operator (|). When used in this manner, the infix operator behaves exactly like a logical OR statement, which directs the grep command to return any line that matches either regular expression.
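The repetition operators and the infix operator can be checked by counting how many test lines each pattern matches in full. The following is a minimal sketch: the file z.txt and the helper function count are hypothetical, the -E option selects extended regular expressions so that ?, +, braces, and | act as operators, and the -x option requires the pattern to match the whole line.

```shell
# Scratch file with five test lines: an empty line, Z, ZZ, ZZZ, ZZZZ.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '\nZ\nZZ\nZZZ\nZZZZ\n' > z.txt

# count PATTERN prints how many whole lines of z.txt match PATTERN.
count() { grep -E -x -c "$1" z.txt; }

optional=$(count 'Z?')       # the empty line and Z
star=$(count 'Z*')           # every line, including the empty one
plus=$(count 'Z+')           # Z, ZZ, ZZZ, ZZZZ
exactly3=$(count 'Z{3}')     # ZZZ
atleast3=$(count 'Z{3,}')    # ZZZ, ZZZZ
range=$(count 'Z{2,4}')      # ZZ, ZZZ, ZZZZ
either=$(count 'Z|ZZZ')      # Z, ZZZ (logical OR)

echo "$optional $star $plus $exactly3 $atleast3 $range $either"
```

Counting whole-line matches makes the difference between ? (at most once), * (zero or more), and + (one or more) visible without reading through the matched lines themselves.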
Escaping Meta Characters with the grep Command
Listing 1.1 displays the content of lines.txt that contains lines with words and metacharacters.
Listing 1.1: lines.txt
abcd
ab
abc
cd
defg
.*.
..
The following grep command lists the lines of length 2 in lines.txt (the ^ anchors the start of the line, the $ anchors the end, and each of the two dots matches a single character):
grep '^..$' lines.txt
The following command lists the lines of length two in lines.txt that contain two dots (the backslash tells grep to interpret the dots as actual dots, not as metacharacters):
grep '^\.\.$' lines.txt
The result is shown here:
..
The following command displays the lines that consist entirely of one or more dots (in lines.txt, only the line with two dots). Note that the * is used as a metacharacter because it is not preceded by a backslash: it matches zero or more occurrences of the preceding item, which here is a literal dot:
grep '^\.*\.$' lines.txt
The following command lists the lines that contain a period, followed by an asterisk, and then another period (the * is now a character that must be matched because it is preceded by a backslash):
grep '^\.\*\.$' lines.txt
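The four commands in this section can be run side by side to compare their results. The following is a minimal sketch that recreates lines.txt from Listing 1.1 in a scratch directory; the variable names are chosen here only for illustration.

```shell
# Recreate lines.txt (Listing 1.1) in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'abcd' 'ab' 'abc' 'cd' 'defg' '.*.' '..' > lines.txt

any_two=$(grep '^..$' lines.txt)            # any two characters
two_dots=$(grep '^\.\.$' lines.txt)         # exactly two literal dots
only_dots=$(grep '^\.*\.$' lines.txt)       # one or more literal dots
dot_star_dot=$(grep '^\.\*\.$' lines.txt)   # literal dot, asterisk, dot

echo "Any two characters:"
echo "$any_two"
echo "Two dots: $two_dots"
echo "Only dots: $only_dots"
echo "Dot, asterisk, dot: $dot_star_dot"
```

The unescaped dots match ab, cd, and the two-dot line, whereas each escaped pattern matches exactly one line of the file.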
Useful Options for the grep Command
There are many types of pattern matching possibilities with the grep command, and this section contains an eclectic mix of such commands that handle common scenarios.
In the following examples, we have four text files (two .sh and two .txt) and two Word documents in a directory. The string abc is found on one line in abc1.txt and three lines in abc3.sh. The string ABC is found on two lines in ABC2.txt and four lines in ABC4.sh. Notice that abc is not found in ABC files, and ABC is not found in abc files.
ls *
ABC.doc ABC4.sh abc1.txt ABC2.txt abc.doc abc3.sh
The following code snippet searches for occurrences of the string abc in all the files in the current directory that have sh as a suffix:
grep abc *sh
abc3.sh:abc at start
abc3.sh:ends with -abc
abc3.sh:the abc is in the middle
The -c option counts the number of lines that match a string: even though ABC4.sh has no matches, grep still reports the file with a count of zero:
grep -c abc *sh
The output of the preceding command is here:
ABC4.sh:0
abc3.sh:3
The -e option lets you match patterns that would otherwise cause syntax problems (a leading - character normally is interpreted as the start of an option for grep):
grep -e "-abc" *sh
abc3.sh:ends with -abc
The -e option also lets you match multiple patterns:
grep -e "-abc" -e "comment" *sh
ABC4.sh:# ABC in a comment
abc3.sh:ends with -abc
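The -c and -e examples can be reproduced with stand-in files. The following is a minimal sketch: the contents of abc3.sh and ABC4.sh are reconstructed from the outputs shown in this section (the position of the empty line in ABC4.sh is an assumption), and LC_ALL=C is set only so that the shell lists ABC4.sh before abc3.sh.

```shell
export LC_ALL=C   # byte-order glob sorting: ABC4.sh comes before abc3.sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Stand-in files reconstructed from the outputs in this section.
printf '%s\n' 'abc at start' 'ends with -abc' 'the abc is in the middle' \
    "this line won't match" > abc3.sh
printf '%s\n' 'ABC at start' 'ends with ABC' 'the ABC is in the middle' \
    '' '# ABC in a comment' > ABC4.sh

counts=$(grep -c abc *sh)                  # matching lines per file
dash=$(grep -e '-abc' *sh)                 # pattern that starts with -
multi=$(grep -e '-abc' -e 'comment' *sh)   # either of two patterns

echo "$counts"
echo "$dash"
echo "$multi"
```

Without -e, grep would try to interpret -abc as a group of options instead of a pattern.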
The -i option performs a case insensitive match:
grep -i abc *sh
ABC4.sh:ABC at start
ABC4.sh:ends with ABC
ABC4.sh:the ABC is in the middle
ABC4.sh:# ABC in a comment
abc3.sh:abc at start
abc3.sh:ends with -abc
abc3.sh:the abc is in the middle
The -v option inverts the match, which means that the output consists of the lines that do not contain the specified string (lines containing ABC are displayed because -i is not used, and ABC4.sh has an entirely empty line, which is also displayed):
grep -v abc *sh
Use the -iv options to display the lines that do not contain a specified string using a case insensitive match:
grep -iv abc *sh
ABC4.sh:
abc3.sh:this line won't match
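The effect of adding -i to an inverted match can be seen by comparing the two commands on stand-in files. The following is a minimal sketch: the file contents are reconstructed from the outputs in this section (the position of the empty line is an assumption), and LC_ALL=C is set only to fix the order in which the files are listed.

```shell
export LC_ALL=C   # byte-order glob sorting: ABC4.sh comes before abc3.sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Stand-in files reconstructed from the outputs in this section.
printf '%s\n' 'abc at start' 'ends with -abc' 'the abc is in the middle' \
    "this line won't match" > abc3.sh
printf '%s\n' 'ABC at start' 'ends with ABC' 'the ABC is in the middle' \
    '' '# ABC in a comment' > ABC4.sh

# Case-sensitive inversion: every line of ABC4.sh lacks lowercase abc.
inverted_count=$(grep -v abc *sh | wc -l | tr -d ' ')

# Case-insensitive inversion: only the empty line and one other line remain.
inverted_ci=$(grep -iv abc *sh)

echo "Lines without abc: $inverted_count"
echo "$inverted_ci"
```

The case-sensitive command reports six lines (five from ABC4.sh and one from abc3.sh), whereas the case-insensitive command leaves only the two lines that contain neither abc nor ABC.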
The -l option lists only the filenames that contain a successful match (note this matches contents of files, not the filenames). The Word document matches because the actual text is still visible to grep; it is just surrounded by proprietary formatting gibberish. You can do similar things with other formats that contain text, such as XML, HTML, CSV, and so forth:
grep -l abc *
abc1.txt
abc3.sh
abc.doc
The -l option works the same way when the search is restricted to the files with suffix sh:
grep -l abc *sh
Use the -il options to display the filenames that contain a specified string using a case insensitive match:
grep -il abc *doc
The preceding command is very useful when you want to check for the occurrence of a string in Word documents.
The -n option displays the line number of each matching line:
grep -n abc *sh
abc3.sh:1:abc at start
abc3.sh:2:ends with -abc
abc3.sh:3:the abc is in the middle
The -h option suppresses the display of the filename for a successful match:
grep -h abc *sh
abc at start
ends with -abc
the abc is in the middle
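The -l, -il, -n, and -h options differ only in what they print for the same matches, which is easiest to see on the same input. The following is a minimal sketch: the file contents are reconstructed from the outputs in this section, the Word documents are omitted, and LC_ALL=C is set only to fix the order in which the files are listed.

```shell
export LC_ALL=C   # byte-order glob sorting: ABC4.sh comes before abc3.sh
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Stand-in files reconstructed from the outputs in this section.
printf '%s\n' 'abc at start' 'ends with -abc' 'the abc is in the middle' \
    "this line won't match" > abc3.sh
printf '%s\n' 'ABC at start' 'ends with ABC' 'the ABC is in the middle' \
    '' '# ABC in a comment' > ABC4.sh

names=$(grep -l abc *sh)       # filenames only
names_ci=$(grep -il abc *sh)   # filenames only, ignoring case
numbered=$(grep -n abc *sh)    # filename, line number, and line
bare=$(grep -h abc *sh)        # matching lines without the filename

echo "$names"
echo "$names_ci"
echo "$numbered"
echo "$bare"
```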
For the next series of examples, we will use columns4.txt, as shown in Listing 1.2.
Listing 1.2: columns4.txt
123 ONE TWO
456 three four
ONE TWO THREE FOUR
five 123 six
one two three
four five
The -o option shows only the matched string (this is how you avoid returning the entire line that matches):
grep -o one columns4.txt
The -o option followed by the -b option shows the position of the matched string (this is the byte offset within the file, not the line number; the o in one is at offset 59, counting from zero):
grep -o -b one columns4.txt
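The offset reported by -b can be checked by recreating columns4.txt. The following is a minimal sketch that assumes the fields in Listing 1.2 are separated by single spaces and that the lines have no trailing whitespace; different spacing would change the offset.

```shell
# Recreate columns4.txt (Listing 1.2), assuming single-space separators.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' '123 ONE TWO' '456 three four' 'ONE TWO THREE FOUR' \
    'five 123 six' 'one two three' 'four five' > columns4.txt

matched=$(grep -o one columns4.txt)      # only the matched text
offset=$(grep -o -b one columns4.txt)    # byte offset, then matched text

echo "$matched"
echo "$offset"
```

The four lines that precede the match occupy 12, 15, 19, and 13 bytes (including their newline characters), which adds up to 59.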
You can specify a recursive search, as shown here. The output is not shown because it will be different on every client or account. The command searches not only every file in the directory /etc, but also every file in every subdirectory of /etc:
grep -r abc /etc
The preceding commands match lines where the specified string is a substring of a longer string in the file. For instance, the preceding commands will match occurrences of abc as well as abcd, dabc, abcde, and so forth.
grep ABC *txt
ABC2.txt:ABC at start or ABC in middle or end in ABC
ABC2.txt:ABCD DABC
If you want to exclude everything except for a whole-word match, you can use the -w option, as shown here:
grep -w ABC *txt
ABC2.txt:ABC at start or ABC in middle or end in ABC
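The difference between a substring match and a whole-word match can be reproduced with a stand-in for ABC2.txt. The following is a minimal sketch whose two lines are taken from the output shown above; because only one file is searched, grep does not prefix the filename.

```shell
# Stand-in for ABC2.txt, using the two lines shown in the output above.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'ABC at start or ABC in middle or end in ABC' \
    'ABCD DABC' > ABC2.txt

substring=$(grep -c ABC ABC2.txt)     # counts lines with ABC anywhere
whole_word=$(grep -w ABC ABC2.txt)    # ABC must stand alone as a word

echo "Lines containing ABC: $substring"
echo "$whole_word"
```

The second line is excluded by -w because ABC appears there only as part of the longer words ABCD and DABC.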
The --color switch displays the matching string in color:
grep --color abc *sh
abc3.sh:abc at start
abc3.sh:ends with -abc
abc3.sh:the abc is in the middle
You can use the pair of metacharacters (.*) to find the occurrences of two words that are separated by an arbitrary number of intermediate characters.
The following command finds all lines that contain the strings one and three with any number of intermediate characters:
grep "one.*three" columns4.txt
one two three
You can invert the preceding result by using the -v switch, as shown here:
grep -v "one.*three" columns4.txt
123 ONE TWO
456 three four
ONE TWO THREE FOUR
five 123 six
four five
The following command finds all lines that contain the strings one and three with any number of intermediate characters, where the match involves a case-insensitive comparison:
grep -i "one.*three" columns4.txt
ONE TWO THREE FOUR
one two three
You can invert the preceding result by using the -v switch, as shown here:
grep -iv "one.*three" columns4.txt
123 ONE TWO
456 three four
five 123 six
four five
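The four variations of the one.*three search can be run together on a copy of columns4.txt. The following is a minimal sketch that assumes single-space separators in Listing 1.2; the pattern is quoted so that the shell does not treat the * as a filename wildcard.

```shell
# Recreate columns4.txt (Listing 1.2), assuming single-space separators.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' '123 ONE TWO' '456 three four' 'ONE TWO THREE FOUR' \
    'five 123 six' 'one two three' 'four five' > columns4.txt

plain=$(grep 'one.*three' columns4.txt)             # case-sensitive
ignore_case=$(grep -i 'one.*three' columns4.txt)    # case-insensitive
inverted=$(grep -vc 'one.*three' columns4.txt)      # count of other lines
inverted_ci=$(grep -iv 'one.*three' columns4.txt)   # neither spelling

echo "$plain"
echo "$ignore_case"
echo "Non-matching lines: $inverted"
echo "$inverted_ci"
```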
Sometimes you need to search a file for the presence of either of two strings. For example, the following command finds the files that contain start or end:
grep -l 'start\|end' *
ABC2.txt
ABC4.sh
abc3.sh
Later in the chapter, you will see how to find files that contain a pair of strings via the grep and xargs commands.
Character Classes and the grep Command
This section contains some simple one-line commands that combine the grep command with character classes. Note that a character class name such as [:alpha:] must itself appear inside a bracket expression, which is why the patterns use double brackets.
echo "abc" | grep '[[:alpha:]]'
abc
echo "123" | grep '[[:alpha:]]'
(returns nothing, no match)
echo "abc123" | grep '[[:alpha:]]'
abc123
echo "abc" | grep '[[:alnum:]]'
abc
echo "123" | grep '[[:alnum:]]'
123
echo "abc123" | grep '[[:alnum:]]'
abc123
echo "abc" | grep '[0-9]'
(returns nothing, no match)
echo "123" | grep '[0-9]'
123
echo "abc123" | grep '[0-9]'
abc123
echo "abc123" | grep -w '[0-9]'
(returns nothing, no match)
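Character class names such as [:alpha:] and [:alnum:] are written inside a bracket expression, as in [[:alpha:]]. The following is a minimal sketch that wraps the one-line tests in a hypothetical helper function named check, which reports whether a string matches a pattern; the last test string is an added example that is not taken from the text.

```shell
# check TEXT PATTERN [GREP-OPTIONS...] prints "match" or "no match".
check() {
    text=$1
    pattern=$2
    shift 2
    if printf '%s\n' "$text" | grep -q "$@" -e "$pattern"; then
        echo "match"
    else
        echo "no match"
    fi
}

alpha_abc=$(check 'abc' '[[:alpha:]]')        # letters present
alpha_123=$(check '123' '[[:alpha:]]')        # no letters
alnum_123=$(check '123' '[[:alnum:]]')        # digits are alphanumeric
digit_mixed=$(check 'abc123' '[0-9]')         # a digit anywhere
word_mixed=$(check 'abc123' '[0-9]' -w)       # digit is not a whole word
word_alone=$(check 'abc 1 def' '[0-9]' -w)    # a digit standing alone

echo "$alpha_abc / $alpha_123 / $alnum_123"
echo "$digit_mixed / $word_mixed / $word_alone"
```

With -w, a single digit matches only when it is delimited as a word of its own, which is why abc123 fails the test and the string with a standalone 1 passes.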
Working with the -c Option in grep
Consider a scenario in which a directory (such as a log directory) has files created by an outside program. Your task is to write a shell script that determines which (if any) of the files contain two occurrences of a string, after which additional processing is performed on the matching files (e.g., use email to send log files containing two or more error messages to a system administrator for investigation).
One solution involves the -c option for grep, followed by additional invocations of the grep command.
The command snippets in this section assume the following data files whose contents are shown below.
The file hello1.txt contains the