Linux Commands – grep
Hello Everyone
Welcome to CloudAffaire and this is Debjeet.
In the last blog post, we discussed the find command in Linux, which is used to search for files in a directory hierarchy.
https://cloudaffaire.com/linux-commands-find/
In this blog post, we will discuss the grep command in Linux. The grep command searches its input for lines that match a PATTERN. It reads the given file line by line, scans each line from left to right, and prints the whole line if any part of it matches. By default, PATTERN is case sensitive and is interpreted as a basic regular expression.
Regular Expression:
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions. A regular expression has several meta-characters that you can use to formulate the search PATTERN. Below are the most commonly used meta-characters:
- . Period matches any single character except a line break.
- * Matches 0 or more repetitions of the preceding symbol.
- + Matches 1 or more repetitions of the preceding symbol.
- ? Makes the preceding symbol optional.
- ^ Matches the beginning of the line.
- $ Matches the end of the line.
- [ ] Matches any character contained between the square brackets.
- [^ ] Matches any character that is not contained between the square brackets.
- {n,m} Matches at least “n” but not more than “m” repetitions of the preceding symbol.
- (xyz) Matches the characters xyz in that exact order.
- | Matches either the characters before or the characters after the symbol.
- \ This allows you to match reserved characters [ ] ( ) { } . * + ? ^ $ \ |
Note: By default, grep supports only basic regular expressions. In a basic regular expression you have to use the backslashed versions \?, \+, \{, \|, \(, and \) of these meta-characters; alternatively, use the grep -E option or the egrep command for the plain versions ?, +, {, |, (, and ).
Refer to the GitHub repository below for more details on regular expressions:
https://github.com/ziishaned/learn-regex
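The difference between the basic and extended syntax described in the note above can be seen with a small sketch. It assumes GNU grep (the backslashed \+ is a GNU extension to basic regular expressions) and uses a hypothetical scratch file named sample that is not part of this article's demo data.

```shell
# Scratch file with three hypothetical sample lines
tmp=$(mktemp -d)
printf '%s\n' 'aaa' 'a+b' 'xyz' > "$tmp/sample"

# Basic syntax (default): an unescaped + is an ordinary character,
# so this looks for the two-character string "a+"
bre_literal=$(grep 'a+' "$tmp/sample")

# Basic syntax with the backslashed operator: one or more "a"
bre_escaped=$(grep 'a\+' "$tmp/sample")

# Extended syntax: + is the repetition operator without a backslash
ere=$(grep -E 'a+' "$tmp/sample")

echo "basic a+    : $bre_literal"
echo "basic a\\+   :" $bre_escaped
echo "extended a+ :" $ere

rm -rf "$tmp"
```

The first search selects only the line that contains a literal plus sign, while the other two select every line that contains at least one "a".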
Linux Commands – grep:
Search Pattern:
You pass grep a search PATTERN and the target file to search. grep scans each line of the file from left to right and prints every line that matches the PATTERN. grep supports regular expressions, which you can use to formulate your PATTERN.
###########################
## Linux Commands | grep ##
###########################
## Prerequisites: One Unix/Linux/POSIX-compliant operating system with bash shell

##-----
## grep
##-----
## grep [OPTION]... PATTERN [FILE]...

## -------
## PATTERN
## -------

## create some files and directories used in this demo
cd
mkdir mydir && cd mydir
echo "The cat sat on the mat watching the rat" > data
echo "aaaa abba baab bbbb aabb abcd" >> data
echo "0123456789" >> data
echo "abcdefghij" >> data
echo "ABCDEFGHIJ" >> data
echo "1 22 333 4444 55555 666666 7777777 88888888 999999999" >> data
echo "AWS comes first" >> data
echo "aws is awsome" >> data
echo "GCP comes next" >> data
echo "hello world" >> data
echo -e "POSITIVE kind\nCOMPARATIVE kinder\nSUPERLATIVE kindest" >> data
mkdir -p mydir1/mydir2/mydir3
echo "hello" > mydir1/myfile1
echo "hello" > mydir1/mydir2/myfile2
ln -s ../myfile1 mydir1/mydir2/mylink1
cp data mydir1/mydir2/mydir3

## Exact Match
grep "The" data     ## returns the line containing "The"
grep "the" data     ## returns the line containing "the"
grep "RRAT" data    ## returns empty, no match found

## . (dot): Period matches any single character except a line break.
grep "baa." data      ## matches "baa" plus any one character ("baab")
grep "aa.." data      ## matches "aa" plus any two characters ("aaaa")
grep ".bb." data      ## matches "bb" with any one character on each side ("abba")
grep -w ".bb." data   ## same, but the match must form a whole word ("abba", "bbbb")
## -w option restricts matches to whole words

## * (star): Matches 0 or more repetitions of the preceding symbol.
grep ".*" data        ## returns everything
grep "bb*" data       ## returns lines containing one or more "b"
grep "kindx*" data    ## returns lines containing "kind" followed by zero or more "x"
grep "x*TIVE" data    ## returns lines containing "TIVE" preceded by zero or more "x"

## + (plus): Matches 1 or more repetitions of the preceding symbol.
grep "aa\+" data      ## returns lines containing "a" followed by one or more "a"
grep -E "aa+" data    ## "\" escape of + not required with -E
egrep "aa+" data      ## "\" escape of + not required for egrep

## [] (square brackets): Character class.
## [] Matches any character contained between the square brackets.
## [a-z] Matches any character from a to z
## [0-9] Matches any digit from 0 to 9
## [^] Negates the set: matches any character not contained between the square brackets.
grep "[cmrs]at" data       ## returns lines containing cat or mat or rat or sat
grep "[0-9]" data          ## returns lines containing a digit
grep "[0-9A-Z]" data       ## returns lines containing a digit or a capital letter
grep "[0-9A-Za-z]" data    ## returns lines containing any alphanumeric character
grep "[^A-Z]" data         ## returns lines containing at least one character that is not a capital letter

## ? (question mark): Makes the preceding symbol optional.
grep "c\?at" data    ## returns lines containing "at" or "cat"
grep "T\?he" data    ## returns lines containing "he" or "The"

## {n,m} (curly braces): Matches at least "n"
## but not more than "m" repetitions of the preceding symbol.
grep "[0-9]\{2\}" data        ## returns lines containing at least two consecutive digits
grep "[0-9]\{5\}" data        ## returns lines containing at least five consecutive digits
grep -w "[0-9]\{3,6\}" data   ## returns lines containing a whole word of three to six digits

## (xyz) (parentheses): Character group. Matches the characters xyz in that exact order.
grep "\(345\)" data               ## returns lines containing "345"
grep "\(c\|s\|m\|r\)at" data      ## returns lines containing cat, sat, mat or rat
grep -P "(?:c|s|m|r)at" data      ## same; the non-capturing group (?:...) is Perl syntax and needs -P

## | (vertical bar): Alternation (or)
grep "\(T\|t\)he\|\(c\|s\|m\|r\)at" data    ## returns lines containing The, the, cat, sat, mat or rat
grep -E "(T|t)he|(c|s|m|r)at" data          ## same

## ^ (caret): Matches at the beginning of the line.
grep "^AWS" data      ## returns lines starting with "AWS"

## $ (dollar): Matches at the end of the line.
grep "next\$" data    ## returns lines ending with "next"
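To see which part of a line an interval expression actually matches, the -o option (covered later in this post) can print just the matched text. The following is a minimal sketch, assuming GNU grep; the scratch file numbers is hypothetical and only mirrors part of the digits line from the demo data.

```shell
tmp=$(mktemp -d)
echo "1 22 333 4444 55555 666666 7777777" > "$tmp/numbers"

# Every run of three to six digits; the seven-digit run contributes
# its first six digits because matching is leftmost-longest
substrings=$(grep -o '[0-9]\{3,6\}' "$tmp/numbers")

# With -w the run must form a whole word, so 7777777 yields nothing
words=$(grep -o -w '[0-9]\{3,6\}' "$tmp/numbers")

echo "without -w:" $substrings
echo "with -w   :" $words

rm -rf "$tmp"
```

Comparing the two results shows why the demo above pairs \{3,6\} with -w: without it, a longer number still matches through one of its substrings.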
Search Options:
You can use the grep -E or --extended-regexp option to interpret PATTERN as an extended regular expression. By default, grep supports only basic regular expressions, so you have to use the backslashed versions \?, \+, \{, \|, \(, and \) of these meta-characters, or use grep -E (or the egrep command) for the plain versions ?, +, {, |, (, and ).
## -------
## OPTIONS
## -------
## grep -E or --extended-regexp options
grep "(123456789)" data       ## returns empty: in a basic regular expression ( and ) are literal characters
grep "\(123456789\)" data     ## returns match, or use -E option with grep or use egrep
grep -E "(123456789)" data    ## returns match
egrep "(123456789)" data      ## returns match
You can use the grep -e PATTERN or --regexp=PATTERN option to specify multiple search patterns; a line is selected if it matches any of them.
## grep -e PATTERN, --regexp=PATTERN options
grep "cat mat rat" data                 ## returns empty, the whole string is treated as one pattern
grep -e "cat" -e "mat" -e "rat" data    ## returns match
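The -e option is also handy when a pattern starts with a hyphen, because grep would otherwise read it as an option. A short sketch under that assumption; the file usage and its contents are made up for illustration.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'run with -v for verbose output' 'run quietly' > "$tmp/usage"

# Without -e, grep would treat "-v" as the invert-match option
dash_match=$(grep -e '-v' "$tmp/usage")

# Several -e patterns act as a logical OR; -c counts the selected lines
either=$(grep -c -e 'verbose' -e 'quietly' "$tmp/usage")

echo "$dash_match"
echo "lines matching either pattern: $either"

rm -rf "$tmp"
```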
You can use the grep -F or --fixed-strings option to interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. (--fixed-regexp is an obsolete alias for the same option; prefer --fixed-strings.)
## grep -F or --fixed-strings or --fixed-regexp options
grep -e "AWS" -e "GCP" data    ## returns match
grep -F "AWS
GCP" data                      ## same result: the two fixed strings are separated by a newline
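Fixed-string matching matters most when the search text contains regular expression meta-characters. The sketch below compares the two interpretations on a hypothetical file named prices.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'price: 1.50' 'price: 1250' > "$tmp/prices"

# As a regular expression the dot matches any character,
# so both "1.5" and "125" are hits
regex_count=$(grep -c '1.5' "$tmp/prices")

# With -F the dot is just a dot
fixed_count=$(grep -c -F '1.5' "$tmp/prices")

echo "regex matches: $regex_count, fixed-string matches: $fixed_count"

rm -rf "$tmp"
```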
You can use the grep -G or --basic-regexp option to interpret PATTERN as a basic regular expression. This is the default behavior of the grep command.
## grep -G or --basic-regexp options
grep "(123456789)" data         ## returns empty
grep -G "(123456789)" data      ## same
grep -G "\(123456789\)" data    ## returns match
You can use the grep -P or --perl-regexp option to interpret PATTERN as a Perl-compatible regular expression. Some meta-characters, for example \d, which matches a digit, are Perl specific and can only be used with the -P option.
## grep -P or --perl-regexp options
grep "\d" data       ## \d is not a basic regular expression escape; GNU grep matches the letter "d" (newer releases may also print a warning)
grep -P "\d" data    ## returns lines containing any digit
You can use the grep -f FILE or --file=FILE option to obtain patterns from FILE, one per line. An empty file contains zero patterns and therefore matches nothing.
## grep -f FILE or --file=FILE options
echo -e "AWS\nGCP" > pattern    ## create a file with patterns
grep -f pattern data            ## returns match
You can use the grep -i or --ignore-case option to ignore case distinctions in both the PATTERN and the input files. By default, the grep search PATTERN is case sensitive.
## grep -i or --ignore-case options
grep "the" data       ## matches only the lowercase "the"
grep -i "the" data    ## matches both "The" and "the"
You can use the grep -w or --word-regexp option to select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Similarly, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and the underscore.
## grep -w or --word-regexp options
grep "aws" data       ## matches "aws" both as a word and inside "awsome"
grep -w "aws" data    ## matches only the whole word "aws"
You can use the grep -x or --line-regexp option to select only those matches that exactly match the whole line.
## grep -x or --line-regexp options
grep -x "aws" data              ## returns empty
grep -x "aws is awsome" data    ## returns match, entire line needs to be matched
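One common use of whole-line matching is checking whether an exact entry exists in a list. The sketch below combines -x with -F so the entry is taken literally; the file users and the names in it are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'alice' 'alice-admin' 'bob' > "$tmp/users"

# -x requires the whole line to match, -F keeps the name literal
exact=$(grep -c -x -F 'alice' "$tmp/users")

# Without -x the name also matches inside "alice-admin"
loose=$(grep -c -F 'alice' "$tmp/users")

echo "exact entries: $exact, lines containing the name: $loose"

rm -rf "$tmp"
```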
You can use the grep -s or --no-messages option to suppress error messages about non-existent or unreadable files.
## grep -s or --no-messages options
grep "hello" newdata       ## returns an error message
grep -s "hello" newdata    ## returns empty
You can use the grep -q or --quiet or --silent option to not write anything to standard output. This option makes grep exit immediately with zero status if any match is found, even if an error was detected.
## grep -q or --quiet or --silent options
grep "aws" data       ## returns match
grep -q "aws" data    ## returns empty
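Because -q prints nothing, its result is read from the exit status: 0 when a line matched, 1 when nothing matched, and 2 on an error. A minimal sketch of using it in a script, with a hypothetical two-line scratch file:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'aws is awsome' 'hello world' > "$tmp/data"

# The exit status of grep drives the if statement directly
if grep -q "aws" "$tmp/data"; then
    found="yes"
else
    found="no"
fi

# Capture the status for a pattern that is not in the file
missing_status=0
grep -q "azure" "$tmp/data" || missing_status=$?

echo "aws found: $found"
echo "exit status when nothing matches: $missing_status"

rm -rf "$tmp"
```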
You can use the grep -v or --invert-match option to invert the sense of matching, to select non-matching lines.
## grep -v or --invert-match options
grep "[^A-Z]" data       ## returns lines containing at least one character that is not a capital letter
grep -v "[^A-Z]" data    ## returns lines consisting only of capital letters
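A typical everyday use of -v is dropping comment and blank lines from a configuration file. The sketch below assumes a hypothetical file app.conf whose comments start with # in the first column.

```shell
tmp=$(mktemp -d)
printf '%s\n' '# sample settings' 'port=8080' '' '# another comment' 'host=localhost' > "$tmp/app.conf"

# Select every line that is neither a comment (^#) nor empty (^$)
active=$(grep -v -e '^#' -e '^$' "$tmp/app.conf")

echo "$active"

rm -rf "$tmp"
```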
You can use the grep -n or --line-number option to prefix each line of output with the 1-based line number within its input file.
## grep -n or --line-number options
grep -n "aws" data    ## returns the line number along with the match
You can use the grep -o or --only-matching option to print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
## grep -o or --only-matching options
grep "aws" data       ## returns the entire line
grep -o "aws" data    ## returns only the matched text, once per match ("aws" twice for this line)
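Since -o prints one output line per match, piping it to wc -l counts occurrences, whereas -c (described later) counts matching lines. A small sketch on a hypothetical scratch file:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'aws is awsome' 'AWS comes first' > "$tmp/data"

# Number of lines that contain "aws"
line_count=$(grep -c "aws" "$tmp/data")

# Number of times "aws" occurs, counting repeats on the same line
match_count=$(grep -o "aws" "$tmp/data" | wc -l | tr -d ' ')

echo "matching lines: $line_count, occurrences: $match_count"

rm -rf "$tmp"
```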
You can use the grep -r or --recursive option to read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -d recurse option.
## grep -r or --recursive options
grep -r "hello" ./mydir1    ## recursive search in mydir1 dir for any file containing "hello"
You can use the grep -R or --dereference-recursive option, which is similar to the -r option but follows all symbolic links.
## grep -R or --dereference-recursive
grep -R "hello" ./mydir1    ## recursive search in mydir1 dir for any file containing "hello"
                            ## follows symbolic links, hence four matches in the output
You can use the grep -d ACTION or --directories=ACTION option to tell grep how to process an input file that is a directory. By default, ACTION is read, i.e., read directories just as if they were ordinary files. If ACTION is skip, silently skip directories. If ACTION is recurse, read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -r option.
## grep -d or --directories=ACTION options
grep "hello" ./mydir1               ## error, grep: ./mydir1: Is a directory
grep -d read "hello" ./mydir1       ## error, grep: ./mydir1: Is a directory
grep -d skip "hello" ./mydir1       ## returns empty
grep -d recurse "hello" ./mydir1    ## returns match
You can use the grep -D ACTION or --devices=ACTION option to tell grep how to process an input file that is a device, FIFO or socket. By default, ACTION is read, which means that devices are read just as if they were ordinary files. If ACTION is skip, devices are silently skipped.
## grep -D or --devices=ACTION options
sudo grep -D skip "hello" /dev/random    ## returns empty
You can use the grep -l or --files-with-matches option to suppress normal output and instead print the name of each input file from which output would normally have been printed. Scanning of each file stops at the first match.
## grep -l or --files-with-matches options
grep -r -l "hello" ./mydir1    ## recursive search in mydir1 dir for any file containing "hello"
                               ## returns only file names containing "hello"
You can use the grep -L or --files-without-match option to suppress normal output and instead print the name of each input file from which no output would normally have been printed. Scanning of each file stops at the first match.
## grep -L or --files-without-match options
grep -r -L "hello" ./mydir1    ## recursive search in mydir1 dir for any file not containing "hello"
                               ## returns file names not containing "hello" (empty here, since every file under mydir1 contains "hello")
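In the demo tree every file contains "hello", so -L has nothing to report. The following sketch uses two hypothetical files, a.txt and b.txt, to show -l and -L side by side.

```shell
tmp=$(mktemp -d)
echo "hello world" > "$tmp/a.txt"
echo "goodbye"     > "$tmp/b.txt"

# -l lists files with at least one match, -L lists files with none
with_match=$(cd "$tmp" && grep -l "hello" a.txt b.txt)
without_match=$(cd "$tmp" && grep -L "hello" a.txt b.txt)

echo "contains hello: $with_match"
echo "does not contain hello: $without_match"

rm -rf "$tmp"
```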
You can use the grep -H or --with-filename option to print the file name for each match. This is the default when there is more than one file to search.
## grep -H or --with-filename options
grep "aws" data       ## returns match
grep -H "aws" data    ## returns match with filename
You can use the grep -h or --no-filename option to suppress the prefixing of filenames on output. This is the default when there is only one file (or only standard input) to search.
## grep -h or --no-filename options
grep -r "hello" ./mydir1       ## returns match with filename
grep -r -h "hello" ./mydir1    ## returns match
You can use the grep -m NUM or --max-count=NUM option to stop reading a file after NUM matching lines. If the input is standard input from a regular file, and NUM matching lines are output, grep ensures that the standard input is positioned to just after the last matching line before exiting, regardless of the presence of trailing context lines. This enables a calling process to resume a search. When grep stops after NUM matching lines, it outputs any trailing context lines. When the -c or --count option is also used, grep does not output a count greater than NUM. When the -v or --invert-match option is also used, grep stops after outputting NUM non-matching lines.
## grep -m or --max-count=NUM options
grep "kind" data         ## returns all lines having "kind"
grep -m 2 "kind" data    ## returns the first two lines having "kind"
You can use the grep -c or --count option to suppress normal output and instead print a count of matching lines for each input file.
## grep -c or --count options
grep -c "kind" data    ## returns 3
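The interaction between -m and -c mentioned above (the count never exceeds NUM) can be checked with a short sketch. The scratch file reuses the three "kind" lines from the demo data.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'POSITIVE kind' 'COMPARATIVE kinder' 'SUPERLATIVE kindest' > "$tmp/data"

# All matching lines
total=$(grep -c "kind" "$tmp/data")

# With -m 2 grep stops after two matching lines, so the count is capped
capped=$(grep -c -m 2 "kind" "$tmp/data")

echo "total: $total, capped by -m 2: $capped"

rm -rf "$tmp"
```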
You can use the grep --include=FILE_PATTERN option to search only files whose base name matches FILE_PATTERN. A FILE_PATTERN can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.
## grep --include=FILE_PATTERN option
grep -R "hello" ./mydir1                      ## returns all files having match
grep -R --include=myfile1 "hello" ./mydir1    ## returns myfile1 if there is a match
You can use the grep --exclude=FILE_PATTERN option to skip files whose base name matches FILE_PATTERN. A FILE_PATTERN can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.
## grep --exclude=FILE_PATTERN option
grep -R "hello" ./mydir1                      ## returns all files having match
grep -R --exclude=myfile1 "hello" ./mydir1    ## returns all files excluding myfile1 having a match
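The wildcard form of FILE_PATTERN is where --include and --exclude are most useful. The sketch below assumes GNU grep and a hypothetical directory layout; note that the pattern is quoted so the shell passes it to grep unexpanded.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/logs"
echo "error: disk full" > "$tmp/logs/app.log"
echo "error: old entry" > "$tmp/logs/app.log.bak"
echo "error in notes"   > "$tmp/notes.txt"

# Search only files whose base name ends in .log
only_logs=$(cd "$tmp" && grep -r -l --include='*.log' "error" . | sort)

# Search everything except files whose base name ends in .bak
no_backups=$(cd "$tmp" && grep -r -l --exclude='*.bak' "error" . | sort)

echo "include *.log :" $only_logs
echo "exclude *.bak :" $no_backups

rm -rf "$tmp"
```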
You can use the grep --exclude-from=FILE option to skip files whose base name matches any of the file-name patterns read from FILE. A pattern can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.
## grep --exclude-from=FILE option
echo "myfile1" > pattern
grep -R --exclude-from=pattern "hello" ./mydir1    ## returns all files excluding myfile1 having a match
You can use the grep --exclude-dir=DIR option to exclude directories matching the pattern DIR from recursive searches.
## grep --exclude-dir=PATTERN option
grep -R "hello" ./mydir1                         ## returns all files having match
grep -R --exclude-dir=mydir2 "hello" ./mydir1    ## returns all files having a match, excluding everything under mydir2
You can use the grep -B NUM or --before-context=NUM option to print NUM lines of leading context before matching lines. It places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
## grep -B or --before-context=NUM options
grep "AWS" data         ## returns only matching line
grep -B 3 "AWS" data    ## returns 3 lines before matching line
You can use the grep -A NUM or --after-context=NUM option to print NUM lines of trailing context after matching lines. It places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
## grep -A or --after-context=NUM options
grep "AWS" data         ## returns only matching line
grep -A 3 "AWS" data    ## returns 3 lines after matching line
You can use the grep -C NUM or -NUM or --context=NUM option to print NUM lines of output context. It places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
## grep -C or --context=NUM options
grep "AWS" data         ## returns only matching line
grep -C 3 "AWS" data    ## returns 3 lines before and after matching line
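When context lines are combined with -n, grep marks matching lines with a colon after the line number and context lines with a hyphen, which makes it easy to tell them apart. A minimal sketch, assuming GNU grep and a hypothetical five-line scratch file:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'line one' 'line two' 'AWS comes first' 'line four' 'line five' > "$tmp/data"

# One line of context on each side, with line numbers
context=$(grep -n -C 1 "AWS" "$tmp/data")

echo "$context"

rm -rf "$tmp"
```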
You can use the grep --group-separator=SEP option to use SEP as a group separator. By default, SEP is a double hyphen (--).
## grep --group-separator=SEP option
echo "AWS stands for Amazon Web Services" >> data
grep -B 3 "AWS" data                             ## "--" used as a group separator
grep -B 3 --group-separator="...." "AWS" data    ## "...." used as a group separator
You can use the grep --color[=WHEN] or --colour[=WHEN] option to surround the matched (non-empty) strings, matching lines, context lines, file names, line numbers, byte offsets, and separators (for fields and groups of context lines) with escape sequences to display them in color on the terminal. The colors are defined by the environment variable GREP_COLORS. The deprecated environment variable GREP_COLOR is still supported, but its setting does not have priority. WHEN is never, always, or auto.
## grep --color[=WHEN] or --colour[=WHEN] options
grep --color=always "AWS" data
grep --color=auto "AWS" data
grep --color=never "AWS" data

## -------
## cleanup
## -------
cd && rm -rf mydir
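The difference between auto and always shows up when the output does not go to a terminal. The sketch below captures both into variables, assuming GNU grep; with auto no escape sequences are added, while always adds them regardless of where the output goes.

```shell
tmp=$(mktemp -d)
echo "AWS comes first" > "$tmp/data"

# Output is captured, not shown on a terminal, so auto adds no color codes
auto_out=$(grep --color=auto "AWS" "$tmp/data")

# always inserts escape sequences around the match even when captured
always_out=$(grep --color=always "AWS" "$tmp/data")

echo "auto   length: ${#auto_out}"
echo "always length: ${#always_out}"

rm -rf "$tmp"
```

This is why auto is usually the safer choice in scripts and aliases: piped or redirected output stays free of escape sequences.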
Hope you have enjoyed this article. In the next blog post, we will discuss the awk command in Linux.