Linux Commands – grep

Hello Everyone

Welcome to CloudAffaire and this is Debjeet.

In the last blog post, we discussed the find command in Linux, which is used to search for files in a directory hierarchy.

https://cloudaffaire.com/linux-commands-find/

In this blog post, we will discuss the grep command in Linux. The grep command searches input files for lines that match a PATTERN. It reads the input line by line, scans each line from left to right, and prints every line that contains a match. By default, PATTERN is case sensitive and is interpreted as a basic regular expression.

Regular Expression:

A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions. Regular expressions have several meta-characters that you can use to formulate the search PATTERN. Below are the most commonly used meta-characters:

  • . Period matches any single character except a line break.
  • * Matches 0 or more repetitions of the preceding symbol.
  • + Matches 1 or more repetitions of the preceding symbol.
  • ? Makes the preceding symbol optional.
  • ^ Matches the beginning of the input (in grep, the beginning of a line).
  • $ Matches the end of the input (in grep, the end of a line).
  • [ ] Matches any character contained between the square brackets.
  • [^ ] Matches any character that is not contained between the square brackets.
  • {n,m} Matches at least “n” but not more than “m” repetitions of the preceding symbol.
  • (xyz) Matches the characters xyz in that exact order.
  • | Matches either the characters before or the characters after the symbol.
  • \ This allows you to match reserved characters [ ] ( ) { } . * + ? ^ $ \ |

Note: By default, grep interprets PATTERN as a basic regular expression, in which ?, +, {, |, (, and ) are literal characters. To use them as meta-characters, either write the backslashed versions \?, \+, \{, \|, \(, and \), or use the grep -E option (or the egrep command, which recent GNU grep releases mark as obsolescent) and write them unescaped.
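The difference between the two syntaxes can be sketched as below. This is a minimal example, assuming GNU grep (the backslashed \? is a GNU extension to basic regular expressions); the sample words and variable names are made up for illustration.

```shell
# Hypothetical sample data in a temporary file.
sample=$(mktemp)
printf '%s\n' 'color' 'colour' 'colr' 'cat' 'dog' > "$sample"

# Basic regular expression (default): "?" must be backslashed.
bre_out=$(grep 'colou\?r' "$sample")

# Extended regular expression: "?" is a meta-character without the backslash.
ere_out=$(grep -E 'colou?r' "$sample")

# Alternation with "|" in extended mode.
alt_out=$(grep -E 'cat|dog' "$sample")

printf '%s\n' "$bre_out" "$ere_out" "$alt_out"
rm -f "$sample"
```

Both of the first two commands select the lines color and colour; the third selects cat and dog.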

You can refer to the GitHub repository below for more details on regular expressions:

https://github.com/ziishaned/learn-regex

Linux Commands – grep:

Search Pattern:

You pass grep a search PATTERN and one or more target files. grep searches each line of the input from left to right and prints every line that matches the PATTERN. You can use regular expressions to formulate your PATTERN.
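As a minimal sketch of a basic search, the example below writes a few made-up log lines to a temporary file and searches them, first with a plain string and then with an anchored pattern. The file contents and variable names are illustrative only.

```shell
# Hypothetical log file.
log=$(mktemp)
printf '%s\n' 'INFO service started' 'ERROR disk full' \
  'INFO request served' 'ERROR timeout' > "$log"

# Print every line that contains the string ERROR.
errors=$(grep 'ERROR' "$log")

# Print every line that starts with INFO (^ anchors to the line start).
infos=$(grep '^INFO' "$log")

printf '%s\n' "$errors" "$infos"
rm -f "$log"
```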

Search Options:

You can use grep -E or --extended-regexp options to interpret PATTERN as an extended regular expression. By default, grep interprets PATTERN as a basic regular expression, so you either have to use the backslashed versions \?, \+, \{, \|, \(, and \) of the meta-characters, or use the grep -E option (or the egrep command) to write ?, +, {, |, (, and ) unescaped.

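You can use grep -e PATTERN or --regexp=PATTERN options to specify the search pattern. Repeat the option to search for multiple patterns. This option also protects a pattern that begins with a hyphen (-).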
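A short sketch of searching for more than one pattern with repeated -e options, using made-up sample lines:

```shell
# Hypothetical sample data.
menu=$(mktemp)
printf '%s\n' 'apple pie' 'banana split' 'cherry tart' > "$menu"

# A line is selected if it matches any of the given patterns.
multi=$(grep -e 'apple' -e 'cherry' "$menu")

# -e also lets a pattern start with a hyphen without being read as an option.
printf '%s\n' '-v is a flag' 'plain text' > "$menu"
dash=$(grep -e '-v' "$menu")

printf '%s\n' "$multi" "$dash"
rm -f "$menu"
```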

You can use grep -F or --fixed-strings options to interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched. Older GNU grep releases also accepted --fixed-regexp as an obsolete synonym.
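The effect of -F is easiest to see with a pattern that contains meta-characters. The sketch below uses invented sample lines:

```shell
# Hypothetical sample data containing regex meta-characters.
prices=$(mktemp)
printf '%s\n' 'price: 3.50' 'price: 3x50' 'a*b' > "$prices"

# As a regular expression, "." matches any character, so both price lines match.
regex_out=$(grep '3.50' "$prices")

# As a fixed string, only the literal text 3.50 matches.
fixed_out=$(grep -F '3.50' "$prices")

# The literal string a*b can be searched without escaping the asterisk.
star_out=$(grep -F 'a*b' "$prices")

printf '%s\n' "$regex_out" "$fixed_out" "$star_out"
rm -f "$prices"
```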

You can use grep -G or --basic-regexp options to interpret PATTERN as a basic regular expression. This is the default behavior of the grep command.

You can use grep -P or --perl-regexp options to interpret PATTERN as a Perl regular expression. Some meta-characters, for example ‘\d’ which matches a digit, are Perl-specific and can only be used with the -P option.

You can use grep -f FILE or --file=FILE options to obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing.
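A minimal sketch of reading patterns from a file; the addresses and file names are invented for illustration:

```shell
# Hypothetical data file and pattern file.
data=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'alice@example.com' 'bob@example.org' 'carol@example.net' > "$data"
printf '%s\n' 'alice' 'carol' > "$patterns"

# Each line of the pattern file is one pattern; any match selects the line.
from_file=$(grep -f "$patterns" "$data")

printf '%s\n' "$from_file"
rm -f "$data" "$patterns"
```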

You can use grep -i or --ignore-case options to ignore case distinctions in both the PATTERN and the input files. By default, the grep search PATTERN is case sensitive.
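The following sketch compares the default case-sensitive search with -i on a few sample words:

```shell
# Hypothetical sample data.
names=$(mktemp)
printf '%s\n' 'Linux' 'LINUX' 'linux' 'Unix' > "$names"

# Default: only the exact lowercase spelling matches.
sensitive=$(grep 'linux' "$names")

# With -i, all three spellings of "linux" match.
insensitive=$(grep -i 'linux' "$names")

printf '%s\n' "$sensitive" "$insensitive"
rm -f "$names"
```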

You can use grep -w or --word-regexp options to select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word-constituent character. Similarly, it must be either at the end of the line or followed by a non-word-constituent character. Word-constituent characters are letters, digits, and the underscore.

You can use grep -x or --line-regexp options to select only those matches that exactly match the whole line.
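The difference between a plain search, -w, and -x can be sketched with one pattern and a few invented lines:

```shell
# Hypothetical sample data.
words=$(mktemp)
printf '%s\n' 'cat' 'concatenate' 'the cat sat' 'cat_food' > "$words"

# Plain search: every line containing the substring "cat".
plain=$(grep 'cat' "$words")

# -w: "cat" must stand as a whole word; the underscore in cat_food
# is a word-constituent character, so that line is not selected.
word=$(grep -w 'cat' "$words")

# -x: the whole line must be exactly "cat".
line=$(grep -x 'cat' "$words")

printf '%s\n' "$plain" '---' "$word" '---' "$line"
rm -f "$words"
```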

You can use grep -s or --no-messages options to suppress error messages about non-existent or unreadable files.

You can use grep -q or --quiet or --silent options to not write anything to standard output. This option makes grep exit immediately with zero status if any match is found, even if an error was detected.
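Because -q prints nothing, it is typically used for its exit status in a condition. A small sketch with an invented user list:

```shell
# Hypothetical user list.
users=$(mktemp)
printf '%s\n' 'root:x:0' 'daemon:x:1' > "$users"

# Exit status 0 means at least one line matched.
if grep -q '^root:' "$users"; then found=yes; else found=no; fi

# A non-zero exit status means no line matched.
if grep -q '^nobody:' "$users"; then missing=yes; else missing=no; fi

# Nothing is written to standard output, even on a match.
quiet_out=$(grep -q '^root:' "$users")

echo "root present: $found, nobody present: $missing"
rm -f "$users"
```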

You can use grep -v or --invert-match options to invert the sense of matching, to select non-matching lines.

You can use grep -n or --line-number options to prefix each line of output with the 1-based line number within its input file.

You can use grep -o or --only-matching options to print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
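The three output-related options above can be sketched on a small, made-up configuration file:

```shell
# Hypothetical configuration file.
conf=$(mktemp)
printf '%s\n' '# comment' 'port=8080' 'host=10.0.0.1' '# another' > "$conf"

# -v: drop comment lines by selecting the lines that do NOT start with "#".
no_comments=$(grep -v '^#' "$conf")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'port' "$conf")

# -o: print each run of digits on its own output line.
digits=$(grep -o '[0-9][0-9]*' "$conf")

printf '%s\n' "$no_comments" "$numbered" "$digits"
rm -f "$conf"
```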

You can use grep -r or --recursive options to read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -d recurse option.

You can use grep -R or --dereference-recursive options, which is similar to the -r option but follows all symbolic links.
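A minimal sketch of a recursive search over a small, invented directory tree. The output order of a recursive search depends on the file system, so the result is sorted here:

```shell
# Hypothetical directory tree.
tree=$(mktemp -d)
mkdir -p "$tree/src/sub"
printf 'TODO: fix\n'  > "$tree/src/a.txt"
printf 'done\n'       > "$tree/src/sub/b.txt"
printf 'TODO: test\n' > "$tree/src/sub/c.txt"

# -r descends into src and prefixes each match with the file path.
recursive=$(cd "$tree" && grep -r 'TODO' src | LC_ALL=C sort)

printf '%s\n' "$recursive"
rm -rf "$tree"
```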

You can use grep -d ACTION or --directories=ACTION options, which tell grep that if an input file is a directory, it should use ACTION to process it. By default, ACTION is read, i.e., read directories just as if they were ordinary files. If ACTION is skip, silently skip directories. If ACTION is recurse, read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -r option.

You can use grep -D ACTION or --devices=ACTION options, which tell grep that if an input file is a device, FIFO or socket, it should use ACTION to process it. By default, ACTION is read, which means that devices are read just as if they were ordinary files. If ACTION is skip, devices are silently skipped.

You can use grep -l or --files-with-matches options to suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.

You can use grep -L or --files-without-match options to suppress normal output; instead print the name of each input file from which no output would normally have been printed. The scanning will stop on the first match.
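The -l and -L options can be sketched with two invented log files, one of which contains the search string:

```shell
# Hypothetical log files.
logs=$(mktemp -d)
printf 'error here\n' > "$logs/one.log"
printf 'all good\n'   > "$logs/two.log"

# -l: names of the files that contain a match.
with_match=$(cd "$logs" && grep -l 'error' one.log two.log)

# -L: names of the files that contain no match.
without_match=$(cd "$logs" && grep -L 'error' one.log two.log)

echo "with: $with_match, without: $without_match"
rm -rf "$logs"
```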

You can use grep -H or --with-filename options to print the file name for each match. This is the default when there is more than one file to search.

You can use grep -h or --no-filename options to suppress the prefixing of filenames on output. This is the default when there is only one file (or only standard input) to search.
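The defaults and the two overriding options can be compared side by side. The file names below are made up for illustration:

```shell
# Hypothetical files with identical content.
files=$(mktemp -d)
printf 'alpha\n' > "$files/one.txt"
printf 'alpha\n' > "$files/two.txt"

# One file: no file name prefix by default, -H forces it.
single_default=$(cd "$files" && grep 'alpha' one.txt)
single_named=$(cd "$files" && grep -H 'alpha' one.txt)

# Several files: prefix by default, -h suppresses it.
multi_default=$(cd "$files" && grep 'alpha' one.txt two.txt)
multi_bare=$(cd "$files" && grep -h 'alpha' one.txt two.txt)

printf '%s\n' "$single_default" "$single_named" "$multi_default" "$multi_bare"
rm -rf "$files"
```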

You can use grep -m NUM or --max-count=NUM options to stop reading a file after NUM matching lines. If the input is standard input from a regular file, and NUM matching lines are output, grep ensures that the standard input is positioned to just after the last matching line before exiting, regardless of the presence of trailing context lines. This enables a calling process to resume a search. When grep stops after NUM matching lines, it outputs any trailing context lines. When the -c or --count option is also used, grep does not output a count greater than NUM. When the -v or --invert-match option is also used, grep stops after outputting NUM nonmatching lines.

You can use grep -c or --count options to suppress normal output; instead print a count of matching lines for each input file.
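A short sketch of -c and -m, alone and combined, on invented sample lines:

```shell
# Hypothetical sample data with three matching lines.
events=$(mktemp)
printf '%s\n' 'ERROR 1' 'ok' 'ERROR 2' 'ERROR 3' > "$events"

# -c: count the matching lines instead of printing them.
count=$(grep -c 'ERROR' "$events")

# -m 2: stop after the first two matching lines.
first_two=$(grep -m 2 'ERROR' "$events")

# Combined: the reported count never exceeds the -m limit.
capped=$(grep -c -m 2 'ERROR' "$events")

echo "count=$count capped=$capped"
printf '%s\n' "$first_two"
rm -f "$events"
```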

You can use grep --include=FILE_PATTERN option to search only files whose base name matches FILE_PATTERN. A FILE_PATTERN can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.

You can use grep --exclude=FILE_PATTERN option to skip files whose base name matches FILE_PATTERN. A FILE_PATTERN can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.

You can use grep --exclude-from=FILE option to skip files whose base name matches any of the file-name PATTERNs read from FILE. A PATTERN can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.

You can use grep --exclude-dir=DIR option to exclude directories matching the pattern DIR from recursive searches.
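The file-selection options are mostly useful together with -r. The sketch below assumes GNU grep and an invented project layout; results are sorted because the traversal order is not guaranteed. The wildcard patterns are quoted so that the shell passes them to grep unexpanded.

```shell
# Hypothetical project tree.
root=$(mktemp -d)
mkdir -p "$root/proj/build"
printf 'main TODO\n'  > "$root/proj/main.c"
printf 'notes TODO\n' > "$root/proj/notes.md"
printf 'gen TODO\n'   > "$root/proj/build/gen.c"

# Search only files whose base name ends in .c.
only_c=$(cd "$root" && grep -r --include='*.c' 'TODO' proj | LC_ALL=C sort)

# Skip files whose base name ends in .md.
no_md=$(cd "$root" && grep -r --exclude='*.md' 'TODO' proj | LC_ALL=C sort)

# Skip the build directory entirely.
no_build=$(cd "$root" && grep -r --exclude-dir='build' 'TODO' proj | LC_ALL=C sort)

printf '%s\n' "$only_c" '---' "$no_md" '---' "$no_build"
rm -rf "$root"
```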

You can use grep -B NUM or --before-context=NUM options to print NUM lines of leading context before matching lines. Places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.

You can use grep -A NUM or --after-context=NUM options to print NUM lines of trailing context after matching lines. Places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.

You can use grep -C NUM or -NUM or --context=NUM options to print NUM lines of output context. Places a line containing a group separator (described under --group-separator) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
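The three context options can be sketched on a five-line sample file with a single match in the middle:

```shell
# Hypothetical sample data.
text=$(mktemp)
printf '%s\n' 'line1' 'line2' 'MATCH' 'line4' 'line5' > "$text"

# One line of context before the match.
before=$(grep -B 1 'MATCH' "$text")

# One line of context after the match.
after=$(grep -A 1 'MATCH' "$text")

# One line of context on both sides.
around=$(grep -C 1 'MATCH' "$text")

printf '%s\n' "$before" '---' "$after" '---' "$around"
rm -f "$text"
```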

You can use grep --group-separator=SEP option to use SEP as a group separator. By default, SEP is a double hyphen (--).
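The separator only appears when context output contains groups that are not adjacent in the input. A small sketch, assuming GNU grep and invented sample lines:

```shell
# Hypothetical sample data: two matches separated by unrelated lines.
gaps=$(mktemp)
printf '%s\n' 'MATCH a' 'x' 'y' 'z' 'MATCH b' > "$gaps"

# Default separator between the two groups of output.
default_sep=$(grep -A 1 'MATCH' "$gaps")

# Custom separator.
custom_sep=$(grep -A 1 --group-separator='====' 'MATCH' "$gaps")

printf '%s\n' "$default_sep" "$custom_sep"
rm -f "$gaps"
```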

You can use grep --color[=WHEN] or --colour[=WHEN] options to surround the matched (non-empty) strings, matching lines, context lines, file names, line numbers, byte offsets, and separators (for fields and groups of context lines) with escape sequences to display them in color on the terminal. The colors are defined by the environment variable GREP_COLORS. The deprecated environment variable GREP_COLOR is still supported, but its setting does not have priority. WHEN is never, always, or auto.

Hope you have enjoyed this article. In the next blog post, we will discuss awk command in Linux.

 
