Updated 9 Dec 2021

GREP — Find Regular Expressions in Files

Quick Start Guide

Program release 8.01 dated 9 Dec 2021

Copyright © 1986–2022 by Stan Brown, BrownMath.com

Summary: GREP searches named input files, or the standard input, and displays lines that match one or more patterns called regular expressions or regexes. GREP can also search binary files and display records or buffers that contain matches.

Begin with this Quick Start Guide, and then use the GREP Reference Manual for complete details on every feature, an annotated list of all messages, and more.

Why GREP? Why This GREP?

At the Windows command prompt, FIND is useful for finding a given string in one or more files. But what if you want to find the word “the” in caps or lower case, without also finding “other”, “There”, “then”, and so on? You don’t really want to search for a specific string. Rather, what you’re looking for is a regular expression or regex, namely “the” preceded and followed by something other than a letter. GREP to the rescue!

GREP takes one or more regexes, matches them against the input files, and displays the hits.
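
To make the idea concrete, here is a minimal sketch in Python. Python's re module stands in for GREP's own regex engine (an assumption: the two are not identical), and the sample lines are made up. The pattern finds “the” in caps or lower case, preceded and followed by something other than a letter.

```python
import re

# "the" not preceded or followed by a letter, in caps or lower case.
# Python's re module is only a stand-in here; this is not GREP's engine.
WORD_THE = re.compile(r"(?<![a-z])the(?![a-z])", re.IGNORECASE)

# Hypothetical sample lines, invented for this illustration.
lines = [
    "The quick brown fox",
    "There is no other way",
    "and then we left",
    "over the hill",
]

# Keep only the lines that contain a match, as GREP does with its hits.
hits = [line for line in lines if WORD_THE.search(line)]
print(hits)
```

Only the first and last lines are hits; “There”, “other”, and “then” are skipped because a letter touches “the” in each of them.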

BrownMath.com’s GREP (formerly at Oak Road Systems) combines most features of UNIX grep, egrep, and fgrep. GREP has many other advantages over FIND besides using regular expressions. Indeed, customers have cited some of these advantages as features they couldn’t find in competing GREPs.

Getting Started

Installation and Tour

Compatibility: All DOS and Windows versions, from DOS 2.0 through Windows 11.

GREP handles text in 256-character sets like the ISO-8859 group, ANSI, and Windows-1252. It doesn’t understand multibyte Unicode characters, though you may be able to find what you’re looking for by using binary mode (/R option).

There’s no special installation process. Simply unzip the downloaded ZIP file in any convenient directory.

An interactive program tour is included as a batch program in the TOUR subfolder. After unzipping the archive, just type

        cd tour
        tour

The ZIP file includes 16-bit and 32-bit versions of the program.

There should be no need to apply any Windows compatibility settings.

You may wish to rename the executable you use, grep32.exe or grep16.exe, to the simpler grep.exe. All the examples in this Quick Start Guide assume you’ve done that. Otherwise, just substitute grep32 or grep16 wherever you see grep in the examples.

You may choose to move the actual program file somewhere else. It’s completely self-contained; you can even delete all the other files if you wish. You may wish to set your PATH variable to include the directory where you have placed the GREP executable.

The exact method for setting environment variables varies from one version of Windows to another. In general, go to System Properties or Computer » Properties, and then select Advanced System Settings. In very old Windows, or “classic” DOS, set the variable in your AUTOEXEC.BAT file.

Uninstall

There’s no special uninstall procedure; simply delete the GREP files. GREP doesn’t write any secret files or modify the Windows registry.

Command Line

Because this program helps you,
please donate at
BrownMath.com/donate.

The basic GREP command form is

        grep options regex inputfiles 

(You can also GREP from the Windows desktop, as explained in the Reference Manual.)

As with any command, you can redirect or pipe inputs or output. GREP can return a useful value in ERRORLEVEL, as explained in the Reference Manual.

Here are two simple examples. First,

        grep /I pic[t\s] \proj\*.cob 

examines every COBOL source file in the root-level PROJ directory and displays every line that contains a picture clause (“pic” followed by either “t” or a space) in caps or lower case (the /I option). Adding the /S option

        grep /I pic[t\s] \*.cob /S 

examines every COBOL source file in all directories on the current disk.
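
If you would like to see which lines a pattern like pic[t\s] picks out, the following Python sketch may help. It assumes that Python's re module treats this particular pattern the same way GREP does, with re.IGNORECASE playing the role of the /I option; the COBOL-like lines are invented.

```python
import re

# Same pattern as in the GREP example; re.IGNORECASE stands in for /I.
PICTURE = re.compile(r"pic[t\s]", re.IGNORECASE)

# Hypothetical COBOL-like lines, invented for this illustration.
lines = [
    "05 TOTAL-AMT  PIC 9(5)V99.",
    "05 cust-name  picture x(30).",
    "MOVE TOPIC-CODE TO WS-CODE.",
    "DISPLAY 'EPIC'.",
]

# A line is a hit if "pic" is followed by "t" or by white space.
hits = [line for line in lines if PICTURE.search(line)]
print(hits)
```

The third and fourth lines are not hits: in “TOPIC-CODE” and “'EPIC'” the letters “pic” are followed by a hyphen and a quote mark, not by “t” or a space.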

For a summary of operating instructions, type

        grep /? | more 

Since the help text is over 150 lines long, you might prefer to redirect it to a file for viewing:

        grep /? >grephelp.txt 

Inputs

GREP scans either named input files or the standard input — the standard input can be a named file, a pipe, or the keyboard.

Named Input Files

Named input files provide the greatest flexibility. They can be read as text or binary, and you can search subdirectory trees.

GREP32 can use long filenames; GREP16 requires short (8.3) filenames.

GREP expands any wildcards in named input files. Not only DOS-style * and ?, but UNIX-style […] can be used. For instance, "c:\My Documents\[abc]*doc" tells GREP to read every file in the indicated directory whose name starts with A, B, or C and ends with DOC (including “.DOC”). Please see Named Input Files in the Reference Manual for complete rules.
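
As a rough illustration of how a […] wildcard selects files, here is a Python sketch that uses the standard fnmatch module as a stand-in for GREP's wildcard expansion. That is an assumption: GREP's exact rules are in the Reference Manual, and the file names are hypothetical.

```python
from fnmatch import fnmatchcase

# The wildcard part of the example above: starts with A, B, or C, ends with DOC.
PATTERN = "[abc]*doc"

# Hypothetical file names, invented for this illustration.
names = ["Agenda.doc", "budget2021.DOC", "Contacts.docx", "delta.doc", "abcdoc"]

# Lower-case each name first, since Windows file names are not case sensitive.
selected = [n for n in names if fnmatchcase(n.lower(), PATTERN)]
print(selected)
```

Contacts.docx is left out because it ends with “docx”, not “doc”, and delta.doc because it starts with the wrong letter.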

You can use the /X option to exclude some files or groups of files from consideration. For instance, if you want all 2001 reports except December, you might specify something like

        grep [options] [regex] *2001* /X*dec2001* 
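
The effect of an exclusion can be pictured as a two-step filter: keep the names that match the input filespec, then drop those that match the excluded one. This Python sketch shows only that idea; select_files is a hypothetical helper, the report names are invented, and fnmatch is a stand-in for GREP's own wildcard matching.

```python
from fnmatch import fnmatchcase

def select_files(names, include, exclude):
    """Keep names that match `include` but not `exclude`, ignoring case."""
    return [
        n for n in names
        if fnmatchcase(n.lower(), include)
        and not fnmatchcase(n.lower(), exclude)
    ]

# Hypothetical report names, invented for this illustration.
names = ["rpt-jan2001.txt", "rpt-nov2001.txt", "rpt-Dec2001.txt", "rpt-jan2002.txt"]

kept = select_files(names, "*2001*", "*dec2001*")
print(kept)
```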

If you have many named input files, you may want to store the list in a file; see the /@ option.

Subdirectory Searches

If you set the /S option, GREP searches not only the files indicated on the command line, but also the same-named files in subdirectories, all the way down to the bottom of the folder tree.

For example, with the command

        grep /S regex \hazax* *.c g:\mumble\*.htm 

GREP examines all files on the entire current drive whose names start with “hazax”; then it looks at all C source files in the current directory and all subdirectories under it; finally it looks at all .htm files in directory “g:\mumble” and all subdirectories under it.

Perhaps a more realistic example: you have a document about Vandelay Industries somewhere on your disk, but you can’t remember where. You can find it this way:

        grep /S Vandelay \*
or:     grep /S Vandelay \*.* 

(Both * and *.* select all files; see Wildcard Expansion in the Reference Manual.) You might want to add the /I option if you can’t remember how “Vandelay” was capitalized.
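
If it helps to picture what a subdirectory search does, here is a rough Python sketch: walk the folder tree, pick the files whose names match the wildcard, and report the lines that match the regex. This is not how GREP is implemented; search_tree is a hypothetical helper, and the folder names and file contents are invented.

```python
import os
import re
import tempfile
from fnmatch import fnmatchcase

def search_tree(top, wildcard, pattern):
    """Yield (path, line_number, line) for every hit in files under `top`."""
    regex = re.compile(pattern)
    for folder, _subdirs, files in os.walk(top):
        for name in sorted(files):
            if not fnmatchcase(name.lower(), wildcard):
                continue
            path = os.path.join(folder, name)
            # latin-1 maps every byte to a character, so reading never fails.
            with open(path, encoding="latin-1") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")

# Build a tiny hypothetical folder tree, then search all of it.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "letters", "old"))
    with open(os.path.join(top, "letters", "old", "memo.txt"), "w") as f:
        f.write("Dear Sir,\nRe: Vandelay Industries\n")
    with open(os.path.join(top, "notes.txt"), "w") as f:
        f.write("nothing to see here\n")
    found = [(os.path.basename(p), n, text)
             for p, n, text in search_tree(top, "*", "Vandelay")]

print(found)
```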

Standard Input and Redirection

If you don’t specify any named input files, GREP takes its input from the standard input. That can mean any of these three sources: a file redirected with <, a pipe from another command, or the keyboard.

Example:

        tracert brownmath.com | grep 123

tells GREP to read the tracert command’s output and display any lines that contain the string “123”.
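
The same filtering idea, sketched in Python: read lines from a stream and keep those that contain the string. Here io.StringIO simulates piped input, the route-trace lines are invented, and filter_stream is a hypothetical helper; in a real script you would pass sys.stdin as the stream.

```python
import io

def filter_stream(stream, needle):
    """Return the lines of `stream` that contain the plain string `needle`."""
    return [line.rstrip("\n") for line in stream if needle in line]

# Simulated pipe with made-up trace output.
fake_pipe = io.StringIO(
    "  1  10 ms  192.168.1.1\n"
    "  2  25 ms  10.123.0.1\n"
    "  3  30 ms  172.16.0.9\n"
)

hits = filter_stream(fake_pipe, "123")
print(hits)
```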

Binary Files and Text Files

GREP was originally written with plain text files in mind, but you can also use it quite well with binary files like word-processing files, databases, and executable programs. GREP not only reads binary files differently, it also adjusts the display format for matches.

Windows doesn’t mark a file as text or binary; the program that reads the file just has to know. GREP “knows” files are binary when you tell it via the /R2 or /R3 option; otherwise it treats input files as text. Use the /R3 option when you don’t know any details of the internal structure of the binary file; please see Binary Files and Text Files in the Reference Manual for much more about binary files.

You can also use the /R-1 or /R-2 option to have GREP examine each file and decide whether it’s text or free-form binary; please see the /R option in the Reference Manual for details. I recommend /R-1.

Outputs

Normally, GREP displays hits on your screen. “Hits” are the text lines, binary records, or binary buffers that contain matches for the regex(es). As part of the output, GREP displays the file path and name as a header above the group of hits from that file. You can use various options to display abbreviated or expanded forms of hits or to suppress those headers, move them to the lines with the hits, or display headers even for files that had no hits.

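You can also redirect GREP’s output into a file or pipe GREP’s output to another command (even another GREP command). To redirect GREP output, follow the usual rules and put one of these at the end of the GREP command line: >file to create or overwrite a file, >>file to append to a file, or |command to pipe the output to another command.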

You can pipe or redirect output regardless of whether input was piped or redirected.

Only the hits (and file path\name headers, if present) are redirected by the above syntax. Errors and warning messages are still sent to the standard error stream. That is usually your screen, though some OSes or shell replacements let you redirect error output. For example, in 4DOS, 4NT, and TCC type help piping or help redirection for information.

The /D option lets you create extra debugging output and send it to a named file or the standard error output.

Options

List of Options

The Reference Manual gives the full description of each option.

Option  UNIX grep*    Windows FIND*  Effect
?       --help        /?             Display help for files, regexes, and options.
@                                    Take input file names from keyboard or file.
A                                    Include hidden and system files when expanding wildcards.
B                                    Display a header for every file, even if it contains no hits.
C       -c            /C             Display the hit count, not the actual hits.
D                                    Display debugging output.
E       (-E), (-w)                   Select extended regular expressions or strings, or search for a word.
F       (-f)                         Read regexes from keyboard or file.
G                                    Read variable-length text lines or paragraphs.
H       -h                           Don’t display headers (file names) in output.
I       -i            /I             Ignore case when matching.
J       -o                           Display just the part of each line that matches the regex.
K                                    Report only the first few hits.
L       -l                           List the files that contain hits, not the actual hits.
M                                    Specify character mapping and define “word”.
N       -n            /N             Show line numbers with hits.
O                                    Set output format.
P       (-A, -B, -C)                 Show context lines around matching lines.
Q       (-s)                         Suppress program logo and some or all warnings.
R       -U, (-a)                     Read and display input files as binary or text.
S       -r                           Scan files in subdirectories too.
U       (implied)                    UNIX-style output: show filespec with each hit.
V       -v            /V             Display lines that don’t contain a match.
W                                    Specify line width or binary block length.
X       -x                           Exclude specified files from scan.
Y                                    Multiple regexes must all match.
Z                                    Reset all options (recommended for batch files).
0                                    Set ERRORLEVEL to 0 if any hits were found.
1       (-v)                         Set ERRORLEVEL to 1 if any hits were found.
3                                    Set ERRORLEVEL to 3 if warnings were displayed.
* UNIX grep options are case sensitive; GREP and FIND options are not.
(An option is shown in parentheses if the GREP option’s effect is similar but not identical.)

How to Specify Options

On the command line, options can appear anywhere, before or after the regex and the input files. All options are processed before any files are read.

You have a lot of freedom about how you enter options: use a leading hyphen or slash, use upper- or lower-case letters, leave spaces between options or combine them. For instance, the following are just some of the different ways of turning on the /P3 option and /B option:

        /p3 -b    /b/P3    /p3B    -B/P3    -P3 -b 

This Quick Start Guide always uses capital letters for the options, to make it easier to distinguish letter l and figure 1.

For clarity, you should always use a hyphen or slash before the numeric /0 option, /1 option, or /3 option. Example: /E0 means the /E option with a value of 0, but /E/0 means the /E option with no value specified, followed by the /0 option.

You can also store options in an environment variable. The Reference Manual gives more information about the environment variable, including instructions for overriding a particular stored option on the command line.

Regular Expressions (Regexes)

A regular expression or regex is a pattern of characters to compare to lines, records, or buffers from one or more input files. GREP reports a hit if the input contains a match with the pattern in the regex.

A regex can be a simple text string, like mother, or something more complex. (If you want to search only for simple strings, use the /E0 option and ignore all this regex stuff.)

Regexes by Example

Example 1: If you want both the English and the American spellings of the word “grey/gray”, use

        gr[ea]y

as your regex. (See Example 5 for “colour/color”.)

Example 2: The basic regex for any word starting with “moth” is

        moth[a-z]*

which is the letters “moth” followed by any number of letters a through z. Yes, that regex does match “moth” itself: see * or + for Repetition in the Reference Manual.
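
The difference between * (zero or more) and + (one or more) is easy to check for yourself. The sketch below uses Python's re module, which is assumed to treat these two simple patterns the same way GREP does; the word list is made up.

```python
import re

# Made-up words to test against.
words = ["moth", "mother", "mothball", "month"]

# * allows zero letters after "moth", so "moth" itself matches.
star = [w for w in words if re.search(r"moth[a-z]*", w)]

# + demands at least one letter after "moth", so "moth" alone does not match.
plus = [w for w in words if re.search(r"moth[a-z]+", w)]

print(star)
print(plus)
```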

Example 3: A word in double quotes would be matched by

        \"[a-z]+\"

Read that regex as “a double quote mark, followed by one or more letters, followed by another double quote mark.” (You need the backslashes \ to tell the Windows command prompt to pass the quote marks forward to GREP. See Quotes in a Regex in the Reference Manual.)

Example 4: A U.S. local telephone number has the basic regex

        [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9] 

That signifies three digits, followed by a hyphen, followed by four digits. (You could express it more simply with an extended regex: [0-9]{3}-[0-9]{4} or even \d{3}-\d{4}.)
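
The three spellings of the telephone regex should select the same text. The Python sketch below checks that on a few invented samples; Python's re module is assumed to agree with GREP's extended regexes on these patterns.

```python
import re

BASIC = r"[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]"   # basic regex
COUNTED = r"[0-9]{3}-[0-9]{4}"                    # repetition counts
SHORTHAND = r"\d{3}-\d{4}"                        # \d for a digit

# Invented sample strings.
samples = ["Call 555-1234 today", "ext. 12-3456", "5551234", "1-800-555-0199"]

# For each pattern, list the samples that contain a match.
results = {
    pattern: [s for s in samples if re.search(pattern, s)]
    for pattern in (BASIC, COUNTED, SHORTHAND)
}
print(results[BASIC])
```

Notice that the last sample is a hit because of its final seven digits, “555-0199”. The regex describes a local number and says nothing about what comes before it.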

Example 5: Getting the American and English spellings of “color/colour” is easy with GREP32: specify an extended regex (with the /E2 option) of

        colou?r

GREP16 doesn’t support extended regexes, so you could either use colou*r (which would also match the non-words colouur, colouuuuur, etc.), or else use the /F- option and enter color and colour as two regexes.
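
Here is a small Python sketch comparing the three approaches on a made-up word list. Python's re module is a stand-in for GREP, and fullmatch is used so that each whole word is tested.

```python
import re

# Made-up words, including two non-words.
words = ["color", "colour", "colouur", "colr"]

# ? allows zero or one "u".
optional = [w for w in words if re.fullmatch(r"colou?r", w)]

# * allows any number of "u", so it also accepts "colouur".
repeated = [w for w in words if re.fullmatch(r"colou*r", w)]

# Two separate plain regexes, either of which may match.
either = [w for w in words
          if any(re.fullmatch(p, w) for p in ("color", "colour"))]

print(optional, repeated, either)
```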

Regex Language Summary

From the examples you can see that a regex is essentially a string of characters with a bunch of operators thrown in to express possibilities like “any of these characters” and “repeated”. Here’s a quick summary of the characters that have special meaning in a regex; note that some work in any regex and others only in an extended regex (/E2 option). The Reference Manual has a full description of each one.

Char  Name                    Which regexes?  Description

Characters with special meaning outside square brackets:
.     period                  any             Match any character.
*     asterisk                any             Match 0 or more occurrences of the preceding.
+     plus sign               any             Match 1 or more occurrences of the preceding.
?     question mark           extended        Match 0 or 1 occurrence of the preceding.
[     left square bracket     any             Start a character class, like [abcde] to match any one of a, b, c, d, e.
^     caret                   any             Match start of line in text mode or start of record in binary mode.
$     dollar sign             any             Match end of line in text mode or end of record in binary mode.
\     backslash               any             Treat any of the listed special characters as normal.
\     backslash               extended        (1) character types like \w for a word character;
                                              (2) simple assertions like \b for a word boundary;
                                              (3) back references to parenthesized subexpressions;
                                              (4) character encoding for odd characters like \x3c for <.
{     left brace              extended        Repetition count, like {3,} for three or more occurrences of the preceding.
|     vertical bar            extended        Alternatives, like mother|father to match “mother” or “father”.
(…)   parentheses             extended        Subexpressions, like (&nbsp;)+ to match one or more occurrences of “&nbsp;”.
      or round brackets

Characters with special meaning inside square brackets:
]     right square bracket    any             End the character class.
-     minus sign or hyphen    any             Character range, like [a-z] to match any lower-case English letter.
^     caret                   any             Negate the character class, like [^a-z] to match any character except a lower-case English letter.
\     backslash               any             Treat the next character as normal.
\     backslash               extended        Character encoding.
[:    left square bracket     extended        Introduce a named character class, like [[:punct:]0-9] for any punctuation character or a digit.
      followed by colon
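
If you want to experiment with some of these operators before running GREP, the Python sketch below exercises a handful of them. Python's re module is only a stand-in: it is assumed to agree with GREP on these particular patterns, and it does not support named classes like [[:punct:]], so those are left out. The test strings are invented.

```python
import re

# (pattern, text, whether a match is expected)
checks = [
    (r"o{3,}",         "whoooosh",       True),   # three or more "o"
    (r"o{3,}",         "whoosh",         False),  # only two "o"
    (r"mother|father", "grandfather",    True),   # either alternative
    (r"(&nbsp;)+",     "a&nbsp;&nbsp;b", True),   # repeated subexpression
    (r"^[^a-z]",       "9 lives",        True),   # starts with a non-lower-case character
    (r"^[^a-z]",       "nine lives",     False),  # starts with a lower-case letter
    (r"\x3cb\x3e",     "<b>bold</b>",    True),   # \x3c is "<" and \x3e is ">"
    (r"\.$",           "The end.",       True),   # literal period at end of line
]

results = [bool(re.search(pattern, text)) for pattern, text, _ in checks]
expected = [want for _, _, want in checks]
print(results == expected)
```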


Updates and new info: https://BrownMath.com/utils/
