UNIX: How to Use Sed and AWK Commands

Sed Command

The sed command does much the same thing as ed. The main difference is that sed works non-interactively.

Sed is a stream editor: it works on a specified stream of text according to rules the user sets beforehand.

This text stream is usually the output of a previous operation, whether instigated by the user or part of a list of commands that run automatically. For example, the output of the ls command produces a stream of text — a directory listing — that can be piped through sed and edited. In addition, sed can work on files.
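As a small sketch of the pipeline idea, the commands below feed a stream of text into sed with no file involved. Here printf stands in for ls so that the input is predictable, and the file names are made up for illustration.

```shell
# printf stands in for `ls` so the stream is predictable (hypothetical file names).
listing=$(printf '%s\n' notes.txt report.txt todo.txt)

# Pipe the stream through sed, rewriting the .txt suffix at the end of each line.
renamed=$(printf '%s\n' "$listing" | sed 's/\.txt$/.bak/')

printf '%s\n' "$renamed"
```

Only the text of the listing is edited; no file on disk is renamed.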

If you have a group of files with similar content and need to make a particular edit to the contents of all these files, sed will enable you to do that very easily. For example, have a go at the following “Try it Out” section, in which you combine the contents of two files while at the same time performing a substitution for the name “Paul” in both files.

How the Sed Command Works

Editing commands for sed can be provided at the command line, as the following steps show.

Create two files, each with a list of first names, in vi:

% vi names1.txt

Paul
Craig
Debra
Joe
Jeremy

% vi names2.txt

Paul
Katie
Mike
Tom
Pat

Practice

At the command line enter and run the following command:

% sed -e s/Paul/Pablo/g names1.txt names2.txt > names3.txt

Display the output of the third file to discover the resulting list of names:

% cat names3.txt

Pablo
Craig
Debra
Joe
Jeremy
Pablo
Katie
Mike
Tom
Pat
%

How It Works

The sed utility reads the specified files and/or the standard input and modifies the input as directed by a list of commands. The modified input is then written to the standard output, which can be redirected if need be.

In this example, the sed command is searching for all instances of the name Paul in the two files provided in the command-line argument and replacing them with the name Pablo.

After the search and replace has been completed, the output is redirected from standard out to a new file called names3.txt.

Notice the trailing g in the command, s/Paul/Pablo/g:

% sed s/Paul/Pablo/g names1.txt names2.txt > names3.txt

The g flag tells sed to substitute globally, that is, to replace every match on a line. Without that trailing g, if the name Paul happened to be on the same line twice, only the first would be substituted.
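The effect of the g flag is easiest to see on a line that contains the name twice. The following is a minimal sketch; the sample line is invented for the purpose and does not come from the names files.

```shell
# A made-up line in which the name appears twice.
line='Paul met Paul'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/Paul/Pablo/')

# With g: every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/Paul/Pablo/g')

printf '%s\n' "$first_only" "$all_matches"
```

The first result should still contain one Paul, while the second should contain none.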

Note that while only one line from each file was affected by substitution, all the lines from both files are displayed, in the order they are processed, in the output from sed.

The original files are unchanged; only the output, or in this example the file created from the output, contains the substitution of Pablo for Paul.
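To check that the input file really is left alone, the sketch below repeats the substitution in a throwaway directory and reads both files back. It assumes the mktemp utility is available, which is common but not guaranteed on every Unix system, and it uses a shortened two-name list.

```shell
# Work in a throwaway directory so nothing real is touched (assumes mktemp exists).
workdir=$(mktemp -d)
printf '%s\n' Paul Craig > "$workdir/names1.txt"

# sed writes the edited text to standard out; the redirect captures it in a new file.
sed 's/Paul/Pablo/g' "$workdir/names1.txt" > "$workdir/names3.txt"

original=$(cat "$workdir/names1.txt")   # input file, read back after sed has run
edited=$(cat "$workdir/names3.txt")     # file created from sed's output

printf 'original:\n%s\nedited:\n%s\n' "$original" "$edited"
rm -r "$workdir"
```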

-e Option

Using the -e Option

Multiple commands may be specified by using the -e option:

% sed -e 's/Paul/Pablo/; s/Pat/Patricia/' names1.txt names2.txt

Pablo
Craig
Debra
Joe
Jeremy
Pablo
Katie
Mike
Tom
Patricia
%

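The -e option is necessary when the editing commands are supplied as more than one command-line argument; each -e introduces one script. Within a single script, as in this example, commands can also be separated by semicolons.

Note that while enclosing the instructions in single quotes is not always required (they weren’t used in the first sed example), they should be used in all cases. In this example they are needed, because the script contains a semicolon and a space.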

Enclosing the instructions in quotes helps the user visualize what arguments are related to editing and what arguments are related to other information, such as which files to edit.

Moreover, the enclosing single quotes will prevent the shell from interpreting special characters or spaces found in the editing instruction.
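The two ways of supplying several commands can be compared side by side. This is a sketch using a shortened, made-up list of names rather than the two files; both forms are expected to give the same result.

```shell
# A short stand-in for the names files.
names=$(printf '%s\n' Paul Pat Tom)

# One quoted script holding two commands separated by a semicolon.
with_semicolon=$(printf '%s\n' "$names" | sed -e 's/Paul/Pablo/; s/Pat/Patricia/')

# Two quoted scripts, each introduced by its own -e.
with_two_e=$(printf '%s\n' "$names" | sed -e 's/Paul/Pablo/' -e 's/Pat/Patricia/')

printf '%s\n' "$with_semicolon"
```

In the first form the single quotes keep the shell from treating the semicolon as a command separator and the space as an argument break.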

awk Command

Sed applies the same kinds of editing commands you would type by hand in a text editor, so it’s a good choice for editing text in a file or from other commands in a noninteractive, batch environment.

But sed does have some shortcomings: it has a limited capability to work on more than one line at a time, and it offers only a few rudimentary programming constructs for building more complicated scripts.

So there are other solutions when it comes to scripting complex text processing; AWK, which offers a more general computational model for processing a file, is one of them.

A typical example of an AWK program is one that transforms data into a formatted report.

The data might be a log file generated by a Unix program such as traceroute, and the report might summarize the data in a format useful to a system administrator.

Or the data might be extracted from a text file with a specific format, such as the following example. In other words, AWK is a pattern-matching program, akin to sed.

Try the awk command at the command line:

% awk '{ print $0 }' /etc/passwd

The results will look something like the following, depending on the entries in the /etc/passwd file:

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
webalizer:x:67:67:Webalizer:/var/www/usage:/sbin/nologin
ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash
pdw:x:500:500:Paul Weinstein:/home/pdw:/bin/bash
%

How It Works

AWK takes two inputs: a program (a single command, a set of commands, or a command file) and the data to process, usually a data file. As with sed, the program contains pattern-matching instructions that AWK uses as a guide for processing the data.

In this example, AWK isn’t transforming the data; it simply reads the contents of the /etc/passwd file and sends them unfiltered to standard out, much like the cat command.
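The comparison with cat can be checked directly. The sketch below writes two entries from the listing above to a temporary file (again assuming mktemp is available) and compares what awk and cat print for it.

```shell
# Two entries in /etc/passwd format, copied from the listing above.
sample=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'pdw:x:500:500:Paul Weinstein:/home/pdw:/bin/bash' > "$sample"

from_awk=$(awk '{ print $0 }' "$sample")   # every line, printed unchanged
from_cat=$(cat "$sample")

if [ "$from_awk" = "$from_cat" ]; then
    same=yes
else
    same=no
fi
printf 'identical output: %s\n' "$same"
rm "$sample"
```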

When AWK was invoked, it was provided with the two pieces of information it needs: an editing command and data to edit.

The example specifies /etc/passwd as the input file, and the command simply directs AWK to print each line in the file in order.

All output is sent to standard out, which can be redirected to a file or piped to another command.
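The earlier description of AWK as a tool for turning data into a formatted report can be sketched with the same kind of input. This goes slightly beyond the example above: the -F option (which sets the field separator) and the field variables $1 and $7 are standard awk features that this article has not introduced, and the column layout is only an illustration.

```shell
# Two entries in /etc/passwd format, taken from the listing shown earlier.
entries=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
    'mail:x:8:12:mail:/var/spool/mail:/sbin/nologin')

# -F: splits each line on colons; $1 is the login name and $7 the login shell.
# The data arrives on standard input through a pipe rather than from a file.
report=$(printf '%s\n' "$entries" | awk -F: '{ printf "%-8s %s\n", $1, $7 }')

printf '%s\n' "$report"
```

Each output line should hold a login name padded to eight characters, followed by that account's shell.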