If you’re just getting started with Terminal, it can take a while to become familiar with the basic ins and outs of the command line interface. Once you have settled into a groove, you can start exploring some of the extremely powerful utilities available through the interface. One of the most useful is sed. It’s a slightly obscure program from the old days of Unix command line utilities. It uses powerful pattern matching to change text within files without opening them. What is sed, and how can we take advantage of its power?
What is sed?
sed is a command line stream editor. Its name is a portmanteau of “stream” and “editor.” While it does modify text, the application is not precisely a text editor. sed receives text input as a “stream.” Then, it edits the stream according to your instructions. It can use complex pattern matching to make substitutions, or perform a basic find and replace.
While this makes it useful for running find and replace functions on a batch of files, it’s not as easy to use as the find and replace in Excel or Word. However, sed is especially powerful when editing a number of plaintext files simultaneously. Though you’ll find that the syntax takes a little getting used to, once you’ve got that down, you’ll see how incredibly powerful this compact editor can be.
Using sed for substitution
To use sed, you’ll need to open a Terminal window. Because sed takes input, typing sed alone won’t do anything useful. Instead, you need to give the application a command, as well as some parameters. The most common command is substitute, or s. A basic sed substitution looks like the one below.
$ sed 's/cat/dog/'
That command tells sed to scan incoming text and replace the string “cat” with “dog.” Fed the word “cat” through the echo command, for instance, it prints “dog.”
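As a quick check of that behavior, the sketch below pipes a sample sentence into sed with echo. The sentence and the variable name are only illustrations, not part of any real file.

```shell
# Feed sed a line of text through a pipe and capture what comes out.
result=$(echo "my cat is asleep" | sed 's/cat/dog/')
echo "$result"   # prints: my dog is asleep
```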
There are three main parts to the sed command. First, there’s sed, which starts the program. Then there’s the s, which starts the substitution command. The forward slash ( / ) is a separator, called a “delimiter.” The string after the first slash is the pattern sed looks for. The string after the second slash is the replacement that sed writes to the stream. The final slash closes the command. All together, it’s sed 's/cat/dog/'.
Note the single quotes in that example. The single quotes stop the shell from interpreting metacharacters in your command, so they reach sed exactly as you typed them. While quotes aren’t essential for most of our examples, you should develop the habit of using single quotes for all sed commands.
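To see why the quotes matter, here is a small sketch in which the pattern contains a dollar sign, a character the shell would normally treat as the start of a variable. The sample text is made up for illustration.

```shell
# Inside single quotes the shell passes \$HOME to sed untouched, so sed
# matches the literal text $HOME rather than the path of your home directory.
result=$(echo 'cd $HOME' | sed 's/\$HOME/~/')
echo "$result"   # prints: cd ~
```

Had the command been written as "s/$HOME/~/" in double quotes, the shell would have swapped $HOME for your home directory’s path before sed ever saw it.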
Matching partial strings
By default, sed does not pay attention to word boundaries. If you ask it to replace the string “cat,” it will replace those three letters wherever they appear. For example, “catapult” will become “dogapult” and “catamaran” will become “dogamaran.” To match word boundaries, you’ll want to use regular expressions in your pattern.
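One way to respect word boundaries is sketched below. It assumes GNU sed, where \b marks the edge of a word. This is an extension rather than standard sed syntax, and the BSD sed that ships with macOS may instead need [[:<:]] and [[:>:]] for the start and end of a word.

```shell
# Assumption: GNU sed. \b matches a word boundary, so the "cat" inside
# "catapult" is skipped and only the standalone word is replaced.
result=$(echo "catapult cat" | sed 's/\bcat\b/dog/')
echo "$result"   # prints: catapult dog
```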
Like many Unix utilities, sed is line based. By default, it replaces only the first match on each line of the document. When the command encounters a newline, it resets and begins replacing again on the next line. Typically, this isn’t ideal, but it’s simple enough to change. To make sed operate on every appearance of the pattern, add the g flag at the end of the command, like so:
$ sed 's/cat/dog/g'
This has the desired effect of replacing every appearance of that pattern with your replacement string.
$ echo cat catapult catamaran | sed 's/cat/dog/g'
dog dogapult dogamaran
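To make the per-line behavior concrete, this sketch runs the same two-line input through sed with and without the g flag. The sample input and variable names are invented for the demonstration.

```shell
# Without g: only the first match on each line changes.
first_only=$(printf 'cat cat\ncat cat\n' | sed 's/cat/dog/')
echo "$first_only"
# prints:
# dog cat
# dog cat

# With g: every match on each line changes.
all_matches=$(printf 'cat cat\ncat cat\n' | sed 's/cat/dog/g')
echo "$all_matches"
# prints:
# dog dog
# dog dog
```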
Setting input and output files
By default, sed reads from standard input and writes its results to standard output. In practice, this means that it works on text piped in from other commands and prints the result in the Terminal window. By and large, it’s more useful to process specific files. We can run sed on a file by specifying the filename at the end of the command:
$ sed 's/cat/dog/g' oldfile.txt > newfile.txt
This will run sed on “oldfile.txt” and save the command’s output to “newfile.txt.” The greater-than sign ( > ) redirects sed’s output to “newfile.txt.” If “newfile.txt” does not exist, the shell will create it; if it does exist, its contents will be overwritten. If you don’t specify an output file, sed will print its results to Terminal’s standard output.
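The whole round trip can be sketched as follows. The file contents and the temporary directory are placeholders for this example; mktemp -d keeps the demonstration away from your real files.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)

# Create a small input file (contents are illustrative).
printf 'the cat sat\ncat and cat\n' > "$workdir/oldfile.txt"

# Read oldfile.txt and write the edited text to newfile.txt.
sed 's/cat/dog/g' "$workdir/oldfile.txt" > "$workdir/newfile.txt"

cat "$workdir/newfile.txt"
# prints:
# the dog sat
# dog and dog

# The original file is left unchanged.
cat "$workdir/oldfile.txt"
```

Avoid redirecting the output back onto the input file. The shell empties the target of > before sed gets a chance to read it, so you would lose the original contents.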
Writing files in place
We don’t always want to route our replacements to a new file. If you want to overwrite the contents of a file, you’ll need to use the -i flag. This flag makes the edits “in place.” Keep in mind this is destructive, and will permanently eliminate the old version of the file. On macOS, -i must be followed by a backup extension; the empty string ( '' ) in the command below means “no backup.”
$ sed -i '' 's/cat/dog/' oldfile.txt
If you want a little bit of a safety net, create a backup when the command executes. To create a backup of the file, put any file extension after the -i. It doesn’t need to be a functional extension: “.bu” or “.bak” are often chosen. While such an extension may keep the backup from opening in its usual application, the file still contains all the right bits. Just remove “.bu” or “.bak” from the filename and you’ll be able to open it.
$ sed -i.bak 's/cat/dog/' oldfile.txt
This creates “oldfile.txt.bak”, which contains the unedited text data from the original version of “oldfile.txt.” The specified file, “oldfile.txt,” will now contain the updated contents based on sed’s replacements.
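Here is a minimal sketch of the backup workflow in a scratch directory. The file contents, directory, and variable names are placeholders. The -i.bak form, with the suffix attached directly to the flag, should be accepted by both the macOS and GNU versions of sed.

```shell
# Set up a throwaway file to edit (contents are illustrative).
workdir=$(mktemp -d)
printf 'one cat\n' > "$workdir/oldfile.txt"

# Edit in place, keeping the original as oldfile.txt.bak.
sed -i.bak 's/cat/dog/' "$workdir/oldfile.txt"

edited=$(cat "$workdir/oldfile.txt")
backup=$(cat "$workdir/oldfile.txt.bak")
echo "$edited"   # prints: one dog
echo "$backup"   # prints: one cat

# If the edit went wrong, copy the backup over the edited file.
cp "$workdir/oldfile.txt.bak" "$workdir/oldfile.txt"
```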
Matching regular expressions
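The real power of sed comes from matching regular expressions. This allows you to set patterns for matching, rather than literal strings. For example, we could use the command below to duplicate the first run of one or more digits on a line. The pattern [0-9][0-9]* matches a digit followed by any number of further digits, and the & in the replacement stands for whatever text was matched: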
$ echo "123 abc" | sed 's/[0-9][0-9]*/& &/'
123 123 abc
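As a variation on that command, the sketch below uses the -E flag to switch sed to extended regular expressions, where + means “one or more,” so [0-9]+ does the same job as [0-9][0-9]*. The -E flag should be available in both the macOS and current GNU versions of sed, though older releases may differ.

```shell
# -E enables extended regular expressions; + means "one or more".
# & in the replacement inserts the matched text.
result=$(echo "123 abc" | sed -E 's/[0-9]+/& &/')
echo "$result"   # prints: 123 123 abc
```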
Because sed is an extremely powerful utility, with a depth comparable to a scripting language, this guide can’t cover the whole thing. To learn more about sed’s full capabilities, check out Bruce Barnett’s comprehensive guide to sed. But be ready to do some serious reading!