Today, web developers have a number of platform options to develop on. Some like the familiarity of Windows combined with an IDE like Eclipse or Zend Development Environment, while others (like myself) prefer the more down-and-dirty method of vi on a Linux workstation. Whatever your taste (or skillset) requires, if you develop on the LAMP stack as we do, then eventually you may have to interact with the Linux command line interface (CLI).

In my few years of experience in LAMP development, I have worked with a number of developers with wide-ranging abilities on the CLI. I have oftentimes been surprised that even some of the very best PHP coders can feel a bit uncomfortable when faced with the CLI (hey, sometimes it’s unavoidable). So I thought I might write a series of how-to articles on some of the more useful CLI tools, to help the budding or even advanced PHP developer increase their familiarity with the CLI when they need it. Dive in, after the jump.

Since I’m sure most of us have at least a basic understanding of the Linux CLI, I first wanted to cover some of the lesser-known yet incredibly simple command-line text tools. In this post, I’ll be covering the basics of cut, paste, tr, and sort.

Cut

Cut is a very useful command that is used to, as you may expect, extract various fields of data on a line-by-line basis. It has only a few arguments, the most useful of which I will cover here. Basic usage looks like this:

$ cut -c1-10 file.txt

The “-c” option specifies the range of characters on the line to print. The above command prints out the first 10 characters of each line in file.txt. (Note that positions are counted from 1, not from 0 as in, say, PHP’s substr.) You can also specify an open-ended character range like so:

$ cut -c1,5- file.txt

This will print the first character, followed by the fifth character continuing through the end of the line, for every line in the file.
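If you don’t have a file.txt handy, here is a quick sketch that pipes a made-up string into cut instead. The sample text and the variable names are just placeholders of mine, not anything cut requires.

```shell
# Sample line to experiment on (hypothetical data, not from any real file).
line='abcdefghijklmnop'

# Characters 1 through 10; positions are counted from 1, not 0.
first_ten=$(printf '%s\n' "$line" | cut -c1-10)
echo "$first_ten"         # abcdefghij

# Character 1, then character 5 through the end of the line.
first_and_rest=$(printf '%s\n' "$line" | cut -c1,5-)
echo "$first_and_rest"    # aefghijklmnop
```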

The “-d” and “-f” options specify a delimiter and one or more field numbers. These are very useful for, say, parsing out log files with a standard delimiter. The default delimiter, when not specified with “-d”, is the tab character.
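As a small sketch of both cases, the snippet below runs cut on one invented tab-separated line and one invented comma-separated line. The data and variable names are assumptions for illustration only.

```shell
# Tab-separated sample line (made up); tab is cut's default delimiter.
second=$(printf 'alpha\tbeta\tgamma\n' | cut -f2)
echo "$second"    # beta

# The same idea with a comma delimiter and more than one field.
picked=$(printf 'alpha,beta,gamma\n' | cut -d, -f1,3)
echo "$picked"    # alpha,gamma
```

Note that when several fields are selected, cut joins them in the output with the same delimiter it split on.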

The /etc/passwd file is a good example to parse with cut. Let’s say we wanted to get a list of all the users of a system and their home directories:

$ cut -d: -f1,6 /etc/passwd

The output would look something like:

user1:/home/user1
user2:/home/user2

In layman’s terms, this might read something like, “cut out fields 1 and 6, as delimited by a ‘:’, from /etc/passwd.” Now let’s look at paste.

Paste

The paste command functions pretty much as you’d expect it to: as the opposite of the cut command, it joins the corresponding lines of two or more files side by side. Let’s say we have three files with the following contents:

file1:
Hello

file2:
there

file3:
everyone

If we ran paste on these three files, using a + as a delimiter (again, the default is a tab), we would end up with a single line:

$ paste -d+ file1 file2 file3
Hello+there+everyone

Basically, each line in each file is pasted together, 1->1, 2->2, etc. Now, if you specify the “-s” option and only list one file, then each line of that file will be pasted together to form one giant line:

file1:
hello
there!

$ paste -d' ' -s file1
hello there!
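To see the line-by-line pairing with files longer than one line, here is a rough sketch that builds two throwaway files in a temporary directory. The file names (names, ages) and their contents are invented for the example.

```shell
# Work in a throwaway directory; the file names are placeholders.
workdir=$(mktemp -d)
printf 'alice\nbob\n' > "$workdir/names"
printf '30\n25\n' > "$workdir/ages"

# Line 1 joins line 1, line 2 joins line 2, separated by a comma.
paired=$(paste -d, "$workdir/names" "$workdir/ages")
echo "$paired"
# alice,30
# bob,25

# -s joins all lines of one file into a single line instead.
joined=$(paste -s -d, "$workdir/names")
echo "$joined"    # alice,bob

rm -r "$workdir"
```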

Tr

The tr command stands for translate and basically acts like a substitution filter. Let’s say you wanted to convert all capital letters in a document to lowercase; you could run this:

$ tr 'A-Z' 'a-z' < file1

You will notice that tr does not take a file argument. This is because tr is a pure filter that reads only from standard input, so you will have to pass it text through a pipe or input redirection.
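Here is a short sketch of the two usual ways of feeding tr, using a made-up string and a temporary file of my own choosing. Both produce the same result.

```shell
# Feed tr through a pipe...
piped=$(echo 'Hello World' | tr 'A-Z' 'a-z')
echo "$piped"         # hello world

# ...or through input redirection from a file (a temporary one here).
tmpfile=$(mktemp)
echo 'Hello World' > "$tmpfile"
redirected=$(tr 'A-Z' 'a-z' < "$tmpfile")
echo "$redirected"    # hello world
rm "$tmpfile"
```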

One of the more useful features of tr is that it accepts certain ASCII characters by their octal values, written with a leading backslash:

Bell = 7
Backspace = 10
Tab = 11
Newline (linefeed) = 12
Formfeed = 14
Carriage return = 15
Escape = 33

So, if you wanted to replace all spaces with a newline, you’d run the following:

$ tr ' ' '\12' < file1
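A quick sketch of the octal escapes in action, run on invented input rather than file1. The variable names are mine.

```shell
# Replace every space with a newline (octal 12): one word per line.
words=$(echo 'one two three' | tr ' ' '\12')
echo "$words"
# one
# two
# three

# Replace every tab (octal 11) with a comma.
csv=$(printf 'a\tb\tc\n' | tr '\11' ',')
echo "$csv"    # a,b,c
```

The single quotes matter here: they stop the shell from touching the backslash, so tr receives the escape intact.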

tr also has the “-s” option, which squeezes each run of a repeated character down to a single occurrence; when combined with a translation, the squeeze is applied after the input is translated. So given a file like:

file1:
this

is some

text

Running tr with “-s” on it gives:

$ tr -s '\12' < file1
this
is some
text

The above command replaces multiple newlines with only one.
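The same option works on any character, not just newlines. As a sketch on a made-up string, the first command below squeezes runs of spaces, and the second translates spaces to newlines and squeezes in one pass.

```shell
# Squeeze runs of spaces down to a single space.
squeezed=$(echo 'too    many     spaces' | tr -s ' ')
echo "$squeezed"    # too many spaces

# Translate and squeeze together: spaces become newlines, and runs of
# newlines collapse to one, leaving one word per line.
words=$(echo 'too    many     spaces' | tr -s ' ' '\12')
echo "$words"
# too
# many
# spaces
```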

Finally, the “-d” option will delete the specified character from the input stream:

$ tr -d '\12' < file1
thisis sometext
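One everyday use for “-d”, sketched here on an invented line, is stripping the carriage returns (octal 15) that Windows-style line endings leave behind. The byte counts are only there to show that exactly one character went away.

```shell
# A line with a Windows-style CRLF ending; octal 15 is the carriage return.
cleaned=$(printf 'hello\r\n' | tr -d '\15')
echo "$cleaned"    # hello

# Count the bytes before and after to confirm one character was removed.
before=$(printf 'hello\r\n' | wc -c)
after=$(printf 'hello\r\n' | tr -d '\15' | wc -c)
echo "$((before)) -> $((after))"    # 7 -> 6
```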

Sort

The sort command has a number of options available, so I’ll just cover the basics here. The most obvious usage is to simply sort a file alphabetically.

file1:
orange
apple
grape

$ sort file1
apple
grape
orange

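Sort compares lines using the collation rules of your current locale. In the C (POSIX) locale, that means ordering by the byte value of each character (plain ASCII order), so uppercase letters sort before lowercase ones; other locales may order things differently.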
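Because the ordering can vary with your locale settings, here is a small sketch that forces byte-value ordering by setting LC_ALL=C for a single command. The input letters are made up for the demonstration.

```shell
# With LC_ALL=C, sort compares raw byte values, so every uppercase
# letter sorts before every lowercase one.
byte_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
echo "$byte_order"
# A
# B
# a
# b
```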

Here are a few easy-to-use options:

-u = Removes duplicates
-r = Reverses the order
-o = Output file (useful if you want to replace the source file, since redirecting the output into the input file will not work: the shell empties the file before sort gets a chance to read it)
-n = Tells sort to order the lines numerically (used for sorting files containing numbers)
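To see these options side by side, here is a rough sketch run against a throwaway file of numbers. The file name (nums) and its contents are invented for the example.

```shell
workdir=$(mktemp -d)
printf '10\n9\n100\n9\n' > "$workdir/nums"

# Plain sort compares text, so "10" and "100" come before "9".
as_text=$(sort "$workdir/nums")
echo "$as_text"

# -n compares numeric values; -u then drops the duplicate 9.
as_numbers=$(sort -n -u "$workdir/nums")
echo "$as_numbers"

# -r reverses the result, giving the largest number first.
descending=$(sort -n -r "$workdir/nums")
echo "$descending"

# -o lets sort write its result back over the input file safely.
sort -n -o "$workdir/nums" "$workdir/nums"
in_place=$(cat "$workdir/nums")
echo "$in_place"

rm -r "$workdir"
```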

Conclusion

Next time, I will be covering the sed and grep tools, as they are extremely useful yet a bit more involved than the simple tools I covered today. In the future, you can look forward to topics on process and memory management, a vi primer, using PHP as a CLI scripting tool, and much more. And in the meantime, please feel free to post your helpful hints for getting around the CLI or using Linux as a dev environment in general.

And if even these commands have gone over your head, you should probably read through these tutorials first to get the hang of CLI basics:

Linux Command Line Tutorials [tuXfiles.org]
