3.10. External Filters, Programs and Commands

This is a descriptive listing of standard UNIX commands useful in shell scripts. The power of scripts comes from coupling system commands and shell directives with simple programming constructs.

3.10.1. Basic Commands

ls

The basic file "list" command. It is all too easy to underestimate the power of this humble command. For example, using the -R, recursive option, ls provides a tree-like listing of a directory structure.


Example 3-84. Using ls to create a table of contents for burning a CDR disk

   #!/bin/bash

   SPEED=2    # May use higher speed if supported.
   IMAGEFILE=cdimage.iso
   CONTENTSFILE=contents
   DEFAULTDIR=/opt

   # Script to automate burning a CDR.

   # Uses Joerg Schilling's "cdrecord" package.
   # (http://www.fokus.gmd.de/nthp/employees/schilling/cdrecord.html)

   # If this script is invoked as an ordinary user, cdrecord must be suid
   # (chmod u+s /usr/bin/cdrecord, as root).

   if [ -z "$1" ]
   then
     IMAGE_DIRECTORY=$DEFAULTDIR
     # Default directory, if not specified on command line.
   else
     IMAGE_DIRECTORY=$1
   fi

   echo "Creating table of contents."
   ls -lRF "$IMAGE_DIRECTORY" > "$IMAGE_DIRECTORY/$CONTENTSFILE"
   # The "l" option gives a "long" file listing.
   # The "R" option makes the listing recursive.
   # The "F" option marks the file types (directories suffixed by a /).

   echo "Creating ISO9660 file system image ($IMAGEFILE)."
   mkisofs -r -o "$IMAGEFILE" "$IMAGE_DIRECTORY"

   echo "Burning the disk."
   echo "Please be patient, this will take a while."
   cdrecord -v -isosize speed=$SPEED dev=0,0 "$IMAGEFILE"

   exit 0

cat, tac

cat, an acronym for concatenate, lists a file to stdout. When combined with redirection (> or >>), it is commonly used to concatenate files.

   cat filename
   cat file.1 file.2 file.3 > file.123

The -n option to cat inserts consecutive numbers before all lines of the target file(s). The -b option numbers only the non-blank lines. The -v option echoes nonprintable characters, using ^ notation.

See also Example 3-101 and Example 3-98.
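
As a small illustration of the numbering options described above, the following sketch builds a three-line scratch file (one line blank) and counts how many lines each option numbers. The scratch file comes from mktemp and the variable names are made up for the demonstration.

```shell
#!/bin/bash
# Scratch file for the demonstration (hypothetical, created by mktemp).
scratch=$(mktemp)
printf 'first\n\nthird\n' > "$scratch"

# cat -n numbers every line, blank or not.
numbered_all=$(cat -n "$scratch" | grep -c '^ *[0-9]')

# cat -b numbers only the non-blank lines.
numbered_nonblank=$(cat -b "$scratch" | grep -c '^ *[0-9]')

echo "cat -n numbered $numbered_all lines; cat -b numbered $numbered_nonblank."

rm -f "$scratch"
```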

tac is the inverse of cat, listing a file backwards from its end.

rev

Reverses each line of a file and outputs to stdout. This is not the same effect as tac: rev preserves the order of the lines, but flips each one around.

 bash$ cat file1.txt
 This is line 1.
 This is line 2.
 
 
 bash$ tac file1.txt
 This is line 2.
 This is line 1.
 
 
 bash$ rev file1.txt
 .1 enil si sihT
 .2 enil si sihT
 	      

cp

This is the file copy command. cp file1 file2 copies file1 to file2, overwriting file2 if it already exists (see Example 3-87).

Tip

Particularly useful are the -a archive flag (for copying an entire directory tree) and the -r and -R recursive flags.

mv

This is the file move command. It is equivalent to a combination of cp and rm. It may be used to move multiple files to a directory. For some examples of using mv in a script, see Example 3-10 and Example A-3.

rm

Delete (remove) a file or files. The -f option forces removal of even read-only files.

Warning

When used with the recursive flag -r, this command removes files all the way down the directory tree.

rmdir

Remove directory. The directory must be empty of all files, including invisible "dotfiles", [1] for this command to succeed.

mkdir

Make directory, creates a new directory. mkdir -p project/programs/December creates the named directory. The -p option automatically creates any necessary parent directories.
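
The interplay of mkdir -p and rmdir can be tried out safely in a scratch area. This is only a sketch: the scratch directory comes from mktemp, and the variable names are invented for the example.

```shell
#!/bin/bash
# Hypothetical scratch area; the directory names follow the example above.
workdir=$(mktemp -d)
target="$workdir/project/programs/December"

mkdir -p "$target"          # Creates project/ and programs/ as needed.
touch "$target/.hidden"     # An "invisible" dotfile.

if rmdir "$target" 2>/dev/null
then
  first_try=removed
else
  first_try=refused         # rmdir balks: the directory is not empty.
fi

rm "$target/.hidden"
if rmdir "$target"
then
  second_try=removed        # Succeeds once the dotfile is gone.
else
  second_try=refused
fi

echo "First rmdir: $first_try; second rmdir: $second_try."
rm -r "$workdir"
```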

chmod

Changes the attributes of an existing file (see Example 3-72).

   chmod +x filename
   # Makes "filename" executable for all users.

   chmod u+s filename
   # Sets "suid" bit on "filename" permissions.
   # An ordinary user may execute "filename" with same privileges as the file's owner.
   # (This does not apply to shell scripts.)

   chmod 644 filename
   # Makes "filename" readable/writable to owner, readable to
   # others
   # (octal mode).

   chmod 1777 directory-name
   # Gives everyone read, write, and execute permission in directory,
   # however also sets the "sticky bit".
   # This means that only the owner of the directory,
   # owner of the file, and, of course, root
   # can delete any particular file in that directory.
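
To see the effect of a mode change, one approach is to read the permission string back from ls -l. The following is a minimal sketch using a scratch file from mktemp; the variable names are illustrative only.

```shell
#!/bin/bash
# Hypothetical scratch file used only for this demonstration.
scratch=$(mktemp)

chmod 644 "$scratch"
mode_644=$(ls -l "$scratch" | cut -c1-10)    # Permission string from ls.

chmod u+x "$scratch"
mode_ux=$(ls -l "$scratch" | cut -c1-10)     # Owner gains execute permission.

echo "After chmod 644: $mode_644"
echo "After chmod u+x: $mode_ux"

rm -f "$scratch"
```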

chattr

Change file attributes. This is analogous to chmod above, but with a different invocation syntax and a different set of attributes (the immutable flag, for example), and it works only on an ext2 filesystem.

ln

Creates links to pre-existing files. Most often used with the -s, symbolic or "soft" link flag. This permits referencing the linked file by more than one name and is a superior alternative to aliasing (see Example 3-34).

ln -s oldfile newfile links the previously existing oldfile to the newly created link, newfile.
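
A quick way to convince oneself that the link really refers to the original file is to read through it. This sketch works in a scratch directory from mktemp; oldfile and newfile follow the text, the other names are made up.

```shell
#!/bin/bash
workdir=$(mktemp -d)        # Hypothetical scratch directory.

echo "original contents" > "$workdir/oldfile"
ln -s "$workdir/oldfile" "$workdir/newfile"

# The -L test is true for a symbolic link.
if [ -L "$workdir/newfile" ]
then
  link_kind=symlink
else
  link_kind=other
fi

via_link=$(cat "$workdir/newfile")   # Reading the link reads oldfile.
echo "newfile is a $link_kind containing: $via_link"

rm -r "$workdir"
```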

3.10.2. Complex Commands

find

-exec COMMAND \;

Carries out COMMAND on each file that find scores a hit on. COMMAND terminates with \; (the ; is escaped to make certain the shell passes it to find literally, which concludes the command sequence). If COMMAND contains {}, then find substitutes the full path name of the selected file.

 bash$ find ~/ -name '*.txt'
 /home/bozo/.kde/share/apps/karm/karmdata.txt
 /home/bozo/misc/irmeyc.txt
 /home/bozo/test-scripts/1.txt
 	      

   find /home/bozo/projects -mtime -1
   # Lists all files in /home/bozo/projects directory tree
   # that were modified within the last day.
   # (-mtime 1, without the minus sign, would instead match files
   #  modified between 24 and 48 hours ago.)

   find /etc -exec grep '[0-9][0-9]*[.][0-9][0-9]*[.][0-9][0-9]*[.][0-9][0-9]*' {} \;

   # Finds all IP addresses (xxx.xxx.xxx.xxx) in /etc directory files.
   # There are a few extraneous hits - how can they be filtered out?

   # Perhaps by:

   find /etc -type f -exec cat '{}' \; | tr -c '.[:digit:]' '\n' \
    | grep '^[^.][^.]*\.[^.][^.]*\.[^.][^.]*\.[^.][^.]*$'

   # Thanks, S.C.

Caution

The -exec option to find should not be confused with the exec shell builtin.
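
The -exec mechanism described above can be tried on a harmless scratch tree instead of /etc. This is a sketch only: the files and their contents are invented, and grep -l is used so that find reports the names of matching files.

```shell
#!/bin/bash
workdir=$(mktemp -d)        # Hypothetical scratch tree.
printf 'host 192.168.1.10\n' > "$workdir/a.txt"
printf 'nothing here\n'      > "$workdir/b.txt"
printf 'ignored 10.0.0.1\n'  > "$workdir/c.log"
expected="$workdir/a.txt"

# Run grep on each *.txt file; {} is replaced by the path name.
hits=$(find "$workdir" -type f -name '*.txt' \
       -exec grep -l '[0-9][0-9]*[.][0-9][0-9]*[.][0-9][0-9]*[.][0-9][0-9]*' {} \;)

# All three files were just created, so -mtime -1 selects every one.
recent_count=$(find "$workdir" -type f -mtime -1 | grep -c .)

echo "File with an IP address: $hits"
echo "Files modified within the last day: $recent_count"

rm -r "$workdir"
```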


Example 3-85. Badname, eliminate file names in current directory containing bad characters and whitespace.

   #!/bin/bash

   # Delete filenames in current directory containing bad characters.

   for filename in *
   do
     badname=`echo "$filename" | sed -n /[\+\{\;\"\\\=\?~\(\)\<\>\&\*\|\$]/p`
     # Files containing those nasties:   + { ; " \ = ? ~ ( ) < > & * | $
     rm "$badname" 2>/dev/null
     # Quoted, so the shell does not re-expand the bad characters
     # (an unquoted * in the name would match every file).
     # Error messages deep-sixed.
   done

   # Now, take care of files containing all manner of whitespace.
   find . -name "* *" -exec rm -f {} \;
   # The path name of the file that "find" finds replaces the "{}".
   # The '\' ensures that the ';' is interpreted literally, as end of command.

   exit 0

   #---------------------------------------------------------------------
   # Commands below this line will not execute because of "exit" command.

   # An alternative to the above script:
   find . -name '*[+{;"\\=?~()<>&*|$ ]*' -exec rm -f '{}' \;
   exit 0
   # (Thanks, S.C.)

See Example 3-102, Example 3-5, and Example 3-50 for scripts using find. Its man page provides more detail on this complex and powerful command.

xargs

A filter for feeding arguments to a command, and also a tool for assembling the commands themselves. It breaks a data stream into small enough chunks for filters and commands to process. Consider it as a powerful replacement for backquotes. In situations where backquotes fail with a too many arguments error, substituting xargs often works. Normally, xargs reads from stdin or from a pipe, but it can also be given the contents of a file.

The default command for xargs is echo.

ls | xargs -p -l gzip gzips every file in current directory, one at a time, prompting before each operation.

Tip

An interesting xargs option is -n NN, which limits the number of arguments passed to each invocation of the command to NN.

ls | xargs -n 8 echo lists the files in the current directory in 8 columns.
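
The grouping effect of -n is easy to observe with a fixed argument list rather than a directory listing. A minimal sketch, with six made-up arguments passed two at a time:

```shell
#!/bin/bash
# Six hypothetical arguments; xargs hands them to echo two at a time.
grouped=$(printf '%s\n' one two three four five six | xargs -n 2 echo)

echo "$grouped"

line_count=$(echo "$grouped" | grep -c .)     # One output line per echo.
first_line=$(echo "$grouped" | head -n 1)
```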

Tip

Another useful option is -0, in combination with find -print0 or grep -lZ. This allows handling arguments containing whitespace or quotes.

find / -type f -print0 | xargs -0 grep -liwZ GUI | xargs -0 rm -f

grep -rliwZ GUI / | xargs -0 rm -f

Either of the above will remove any file containing "GUI". (Thanks, S.C.)
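
Since the commands above delete files, it may be wiser to experiment with a non-destructive variant first. The sketch below only lists the matching file, in a scratch directory from mktemp; the file names and contents are invented, and one name deliberately contains a space.

```shell
#!/bin/bash
workdir=$(mktemp -d)        # Hypothetical scratch directory.
echo "a GUI toolkit" > "$workdir/with space.txt"
echo "command line"  > "$workdir/plain.txt"
expected="$workdir/with space.txt"

# NUL-terminated names survive the trip through the pipe intact,
# so the embedded space does not split the file name in two.
matches=$(find "$workdir" -type f -print0 | xargs -0 grep -l GUI)

echo "Matching file: $matches"

rm -r "$workdir"
```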


Example 3-86. Log file using xargs to monitor system log

   #!/bin/bash

   # Generates a log file in current directory
   # from the tail end of /var/log/messages.

   # Note: /var/log/messages must be readable by ordinary users
   #       if invoked by same (#root chmod 644 /var/log/messages).

   ( date; uname -a ) >>logfile
   # Time and machine name
   echo --------------------------------------------------------------------- >>logfile
   tail -5 /var/log/messages | xargs |  fmt -s >>logfile
   echo >>logfile
   echo >>logfile

   exit 0


Example 3-87. copydir, copying files in current directory to another, using xargs

   #!/bin/bash

   # Copy (verbose) all files in current directory
   # to directory specified on command line.

   if [ -z "$1" ]
   # Exit if no argument given.
   then
     echo "Usage: `basename $0` directory-to-copy-to"
     exit 65
   fi

   ls . | xargs -i -t cp ./{} $1
   # This is the exact equivalent of
   #    cp * $1
   # unless any of the filenames has "whitespace" characters.

   exit 0

expr arg1 operation arg2 ...

All-purpose expression evaluator: Concatenates and evaluates the arguments according to the operation given (arguments must be separated by spaces). Operations may be arithmetic, comparison, string, or logical.

expr 3 + 5

returns 8

expr 5 % 3

returns 2

y=`expr $y + 1`

Increment a variable, with the same effect as let y=y+1 and y=$(($y+1)). This is an example of arithmetic expansion.
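
The equivalence can be checked directly. This sketch increments a made-up variable once with expr and once with arithmetic expansion, and also shows that the multiplication operator has to be escaped when handed to expr.

```shell
#!/bin/bash
y=5                      # Hypothetical starting value.

y=`expr $y + 1`          # External command: expr.
y=$(($y+1))              # Arithmetic expansion, same effect.
# In Bash, "let y=y+1" would do the same job.

product=`expr 4 \* 5`    # The * must be escaped, or the shell globs it.

echo "y = $y, product = $product"
```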

z=`expr substr $string28 $position $length`

Extracts a substring of $length characters from $string28, starting at $position.

External programs, such as sed and Perl, have far superior string parsing facilities, and it might well be advisable to use these rather than the built-in Bash ones.


Example 3-88. Using expr

   #!/bin/bash

   # Demonstrating some of the uses of 'expr'
   # +++++++++++++++++++++++++++++++++++++++

   echo

   # Arithmetic Operators

   echo "Arithmetic Operators"
   echo
   a=`expr 5 + 3`
   echo "5 + 3 = $a"

   a=`expr $a + 1`
   echo
   echo "a + 1 = $a"
   echo "(incrementing a variable)"

   a=`expr 5 % 3`
   # modulo
   echo
   echo "5 mod 3 = $a"

   echo
   echo

   # Logical Operators

   echo "Logical Operators"
   echo

   a=3
   echo "a = $a"
   b=`expr $a \> 10`
   echo 'b=`expr $a \> 10`, therefore...'
   echo "If a > 10, b = 0 (false)"
   echo "b = $b"

   b=`expr $a \< 10`
   echo "If a < 10, b = 1 (true)"
   echo "b = $b"


   echo
   echo

   # Comparison Operators

   echo "Comparison Operators"
   echo
   a=zipper
   echo "a is $a"
   if [ `expr $a = snap` -eq 0 ]
   # expr prints 1 if the two strings are equal, 0 if they are not.
   # (A bare [ `expr $a = snap` ] would always be true,
   #  since both "0" and "1" are non-empty strings.)
   then
      echo "a is not snap"
   fi

   echo
   echo



   # String Operators

   echo "String Operators"
   echo

   a=1234zipper43231
   echo "The string being operated upon is \"$a\"."

   # index: position of the first character in the string
   #        that matches any character in the given set
   b=`expr index $a 23`
   echo "Position of first character in \"$a\" matching any of \"23\" is \"$b\"."

   # substr: print substring, starting position & length specified
   b=`expr substr $a 2 6`
   echo "Substring of \"$a\", starting at position 2, and 6 chars long is \"$b\"."

   # length: length of string
   b=`expr length $a`
   echo "Length of \"$a\" is $b."

   # 'match' operates similarly to 'grep'
   b=`expr match "$a" '[0-9]*'`
   echo "Number of digits at the beginning of \"$a\" is $b."
   b=`expr match "$a" '\([0-9]*\)'`
   echo "The digits at the beginning of \"$a\" are \"$b\"."

   echo

   exit 0

Note that : can substitute for match. Indeed, b=`expr $a : [0-9]*` is the exact equivalent of b=`expr match $a [0-9]*` in the above example.
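
The colon form can be exercised on its own, using the same test string as the example above. A minimal sketch (the variable names are invented):

```shell
#!/bin/bash
a=1234zipper43231

# Without \( \), expr prints the length of the match at the string's start.
digit_count=`expr "$a" : '[0-9]*'`

# With \( \), expr prints the matched text itself.
leading_digits=`expr "$a" : '\([0-9]*\)'`

echo "\"$a\" begins with $digit_count digits: $leading_digits"
```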

3.10.3. Time / Date Commands

date

Simply invoked, date prints the date and time to stdout. Where this command gets interesting is in its formatting and parsing options.


Example 3-89. Using date

   #!/bin/bash

   # Using the 'date' command

   # Needs a leading '+' to invoke formatting.

   echo "The number of days since the year's beginning is `date +%j`."
   # %j gives day of year.


   echo "The number of seconds elapsed since 01/01/1970 is `date +%s`."
   # %s yields number of seconds since "UNIX epoch" began,
   # but how is this useful?

   prefix=temp
   suffix=`date +%s`
   filename=$prefix.$suffix
   echo $filename
   # It's handy for creating "unique" temp filenames,
   # as an alternative to using $$
   # (though two runs within the same second get the same name).

   # Read the 'date' man page for more formatting options.

   exit 0
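
Another everyday use of %s is timing a stretch of a script by subtracting two readings. The sketch below is only an illustration: the one-second sleep stands in for real work, and the log file name is hypothetical.

```shell
#!/bin/bash
start=`date +%s`
sleep 1                          # Stand-in for the work being timed.
finish=`date +%s`

elapsed=`expr $finish - $start`
echo "Elapsed: $elapsed second(s)."

stamp=`date +%Y-%m-%d`           # Year-month-day, handy in file names.
echo "Hypothetical log name: backup-$stamp.log"
```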

time

Outputs verbose timing statistics for executing a command.

time ls -l / gives something like this:

 0.00user 0.01system 0:00.05elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k
 0inputs+0outputs (149major+27minor)pagefaults 0swaps

See also the very similar times command in the previous section.

Note

As of version 2.0 of Bash, time became a shell reserved word, with slightly altered behavior in a pipeline.

touch

Utility for updating access/modification times of a file to current system time or other specified time, but also useful for creating a new file. The command touch zzz will create a new file of zero length, named zzz, assuming that zzz did not previously exist. Time-stamping empty files in this way is useful for storing date information, for example in keeping track of modification times on a project.

For an ordinary file that does not yet exist, touch newfile is equivalent to : >> newfile. (On an existing file, : >> newfile leaves the timestamps unchanged.)
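
Both ways of creating an empty file can be compared in a scratch directory. This is a sketch under the assumption that neither file exists beforehand; the names zzz and newfile follow the text, the rest are invented.

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.

touch "$workdir/zzz"             # Creates a zero-length file.
: >> "$workdir/newfile"          # Same result for a file not yet present.

touch_result=missing
append_result=missing

# -e: the file exists; ! -s: it has zero length.
if [ -e "$workdir/zzz" ] && [ ! -s "$workdir/zzz" ]
then
  touch_result=empty
fi

if [ -e "$workdir/newfile" ] && [ ! -s "$workdir/newfile" ]
then
  append_result=empty
fi

echo "zzz: $touch_result, newfile: $append_result"
rm -r "$workdir"
```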

at

The at job control command executes a given set of commands at a specified time. Superficially, it resembles cron, but at is chiefly useful for one-time execution of a command set.

at 2pm January 15 prompts for a set of commands to execute at that time. These commands may include executable shell scripts.

Using either the -f option or input redirection (<), at reads a command list from a file. This file can include shell scripts, though they should, of course, be noninteractive.

 bash$ at 2:30 am Friday < at-jobs.list
 job 2 at 2000-10-27 02:30
 	      

batch

The batch job control command is similar to at, but it runs a command list when the system load drops below .8. Like at, it can read commands from a file with the -f option.

cal

Prints a neatly formatted monthly calendar to stdout. Will do current year or a large range of past and future years.

sleep

This is the shell equivalent of a wait loop. It pauses for a specified number of seconds, doing nothing. This can be useful for timing or in processes running in the background, checking for a specific event every so often (see Example 3-163).

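   sleep 3
   # Pauses 3 seconds.

Note

The sleep command defaults to seconds, but minutes, hours, or days may also be specified. The unit is a suffix (m, h, or d, as accepted by GNU sleep) written directly after the number, with no space in between.

   sleep 3h
   # Pauses 3 hours!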

usleep

Microsleep (the "u" may be read as the Greek "mu", or micro prefix). This is the same as sleep, above, but "sleeps" in microsecond intervals. This can be used for fine-grain timing, or for polling an ongoing process at very frequent intervals.

   usleep 30
   # Pauses 30 microseconds.

Caution

The usleep command does not provide particularly accurate timing, and is therefore unsuitable for critical timing loops.

hwclock, clock

The hwclock command accesses or adjusts the machine's hardware clock. Some options require root privileges. The /etc/rc.d/rc.sysinit startup file uses hwclock to set the system time from the hardware clock at bootup.

The clock command is a synonym for hwclock.

3.10.4. Text Processing Commands

sort

File sorter, often used as a filter in a pipe. This command can sort a text stream or file forwards or backwards, or according to various keys or character positions. The info page lists its many options. See Example 3-50 and Example 3-51.

diff, patch

diff: flexible file comparison utility. It compares the target files line-by-line sequentially. In some applications, such as comparing word dictionaries, it may be helpful to filter the files through sort and uniq before piping them to diff. diff file-1 file-2 outputs the lines in the files that differ, with < and > markers showing which file each particular line belongs to.

The --side-by-side option to diff outputs each compared file, line by line, in separate columns, with non-matching lines marked.

Various fancy frontends for diff are available, such as spiff, wdiff, xdiff, and mgdiff.

Tip

The diff command returns an exit status of 0 if the compared files are identical, and 1 if they differ. This permits use of diff in a test construct within a shell script (see below).
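
The exit status can be put to work along the following lines. This is a sketch with three invented scratch files; it also counts the marked lines in the diff output for the pair that differs.

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.
printf 'alpha\nbeta\n'  > "$workdir/one"
printf 'alpha\nbeta\n'  > "$workdir/two"
printf 'alpha\ngamma\n' > "$workdir/three"

if diff "$workdir/one" "$workdir/two" > /dev/null
then
  same_pair=identical            # Exit status 0.
else
  same_pair=different
fi

if diff "$workdir/one" "$workdir/three" > /dev/null
then
  other_pair=identical
else
  other_pair=different           # Exit status 1.
fi

# Lines marked < come from the first file, lines marked > from the second.
marked=$(diff "$workdir/one" "$workdir/three" | grep -c '^[<>]')

echo "one/two: $same_pair; one/three: $other_pair ($marked marked lines)"
rm -r "$workdir"
```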

A common use for diff is generating difference files to be used with patch. The -e option outputs files suitable for ed or ex scripts.

patch: flexible versioning utility. Given a difference file generated by diff, patch can upgrade a previous version of a package to a newer version. It is much more convenient to distribute a relatively small "diff" file than the entire body of a newly revised package. Kernel "patches" have become the preferred method of distributing the frequent releases of the Linux kernel.

   patch -p1 <patch-file
   # Takes all the changes listed in 'patch-file'
   # and applies them to the files referenced therein.
   # This upgrades to a newer version of the package.

Patching the kernel:

   cd /usr/src
   gzip -cd patchXX.gz | patch -p0
   # Upgrading kernel source using 'patch'.
   # From the Linux kernel docs "README",
   # by anonymous author (Alan Cox?).

Note

The diff command can also compare directories (for the filenames present), and does so recursively when given the -r option.

 bash$ diff ~/notes1 ~/notes2
 Only in /home/bozo/notes1: file02
 Only in /home/bozo/notes1: file03
 Only in /home/bozo/notes2: file04
 	      

diff3

An extended version of diff that compares three files at a time. This command returns an exit value of 0 upon successful execution, but unfortunately this gives no information about the results of the comparison.

sdiff

Compare and/or edit two files in order to merge them into an output file. Because of its interactive nature, this command would find little use in a script.

cmp

The cmp command is a simpler version of diff, above. Whereas diff reports the differences between two files, cmp merely shows at what point they differ.

Note

Like diff, cmp returns an exit status of 0 if the compared files are identical, and 1 if they differ. This permits use in a test construct within a shell script.


Example 3-90. Using cmp to compare two files within a script.

   #!/bin/bash

   ARGS=2  # Two args to script expected.

   if [ $# -ne "$ARGS" ]
   then
     echo "Usage: `basename $0` file1 file2"
     exit 65
   fi


   cmp "$1" "$2" > /dev/null  # /dev/null buries the output of the "cmp" command.
   # Also works with 'diff', i.e.
   # diff "$1" "$2" > /dev/null

   if [ $? -eq 0 ]  # Test exit status of "cmp" command.
   then
     echo "File \"$1\" is identical to file \"$2\"."
   else
     echo "File \"$1\" differs from file \"$2\"."
   fi

   exit 0

comm

Versatile file comparison utility. The files must be sorted for this to be useful.

comm -options first-file second-file

comm file-1 file-2 outputs three columns:

  • column 1 = lines unique to file-1

  • column 2 = lines unique to file-2

  • column 3 = lines common to both.

The options allow suppressing output of one or more columns.

  • -1 suppresses column 1

  • -2 suppresses column 2

  • -3 suppresses column 3

  • -12 suppresses both columns 1 and 2, etc.
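
The column options listed above might be tried on two small sorted files. A minimal sketch, with invented file contents, in a scratch directory from mktemp:

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.
printf 'apple\nbanana\ncherry\n' > "$workdir/file-1"
printf 'banana\ncherry\ndate\n'  > "$workdir/file-2"

# -12 suppresses columns 1 and 2, leaving lines common to both files.
common=$(comm -12 "$workdir/file-1" "$workdir/file-2")

# -23 suppresses columns 2 and 3, leaving lines unique to file-1.
only_first=$(comm -23 "$workdir/file-1" "$workdir/file-2")

echo "Common to both:"
echo "$common"
echo "Only in file-1: $only_first"

rm -r "$workdir"
```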

uniq

This filter removes duplicate lines from a sorted file. It is often seen in a pipe coupled with sort.

   cat list-1 list-2 list-3 | sort | uniq > final.list
   # Concatenates the list files,
   # sorts them,
   # removes duplicate lines,
   # and finally writes the result to an output file.

The useful -c option prefixes each line of the input file with the number of occurrences.
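
A frequent use of the -c option is a simple frequency count: sort, count the repeats, then sort again by the count. The word list in this sketch is made up for the purpose.

```shell
#!/bin/bash
# Hypothetical word list with repeats.
tally=$(printf '%s\n' pear apple pear fig apple pear | sort | uniq -c | sort -nr)

echo "$tally"

# The most frequent word heads the list; awk trims the leading padding.
top=$(echo "$tally" | head -n 1 | awk '{ print $1, $2 }')
echo "Most frequent: $top"
```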

expand, unexpand

The expand filter converts tabs to spaces. It is often used in a pipe.

The unexpand filter converts spaces to tabs. This reverses the effect of expand.

cut

A tool for extracting fields from files. It is similar to the print $N command set in awk, but more limited. It may be simpler to use cut in a script than awk. Particularly important are the -d (delimiter) and -f (field specifier) options.

Using cut to obtain a listing of the mounted filesystems:

   cat /etc/mtab | cut -d ' ' -f1,2

Using cut to list the OS and kernel version:

   uname -a | cut -d" " -f1,3,11,12

cut -d ' ' -f2,3 filename is equivalent to awk -F'[ ]' '{ print $2, $3 }' filename

See also Example 3-108.
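
The -d and -f options work with any single-character delimiter. The sketch below applies them to an invented colon-separated record in the style of /etc/passwd, and repeats one extraction with awk for comparison.

```shell
#!/bin/bash
# A made-up record, laid out like a line of /etc/passwd.
record='bozo:x:500:500:Bozo Bozeman:/home/bozo:/bin/bash'

login=$(echo "$record" | cut -d ':' -f1)
home_and_shell=$(echo "$record" | cut -d ':' -f6,7)

# The awk counterpart of the first extraction.
login_awk=$(echo "$record" | awk -F: '{ print $1 }')

echo "Login: $login (awk agrees: $login_awk)"
echo "Fields 6 and 7: $home_and_shell"
```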

colrm

Column removal filter. This removes columns (characters) from a file and writes the file, lacking the range of specified columns, back to stdout. colrm 2 4 <filename removes the second through fourth characters from each line of the text file filename.

Warning

If the file contains tabs or nonprintable characters, this may cause unpredictable behavior. In such cases, consider using expand and unexpand in a pipe preceding colrm.

paste

Tool for merging together different files into a single, multi-column file. In combination with cut, useful for creating system log files.
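
A minimal sketch of paste at work, merging two invented single-column files line by line; the -d option picks the character placed between the columns (a tab by default).

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.
printf 'Shoes\nLaces\n' > "$workdir/items"
printf '40.00\n1.00\n'  > "$workdir/prices"

# Line 1 of each file is joined, then line 2, and so on.
merged=$(paste -d ' ' "$workdir/items" "$workdir/prices")

echo "$merged"
first_row=$(echo "$merged" | head -n 1)

rm -r "$workdir"
```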

join

Consider this a special-purpose cousin of paste. This powerful utility allows merging two files in a meaningful fashion, which essentially creates a simple version of a relational database.

The join command operates on exactly two files, but pastes together only those lines with a common tagged field (usually a numerical label), and writes the result to stdout. The files to be joined should be sorted according to the tagged field for the matchups to work properly.

   File: 1.data

   100 Shoes
   200 Laces
   300 Socks

   File: 2.data

   100 $40.00
   200 $1.00
   300 $2.00

 bash$ join 1.data 2.data
 File: 1.data 2.data

 100 Shoes $40.00
 200 Laces $1.00
 300 Socks $2.00
 	      

Note

The tagged field appears only once in the output.

head

lists the beginning of a file to stdout (the default is 10 lines, but this can be changed). It has a number of interesting options.


Example 3-91. Generating 10-digit random numbers

   #!/bin/bash
   # rnd.sh: Outputs a 10-digit random number

   # Script by Stephane Chazelas.

   head -c4 /dev/urandom | od -N4 -tu4 | sed -ne '1s/.* //p'


   # =================================================================== #

   # Analysis
   # --------

   # head:
   # -c4 option takes first 4 bytes.

   # od:
   # -N4 option limits output to 4 bytes.
   # -tu4 option selects unsigned decimal format for output.

   # sed:
   # -n option, in combination with "p" flag to the "s" command,
   # outputs only matched lines.



   # The author of this script explains the action of 'sed', as follows.

   # head -c4 /dev/urandom | od -N4 -tu4 | sed -ne '1s/.* //p'
   # ----------------------------------> |

   # Assume output up to "sed" --------> |
   # is 0000000 1198195154\n

   # sed begins reading characters: 0000000 1198195154\n.
   # Here it finds a newline character,
   # so it is ready to process the first line (0000000 1198195154).
   # It looks at its <range><action>s. The first and only one is

   #   range     action
   #   1         s/.* //p

   # The line number is in the range, so it executes the action:
   # tries to substitute the longest string ending with a space in the line
   # ("0000000 ") with nothing (//), and if it succeeds, prints the result
   # ("p" is a flag to the "s" command here, this is different from the "p" command).

   # sed is now ready to continue reading its input. (Note that before
   # continuing, if -n option had not been passed, sed would have printed
   # the line once again).

   # Now, sed reads the remainder of the characters, and finds the end of the file.
   # It is now ready to process its 2nd line (which is also numbered '$' as
   # it's the last one).
   # It sees it is not matched by any <range>, so its job is done.

   # In a few words, this sed command means:
   # "On the first line only, remove any character up to the right-most space,
   # then print it."

   # A better way to do this would have been:
   #           sed -e 's/.* //;q'

   # Here, two <range><action>s (could have been written
   #           sed -e 's/.* //' -e q):

   #   range                    action
   #   nothing (matches line)   s/.* //
   #   nothing (matches line)   q (quit)

   # Here, sed only reads its first line of input.
   # It performs both actions, and prints the line (substituted) before quitting
   # (because of the "q" action) since the "-n" option is not passed.

   # =================================================================== #

   # A simpler alternative to the above 1-line script would be:
   #           head -c4 /dev/urandom| od -An -tu4

   exit 0

See also Example 3-106.

tail

lists the end of a file to stdout (the default is 10 lines). Commonly used to keep track of changes to a system logfile, using the -f option, which outputs lines appended to the file.
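
The line count for both head and tail can be set with the -n option, and the two combine to pick out a line from the middle of a file. A sketch on an invented five-line file:

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.
printf '%s\n' line1 line2 line3 line4 line5 > "$workdir/log"

first_two=$(head -n 2 "$workdir/log")
last_two=$(tail -n 2 "$workdir/log")

# The first three lines, then the last of those: line 3.
middle=$(head -n 3 "$workdir/log" | tail -n 1)

echo "Start: $first_two"
echo "End:   $last_two"
echo "Line 3: $middle"

rm -r "$workdir"
```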


Example 3-92. Using tail to monitor the system log

   #!/bin/bash

   filename=sys.log

   cat /dev/null > $filename; echo "Creating / cleaning out file."
   # Creates file if it does not already exist,
   # and truncates it to zero length if it does.
   # : > filename   would also work.

   tail /var/log/messages > $filename
   # /var/log/messages must have world read permission for this to work.

   echo "$filename contains tail end of system log."

   exit 0

See also Example 3-86, Example 3-106 and Example 3-163.

grep

A multi-purpose file search tool that uses regular expressions. Originally a command/filter in the ancient ed line editor, g/re/p, or global - regular expression - print.

grep pattern [file...]

Searches the target file(s) for occurrences of pattern.

ls -l | grep '\.txt$' has the same effect as ls -l *.txt (although symbolic links may cause problems).

The -i option causes a case-insensitive search.

The -l option lists only the files in which matches were found, but not the matching lines.

The -v (or --invert-match) option filters out matches.

   grep pattern1 *.txt | grep -v pattern2

   # Matches all lines in "*.txt" files containing "pattern1",
   # but ***not*** "pattern2".

The -c (--count) option gives a numerical count of matches, rather than actually listing the matches.

   grep -c txt *.sgml   # (number of lines containing "txt" in each "*.sgml" file)


   #   grep -cz .
   #            ^ dot
   # means count (-c) zero-separated (-z) items matching "."
   # that is, non-empty ones (containing at least 1 character).
   #
   printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz .     # 4
   printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '$'   # 5
   printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -cz '^'   # 5
   #
   printf 'a b\nc  d\n\n\n\n\n\000\n\000e\000\000\nf' | grep -c '$'    # 9
   # By default, newline chars (\n) separate items to match.

   # Note that the -z option is GNU "grep" specific.


   # Thanks, S.C.

Example 3-163 demonstrates how to use grep to search for a word pattern in a system log file.

If there is a successful match, grep returns an exit status of 0, which makes it useful in a condition test in a script.
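
In practice, such a test is often written with the -q (quiet) option, which suppresses the output and leaves only the exit status. The sketch below uses an invented three-line log file and also exercises the -c, -i, and -v options described earlier.

```shell
#!/bin/bash
workdir=$(mktemp -d)             # Hypothetical scratch directory.
printf 'Error: disk full\nall is well\nerror: retry\n' > "$workdir/sys.log"

# grep -q prints nothing; only the exit status matters.
if grep -q 'disk full' "$workdir/sys.log"
then
  disk_alarm=yes
else
  disk_alarm=no
fi

error_lines=$(grep -ci 'error' "$workdir/sys.log")    # Case-insensitive count.
clean_lines=$(grep -vci 'error' "$workdir/sys.log")   # Lines that do NOT match.

echo "Disk alarm: $disk_alarm; $error_lines error lines, $clean_lines clean."
rm -r "$workdir"
```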


Example 3-93. Emulating "grep" in a script

   #!/bin/bash

   # Very crude reimplementation of 'grep'.

   if [ -z "$1" ]  # Check for argument to script.
   then
     echo "Usage: `basename $0` pattern"
     exit 65
   fi

   echo

   for file in *     # Traverse all files in $PWD.
   do
     output=$(sed -n /"$1"/p $file)  # Command substitution.

     if [ ! -z "$output" ]  # What happens if "$output" is not quoted?
     then
       echo -n "$file: "
       echo $output
     fi            #  sed -ne "/$1/s|^|${file}: |p"  is equivalent to above.

     echo
   done

   echo

   exit 0

   # Exercises for reader:
   # -------------------
   # 1) Add newlines to output, if more than one match in any given file.
   # 2) Add features.

Note

egrep is the same as grep -E. This uses a somewhat different, extended set of regular expressions, which can make the search somewhat more flexible.

Note

fgrep is the same as grep -F. It does a literal string search (no regular expressions), which generally speeds things up quite a bit.
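
The difference between the three matching modes shows up as soon as the pattern contains a regular expression metacharacter. A small sketch on three invented lines:

```shell
#!/bin/bash
# Three hypothetical input lines.
sample=$(printf 'a.c\nabc\na+c\n')

# Basic regular expression: "." matches any single character.
regex_hits=$(echo "$sample" | grep -c 'a.c')

# Fixed string (fgrep): "." is just a dot.
literal_hits=$(echo "$sample" | grep -cF 'a.c')

# Extended regular expression (egrep): alternation with ( | ).
extended_hits=$(echo "$sample" | grep -cE 'a(b|\+)c')

echo "regex: $regex_hits, literal: $literal_hits, extended: $extended_hits"
```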

Tip

To search compressed files, use zgrep. It also works on non-compressed files, though slower than plain grep. This is handy for searching through a mixed set of files, some of them compressed, some not.

look

The command look works like grep, but does a lookup on a "dictionary", a sorted word list. By default, look searches for a match in /usr/dict/words, but a different dictionary file may be specified.


Example 3-94. Checking words in a list for validity

   #!/bin/bash
   # lookup:
   # Does a dictionary lookup on each word in a data file.

   file=words.data  # Data file from which to read words to test.

   echo

   while [ "$word" != end ]  # Last word in data file.
   do
     read word   # From data file, because of redirection at end of loop.
     look $word > /dev/null  # Don't want to display lines in dictionary file.
     lookup=$?   # Exit status of 'look' command.

     if [ "$lookup" -eq 0 ]
     then
       echo "\"$word\" is valid."
     else
       echo "\"$word\" is invalid."
     fi

   done <"$file"  # Redirects stdin to $file, so "reads" come from there.

   echo

   exit 0

   # ----------------------------------------------------------------------
   # Code below this line will not execute because of "exit" command above.


   # Stephane Chazelas proposes the following, more concise alternative:

   while read word && [[ $word != end ]]
   do if look "$word" > /dev/null
      then echo "\"$word\" is valid."
      else echo "\"$word\" is invalid."
      fi
   done <"$file"

   exit 0

sed, awk

Scripting languages especially suited for parsing text files and command output. May be embedded singly or in combination in pipes and shell scripts.

sed

Non-interactive "stream editor", permits using many ex commands in batch mode. It finds many uses in shell scripts.

awk

Programmable file extractor and formatter, good for manipulating and/or extracting fields (columns) in structured text files. Its syntax is similar to C.
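
As a taste of the two tools working together in a pipe, the sketch below operates on an invented three-line inventory whose columns are an id, an item name, and a quantity.

```shell
#!/bin/bash
# Hypothetical inventory: id, item, quantity.
inventory=$(printf '100 Shoes 4\n200 Laces 1\n300 Socks 2\n')

# awk extracts the second field; sed -n '2p' prints only line 2.
second_item=$(echo "$inventory" | awk '{ print $2 }' | sed -n '2p')

# sed substitutes text on the fly.
renamed=$(echo "$inventory" | sed 's/Laces/Shoelaces/' | sed -n '2p')

# awk sums the third field over all lines.
total=$(echo "$inventory" | awk '{ sum += $3 } END { print sum }')

echo "Second item: $second_item"
echo "After substitution: $renamed"
echo "Total quantity: $total"
```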

groff, gs, TeX

Groff, TeX, and PostScript are text markup languages used for preparing copy for printing or formatted video display.

Man pages use groff (see Example A-1). Ghostscript (gs) is a GPL PostScript interpreter. TeX is Donald Knuth's elaborate typesetting system. It is often convenient to write a shell script encapsulating all the options and arguments passed to one of these markup languages.

wc

wc gives a "word count" on a file or I/O stream:

 bash$ wc /usr/doc/sed-3.02/README
 20     127     838 /usr/doc/sed-3.02/README
 [20 lines  127 words  838 characters]

wc -w gives only the word count.

wc -l gives only the line count.

wc -c gives only the byte count (the same as the character count for plain ASCII text).

wc -L gives only the length of the longest line.
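
A short sketch of these options, run on a throwaway file whose contents are made up for the example. Redirecting the file into wc keeps the filename out of the output, which makes the count easier to capture in a variable. The tr -d ' ' step is a precaution, since some versions of wc pad their output with spaces.

```shell
#!/bin/bash
tmpfile=$(mktemp)                      # Scratch file for the demonstration.
printf 'one two three\nfour five\n' > "$tmpfile"

lines=$(wc -l < "$tmpfile" | tr -d ' ')
words=$(wc -w < "$tmpfile" | tr -d ' ')
bytes=$(wc -c < "$tmpfile" | tr -d ' ')

echo "$lines lines, $words words, $bytes bytes"

rm -f "$tmpfile"
```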

Using wc to count how many .txt files are in current working directory:

   1 $ ls *.txt | wc -l
   2 # Will work as long as none of the "*.txt" files have a linefeed in their name.
   3 
   4 # Alternative ways of doing this are:
   5 #      find . -maxdepth 1 -name \*.txt -print0 | grep -cz .
   6 #      (shopt -s nullglob; set -- *.txt; echo $#)
   7 
   8 # Thanks, S.C.

See also Example 3-106 and Example 3-122.

Certain commands include some of the functionality of wc as options.

   1 ... | grep foo | wc -l
   2 # This frequently used construct can be more concisely rendered.
   3 
   4 ... | grep -c foo
   5 # Just use the "-c" (or "--count") option of grep.
   6 
   7 # Thanks, S.C.

tr

Character translation filter.

Caution

Must use quoting and/or brackets, as appropriate. Quotes prevent the shell from reinterpreting the special characters in tr command sequences. Brackets should be quoted to prevent expansion by the shell.

Either tr "A-Z" "*" <filename or tr A-Z \* <filename changes all the uppercase letters in filename to asterisks (writes to stdout). On some systems this may not work, but tr A-Z '[**]' will.

The -d option deletes a range of characters.

   1 tr -d 0-9 <filename
   2 # Deletes all digits from the file "filename".

The --squeeze-repeats (or -s) option deletes all but the first instance of a string of consecutive identical characters. This option is useful for removing excess whitespace.

 bash$ echo "XXXXX" | tr --squeeze-repeats 'X'
 X
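
The whitespace case might be sketched as follows, together with a -d deletion that uses a character class. The input strings and variable names are invented for the example.

```shell
#!/bin/bash
messy="too     many      spaces"

# -s squeezes each run of spaces down to a single space.
clean=$(echo "$messy" | tr -s ' ')
echo "$clean"

# -d with a character class deletes every digit.
letters_only=$(echo "abc123def" | tr -d '[:digit:]')
echo "$letters_only"
```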


Example 3-95. toupper: Transforms a file to all uppercase.

   1 #!/bin/bash
   2 # Changes a file to all uppercase.
   3 
   4 E_BADARGS=65
   5 
   6 if [ -z "$1" ]
   7 # Standard check whether command line arg is present.
   8 then
   9   echo "Usage: `basename $0` filename"
  10   exit $E_BADARGS
  11 fi  
  12 
  13 tr a-z A-Z <"$1"
  14 
  15 # Same effect as above, but using character set notation:
  16 #        tr '[:lower:]' '[:upper:]' <"$1"
  17 # Thanks, S.C.
  18 
  19 exit 0


Example 3-96. lowercase: Changes all filenames in working directory to lowercase.

   1 #! /bin/bash
   2 #
   3 # Changes every filename in working directory to all lowercase.
   4 #
   5 # Inspired by a script of john dubois,
   6 # which was translated into bash by Chet Ramey,
   7 # and considerably simplified by Mendel Cooper, author of this document.
   8 
   9 
  10 for filename in *  #Traverse all files in directory.
  11 do
  12    fname=`basename $filename`
  13    n=`echo $fname | tr A-Z a-z`  #Change name to lowercase.
  14    if [ "$fname" != "$n" ]  # Rename only files not already lowercase.
  15    then
  16      mv $fname $n
  17    fi  
  18 done   
  19 
  20 exit 0
  21 
  22 
  23 # Code below this line will not execute because of "exit".
  24 #--------------------------------------------------------#
  25 # To run it, delete script above line.
  26 
  27 # The above script will not work on filenames containing blanks or newlines.
  28 
  29 # Stephane Chazelas therefore suggests the following alternative:
  30 
  31 
  32 for filename in *
  33 # Not necessary to use basename, since "*" won't return any file containing /
  34 do n=`echo "$filename/" | tr '[:upper:]' '[:lower:]'`
  35 #                    Slash added so that trailing newlines are not
  36 #                    removed by command substitution.
  37    # Variable substitution:
  38    n=${n%/}   #Removes trailing slash, added above, from filename.
  39    [[ $filename == $n ]] || mv "$filename" "$n"
  40    # Checks if filename already lowercase.
  41 done
  42 
  43 exit 0


Example 3-97. du: DOS to UNIX text file conversion.

   1 #!/bin/bash
   2 # du.sh: DOS to UNIX text file converter.
   3 
   4 WRONGARGS=65
   5 
   6 if [ -z "$1" ]
   7 then
   8   echo "Usage: `basename $0` filename-to-convert"
   9   exit $WRONGARGS
  10 fi
  11 
  12 NEWFILENAME=$1.unx
  13 
  14 CR='\015'  # Carriage return.
  15 # Lines in DOS text files end in a CR-LF.
  16 
  17 tr -d $CR < "$1" > "$NEWFILENAME"
  18 # Delete CR and write to new file.
  19 
  20 echo "Original DOS text file is \"$1\"."
  21 echo "Converted UNIX text file is \"$NEWFILENAME\"."
  22 
  23 exit 0


Example 3-98. rot13: rot13, ultra-weak encryption.

   1 #!/bin/bash
   2 # Classic rot13 algorithm, encryption that might fool a 3-year old.
   3 # Usage: ./rot13.sh filename
   4 # or     ./rot13.sh <filename
   5 # or     ./rot13.sh and supply keyboard input (stdin)
   6 
   7 cat "$@" | tr 'a-zA-Z' 'n-za-mN-ZA-M'   # "a" goes to "n", "b" to "o", etc.
   8 # The 'cat "$@"' construction
   9 # permits getting input either from stdin or from files.
  10 
  11 exit 0

fold

A filter that wraps lines of input to a specified width. This is especially useful with the -s option, which breaks lines at word spaces (see Example 3-99 and Example A-2).
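
A minimal sketch of fold -s at work, assuming an arbitrary sample sentence and a width of 20 columns:

```shell
#!/bin/bash
text="The quick brown fox jumps over the lazy dog"

# Break at blanks (-s) so that no output line exceeds 20 columns (-w 20).
wrapped=$(echo "$text" | fold -s -w 20)

echo "$wrapped"
```

Without -s, fold would cut each line at exactly the given width, even in the middle of a word.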

fmt

Simple-minded file formatter, used as a filter in a pipe to "wrap" long lines of text output.


Example 3-99. Formatted file listing.

   1 #!/bin/bash
   2 
   3 # Get a file listing...
   4 
   5 b=`ls /usr/local/bin`
   6 
   7 # ...40 columns wide.
   8 echo $b | fmt -w 40
   9 
  10 # Could also have been done by
  11 # echo $b | fold - -s -w 40
  12  
  13 exit 0

See also Example 3-86.

Tip

A powerful alternative to fmt is Kamil Toman's par utility, available from http://www.cs.berkeley.edu/~amc/Par/.

ptx

The ptx [targetfile] command outputs a permuted index (cross-reference list) of the targetfile. This may be further filtered and formatted in a pipe, if necessary.

column

Column formatter. This filter transforms list-type text output into a "pretty-printed" table by inserting tabs at appropriate places.


Example 3-100. Using column to format a directory listing

   1 #!/bin/bash
   2 # This is a slight modification of the example file in the "column" man page.
   3 
   4 
   5 (printf "PERMISSIONS LINKS OWNER GROUP SIZE MONTH DAY HH:MM PROG-NAME\n" \
   6 ; ls -l | sed 1d) | column -t
   7 
   8 # The "sed 1d" in the pipe deletes the first line of output,
   9 # which would be "total        N",
  10 # where "N" is the total number of disk blocks used by the files listed.
  11 
  12 # The -t option to "column" pretty-prints a table.
  13 
  14 exit 0

nl

Line numbering filter. nl filename lists filename to stdout, but inserts consecutive numbers at the beginning of each non-blank line. If filename is omitted, nl operates on stdin.


Example 3-101. nl: A self-numbering script.

   1 #!/bin/bash
   2 
   3 # This file echoes itself twice to stdout with its lines numbered.
   4 
   5 # 'nl' sees this as line 3 since it does not number blank lines.
   6 # 'cat -n' sees the above line as number 5.
   7 
   8 nl `basename $0`
   9 
  10 echo; echo  # Now, let's try it with 'cat -n'
  11 
  12 cat -n `basename $0`
  13 # The difference is that 'cat -n' numbers the blank lines.
  14 # Note that 'nl -ba' will also do so.
  15 
  16 exit 0

pr

Print formatting filter. This will paginate files (or stdout) into sections suitable for hard copy printing or viewing on screen. Various options permit row and column manipulation, joining lines, setting margins, numbering lines, adding page headers, and merging files, among other things. The pr command combines much of the functionality of nl, paste, fold, column, and expand.

pr -o 5 --width=65 fileZZZ | more gives a nice paginated listing to screen of fileZZZ with margins set at 5 and 65.

A particularly useful option is -d, forcing double-spacing (same effect as sed -G).

gettext

A GNU utility for localization and translating the text output of programs into foreign languages. While primarily intended for C programs, gettext also finds use in shell scripts. See the info page.

3.10.5. File and Archiving Commands

tar

The standard UNIX archiving utility. Originally a Tape ARchiving program, it has developed into a general purpose package that can handle all manner of archiving with all types of destination devices, ranging from tape drives to regular files to even stdout (see Example 3-5). GNU tar has long since been patched to accept gzip compression options, such as tar czvf archive-name.tar.gz *, which recursively archives and compresses all files in a directory tree, except dotfiles in the current working directory (the shell's * does not match them).

Some useful tar options:

  1. -c create (a new archive)

  2. --delete delete (files from the archive)

  3. -r append (files to the archive)

  4. -t list (archive contents)

  5. -u update archive

  6. -x extract (files from the archive)

  7. -z gzip the archive
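
The create, list, and extract options above can be sketched as a round trip in a scratch directory. The directory and file names are hypothetical, and the -C option (change to a directory before operating), which is not in the list above, is assumed to be available, as it is in GNU tar.

```shell
#!/bin/bash
workdir=$(mktemp -d)                   # Scratch area for the demonstration.
mkdir -p "$workdir/project/docs"
echo "hello" > "$workdir/project/docs/readme.txt"

# -c create, -z gzip, -f archive file name.
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# -t list the archive contents.
listing=$(tar -tzf "$workdir/project.tar.gz")
echo "$listing"

# -x extract into a separate directory.
mkdir "$workdir/restore"
tar -xzf "$workdir/project.tar.gz" -C "$workdir/restore"
restored=$(cat "$workdir/restore/project/docs/readme.txt")
echo "Restored file contains: $restored"

rm -rf "$workdir"
```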

Caution

It may be difficult to recover data from a corrupted gzipped tar archive. When archiving important files, make multiple backups.

cpio

This specialized archiving copy command is rarely seen any more, having been supplanted by tar/gzip. It still has its uses, such as moving a directory tree.


Example 3-102. Using cpio to move a directory tree

   1 #!/bin/bash
   2 
   3 # Copying a directory tree using cpio.
   4 
   5 ARGS=2
   6 
   7 if [ $# -ne "$ARGS" ]
   8 then
   9   echo Usage: `basename $0` source destination
  10   exit 65
  11 fi  
  12 
  13 source=$1
  14 destination=$2
  15 
  16 find "$source" -depth | cpio -admvp "$destination"
  17 
  18 exit 0

install

Special purpose file copying command, similar to cp, but capable of setting permissions and attributes of the copied files. This command seems tailor-made for installing software packages, and as such it shows up frequently in Makefiles (in the make install : section). It could likewise find use in installation scripts.
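
A minimal sketch of install in an installation script, copying a file and setting its mode in one step. It works in a scratch directory rather than a real system location, and the file names are made up for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)
echo 'echo "hello"' > "$workdir/hello.sh"
mkdir "$workdir/bin"

# Copy the file and set its permissions (-m 755) in a single command.
install -m 755 "$workdir/hello.sh" "$workdir/bin/hello"

# Show the permission string of the installed copy.
perms=$(ls -l "$workdir/bin/hello" | cut -c1-10)
echo "Installed with permissions: $perms"

output=$(sh "$workdir/bin/hello")
rm -rf "$workdir"
```

With cp, the same result would take a separate chmod afterwards.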

gzip

The standard GNU/UNIX compression utility, replacing the inferior and proprietary compress. The corresponding decompression command is gunzip, which is the equivalent of gzip -d.

The zcat filter decompresses a gzipped file to stdout, as possible input to a pipe or redirection. This is, in effect, a cat command that works on compressed files (including files processed with the older compress utility). The zcat command is equivalent to gzip -dc.

Caution

On some commercial UNIX systems, zcat is a synonym for uncompress -c, and will not work on gzipped files.

See also Example 3-19.
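
A short sketch of the compress, peek, and decompress cycle, using a throwaway file with invented contents. It uses gzip -dc rather than zcat, in light of the caution above.

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf 'line one\nline two\n' > "$workdir/notes.txt"

gzip "$workdir/notes.txt"              # Replaces notes.txt with notes.txt.gz.

# Decompress to stdout only; the .gz file stays in place.
first=$(gzip -dc "$workdir/notes.txt.gz" | head -n 1)
echo "First line: $first"

gunzip "$workdir/notes.txt.gz"         # Restores notes.txt.
restored=$(cat "$workdir/notes.txt")

rm -rf "$workdir"
```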

bzip2

An alternate compression utility, usually more efficient than gzip, especially on large files. The corresponding decompression command is bunzip2.

compress, uncompress

This is an older, proprietary compression utility found in commercial UNIX distributions. The more efficient gzip has largely replaced it. Linux distributions generally include a compress workalike for compatibility, although gunzip can unarchive files treated with compress.

sq

Yet another compression utility, a filter that works only on sorted ASCII word lists. It uses the standard invocation syntax for a filter, sq < input-file > output-file. Fast, but not nearly as efficient as gzip. The corresponding uncompression filter is unsq, invoked like sq.

Tip

The output of sq may be piped to gzip for further compression.

shar

Shell archiving utility. The files in a shell archive are concatenated without compression, and the resultant archive is essentially a shell script, complete with #!/bin/sh header, and containing all the necessary unarchiving commands. Shar archives still show up in Internet newsgroups, but otherwise shar has been pretty well replaced by tar/gzip. The unshar command unpacks shar archives.

split

Utility for splitting a file into smaller chunks. Usually used for splitting up large files in order to back them up on floppies or preparatory to e-mailing or uploading them.
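
As a sketch, the following splits a 2500-byte scratch file into 1000-byte pieces and then reassembles them with cat. The file names and the chunk. prefix are arbitrary choices for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Create a 2500-byte file to work on.
dd if=/dev/zero of="$workdir/big.dat" bs=500 count=5 2> /dev/null

# Split into pieces of at most 1000 bytes, named chunk.aa, chunk.ab, ...
split -b 1000 "$workdir/big.dat" "$workdir/chunk."
pieces=$(ls "$workdir"/chunk.* | wc -l | tr -d ' ')
echo "Number of pieces: $pieces"

# The shell sorts the glob, so cat reassembles the pieces in order.
cat "$workdir"/chunk.* > "$workdir/rejoined.dat"

if cmp -s "$workdir/big.dat" "$workdir/rejoined.dat"
then
  same=yes
else
  same=no
fi
echo "Rejoined file identical to original: $same"

rm -rf "$workdir"
```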

file

A utility for identifying file types. The command file file-name will return a file specification for file-name, such as ascii text or data. It references the magic numbers found in /usr/share/magic, /etc/magic, or /usr/lib/magic, depending on the Linux/UNIX distribution.


Example 3-103. Stripping comments from C program files

   1 #!/bin/bash
   2 
   3 # Strips out the comments (/* comment */) in a C program.
   4 
   5 NOARGS=0
   6 ARGERROR=65
   7 WRONG_FILE_TYPE=66
   8 
   9 if [ $# -eq "$NOARGS" ]
  10 then
  11   echo "Usage: `basename $0` C-program-file" >&2 # Error message to stderr.
  12   exit $ARGERROR
  13 fi  
  14 
  15 # Test for correct file type.
  16 type=`eval file $1 | awk '{ print $2, $3, $4, $5 }'`
  17 # "file $1" echoes file type...
  18 # then awk removes the first field of this, the filename...
  19 # then the result is fed into the variable "type".
  20 correct_type="ASCII C program text"
  21 
  22 if [ "$type" != "$correct_type" ]
  23 then
  24   echo
  25   echo "This script works on C program files only."
  26   echo
  27   exit $WRONG_FILE_TYPE
  28 fi  
  29 
  30 
  31 # Rather cryptic sed script:
  32 #--------
  33 sed '
  34 /^\/\*/d
  35 /.*\/\*/d
  36 ' $1
  37 #--------
  38 # Easy to understand if you take several hours to learn sed fundamentals.
  39 
  40 
  41 # Need to add one more line to the sed script to deal with
  42 # case where line of code has a comment following it on same line.
  43 # This is left as a non-trivial exercise for the reader.
  44 
  45 # Also, the above code deletes lines with a "*/" or "/*",
  46 # not a desirable result.
  47 
  48 exit 0
  49 
  50 
  51 # ----------------------------------------------------------------
  52 # Code below this line will not execute because of 'exit 0' above.
  53 
  54 # Stephane Chazelas suggests the following alternative:
  55 
  56 usage() {
  57   echo "Usage: `basename $0` C-program-file" >&2
  58   exit 1
  59 }
  60 
  61 WEIRD=`echo -n -e '\377'`   # or WEIRD=$'\377'
  62 [[ $# -eq 1 ]] || usage
  63 case `file "$1"` in
  64   *"C program text"*) sed -e "s%/\*%${WEIRD}%g;s%\*/%${WEIRD}%g" "$1" \
  65      | tr '\377\n' '\n\377' \
  66      | sed -ne 'p;n' \
  67      | tr -d '\n' | tr '\377' '\n';;
  68   *) usage;;
  69 esac
  70 
  71 # This is still fooled by things like:
  72 # printf("/*");
  73 # or
  74 # /*  /* buggy embedded comment */
  75 #
  76 # To handle all special cases (comments in strings, comments in string
  77 # where there is a \", \\" ...) the only way is to write a C parser
  78 # (lex or yacc perhaps?).
  79 
  80 exit 0

which

which command-xxx gives the full path to "command-xxx". This is useful for finding out whether a particular command or utility is installed on the system.

 bash$ which rm
 /usr/bin/rm

whereis

Similar to which, above, whereis command-xxx gives the full path to "command-xxx", but also to its man page.

 bash$ whereis rm
 rm: /bin/rm /usr/share/man/man1/rm.1.bz2

whatis

whatis filexxx looks up "filexxx" in the whatis database. This is useful for identifying system commands and important configuration files. Consider it a simplified man command.

 bash$ whatis whatis
 whatis               (1)  - search the whatis database for complete words


Example 3-104. Exploring /usr/X11R6/bin

   1 #!/bin/bash
   2 
   3 # What are all those mysterious binaries in /usr/X11R6/bin?
   4 
   5 DIRECTORY="/usr/X11R6/bin"
   6 # Try also "/bin", "/usr/bin", "/usr/local/bin", etc.
   7 
   8 for file in $DIRECTORY/*
   9 do
  10   whatis `basename $file`
  11   # Echoes info about the binary.
  12 done
  13 
  14 exit 0
  15 # You may wish to redirect output of this script, like so:
  16 # ./what.sh >>whatis.db
  17 # or view it a page at a time on stdout,
  18 # ./what.sh | less

See also Example 3-45.

locate, slocate

The locate command searches for files using a database stored for just that purpose. The slocate command is the secure version of locate (which may be aliased to slocate).

 bash$ locate hickson
 /usr/lib/xephem/catalogs/hickson.edb

basename

Strips the path information from a file name, printing only the file name. The construction basename $0 lets the script know its name, that is, the name it was invoked by. This can be used for "usage" messages if, for example, a script is called with missing arguments:

   1 echo "Usage: `basename $0` arg1 arg2 ... argn"

dirname

Strips the basename from a file name, printing only the path information.

Note

basename and dirname can operate on any arbitrary string. The filename given as an argument does not need to refer to an existing file.


Example 3-105. basename and dirname

   1 #!/bin/bash
   2 
   3 a=/home/heraclius/daily-journal.txt
   4 
   5 echo "Basename of /home/heraclius/daily-journal.txt = `basename $a`"
   6 echo "Dirname of /home/heraclius/daily-journal.txt = `dirname $a`"
   7 
   8 exit 0
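
A few further variations, offered as a sketch. The optional second argument to basename strips a suffix, and Bash parameter expansion can often give the same results without calling an external command. The variable names are arbitrary.

```shell
#!/bin/bash
path=/home/heraclius/daily-journal.txt

base=$(basename "$path")         # daily-journal.txt
dir=$(dirname "$path")           # /home/heraclius
stem=$(basename "$path" .txt)    # daily-journal (suffix stripped)

# Parameter expansion equivalents for this path:
base2=${path##*/}                # Remove everything through the last slash.
dir2=${path%/*}                  # Remove the last slash and what follows.

echo "$base  $dir  $stem"
echo "$base2  $dir2"
```

The expansions differ from the commands in edge cases, such as a path with a trailing slash or with no slash at all, so they are not drop-in replacements everywhere.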

uuencode

This utility encodes binary files into ASCII characters, making them suitable for transmission in the body of an e-mail message or in a newsgroup posting.

uudecode

This reverses the encoding, decoding uuencoded files back into the original binaries.


Example 3-106. uudecoding encoded files

   1 #!/bin/bash
   2 
   3 lines=35
   4 # Allow 35 lines for the header (very generous).
   5 
   6 for File in *
   7 # Test all the files in the current working directory...
   8 do
   9   search1=`head -$lines $File | grep begin | wc -w`
  10   search2=`tail -$lines $File | grep end | wc -w`
  11   # Uuencoded files have a "begin" near the beginning, and an "end" near the end.
  12   if [ "$search1" -gt 0 ]
  13   then
  14     if [ "$search2" -gt 0 ]
  15     then
  16       echo "uudecoding - $File -"
  17       uudecode $File
  18     fi  
  19   fi
  20 done  
  21 
  22 # Note that running this script upon itself fools it
  23 # into thinking it is a uuencoded file,
  24 # because it contains both "begin" and "end".
  25 
  26 exit 0

Tip

The fold -s command may be useful (possibly in a pipe) to process long uudecoded text messages downloaded from Usenet newsgroups.

sum, cksum, md5sum

These are utilities for generating checksums. A checksum is a number mathematically calculated from the contents of a file, for the purpose of checking its integrity. A script might refer to a list of checksums for security purposes, such as ensuring that the contents of key system files have not been altered or corrupted. The md5sum command is the most appropriate of these in security applications.
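
A sketch of the kind of integrity check described above, assuming GNU md5sum and its -c (check) option. The file names are hypothetical, and a real script would keep the checksum list somewhere an intruder could not rewrite it.

```shell
#!/bin/bash
workdir=$(mktemp -d)
echo "important settings" > "$workdir/config.txt"

# Record the checksum of the file while it is known to be good.
( cd "$workdir" && md5sum config.txt > checksums.md5 )

# Later: verify. md5sum -c returns nonzero if any file fails to match.
if ( cd "$workdir" && md5sum -c checksums.md5 > /dev/null 2>&1 )
then
  before=unchanged
else
  before=altered
fi

echo "tampered" >> "$workdir/config.txt"     # Simulate an alteration.

if ( cd "$workdir" && md5sum -c checksums.md5 > /dev/null 2>&1 )
then
  after=unchanged
else
  after=altered
fi

echo "First check: $before.  Second check: $after."
rm -rf "$workdir"
```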

crypt

At one time, this was the standard UNIX file encryption utility. [2] Politically motivated government regulations prohibiting the export of encryption software resulted in the disappearance of crypt from much of the UNIX world, and it is still missing from most Linux distributions. Fortunately, programmers have come up with a number of decent alternatives to it, among them the author's very own cruft (see Example A-4).

strings

Use the strings command to find printable strings in a binary or data file. It will list sequences of printable characters found in the target file. This might be handy for a quick 'n dirty examination of a core dump or for looking at an unknown graphic image file (strings image-file | more might show something like JFIF, which would identify the file as a jpeg graphic). In a script, you would probably parse the output of strings with grep or sed. See Example 3-49 and Example 3-50.

more, less

Pagers that display a text file or stream to stdout, one screenful at a time. These may be used to filter the output of a script.

3.10.6. Communications Commands

host

Searches for information about an Internet host by name or IP address, using DNS.

vrfy

Verify an Internet e-mail address.

nslookup

Do an Internet "name server lookup" on a host by IP address. This may be run either interactively or noninteractively, i.e., from within a script.

dig

Similar to nslookup, do an Internet "name server lookup" on a host. May be run either interactively or noninteractively, i.e., from within a script.

traceroute

Trace the route taken by packets sent to a remote host. This command works within a LAN, WAN, or over the Internet. The remote host may be specified by an IP address. The output of this command may be filtered by grep or sed in a pipe.

rcp

"Remote copy", copies files between two different networked machines. Using rcp and similar utilities with security implications in a shell script may not be advisable. Consider using an expect script instead.

sx, rx

The sx and rx command set serves to transfer files to and from a remote host using the xmodem protocol. These are generally part of a communications package, such as minicom.

sz, rz

The sz and rz command set serves to transfer files to and from a remote host using the zmodem protocol. Zmodem has certain advantages over xmodem, such as greater transmission rate and resumption of interrupted file transfers. Like sx and rx, these are generally part of a communications package.

write

This is a utility for terminal-to-terminal communication. It allows sending lines from your terminal (console or xterm) to that of another user. The mesg command may, of course, be used to disable write access to a terminal.

Since write is interactive, it would not normally find use in a script.

uucp

UNIX to UNIX copy. This is a communications package for transferring files between UNIX servers. A shell script is an effective way to handle a uucp command sequence.

Since the advent of the Internet and e-mail, uucp seems to have faded into obscurity, but it still exists and remains perfectly workable in situations where an Internet connection is not available or appropriate.

3.10.7. Terminal Control Commands

tput

Initialize terminal and/or fetch information about it from terminfo data. Various options permit certain terminal operations. tput clear is the equivalent of clear, below. tput reset is the equivalent of reset, below.

 bash$ tput longname
 xterm terminal emulator (XFree86 4.0 Window System)
 	      

Note that stty offers a more powerful command set for controlling a terminal.

reset

Reset terminal parameters and clear text screen. As with clear, the cursor and prompt reappear in the upper lefthand corner of the terminal.

clear

The clear command simply clears the text screen at the console or in an xterm. The prompt and cursor reappear at the upper lefthand corner of the screen or xterm window. This command may be used either at the command line or in a script. See Example 3-64.

script

This utility records (saves to a file) all the user keystrokes at the command line in a console or an xterm window. This, in effect, creates a record of a session.

3.10.8. Math Commands

factor

Factor an integer into prime factors.

 bash$ factor 27417
 27417: 3 13 19 37
 	      

bc, dc

These are flexible, arbitrary precision calculation utilities.

bc has a syntax vaguely resembling C.

dc uses RPN ("Reverse Polish Notation").

Of the two, bc seems more useful in scripting. It is a fairly well-behaved UNIX utility, and may therefore be used in a pipe.

Bash can't handle floating point calculations, and it lacks operators for certain important mathematical functions. Fortunately, bc comes to the rescue.

Here is a simple template for using bc to calculate a script variable.

 	      variable=$(echo "OPTIONS; OPERATIONS" | bc)
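
Filled in with concrete values, the template might look like the sketch below. The variable names and numbers are illustrative only, and bc must be installed on the system.

```shell
#!/bin/bash
# "scale" sets the number of decimal places; bc truncates rather than rounds.
ratio=$(echo "scale=3; 22/7" | bc)
root=$(echo "scale=4; sqrt(2)" | bc)

echo "22/7 to three places: $ratio"    # 3.142
echo "Square root of 2:     $root"     # 1.4142
```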
 	      


Example 3-107. Monthly Payment on a Mortgage

   1 #!/bin/bash
   2 # monthlypmt.sh: Calculates monthly payment on a mortgage.
   3 
   4 
   5 # This is a modification of code in the "mcalc" (mortgage calculator) package,
   6 # by Jeff Schmidt and Mendel Cooper (yours truly, the author of this document).
   7 #   http://www.ibiblio.org/pub/Linux/apps/financial/mcalc-1.6.tar.gz  [15k]
   8 
   9 echo
  10 echo "Given the principal, interest rate, and term of a mortgage,"
  11 echo "calculate the monthly payment."
  12 
  13 bottom=1.0
  14 
  15 echo
  16 echo -n "Enter principal (no commas) "
  17 read principal
  18 echo -n "Enter interest rate (percent) "  # If 12%, enter "12", not ".12".
  19 read interest_r
  20 echo -n "Enter term (months) "
  21 read term
  22 
  23 
  24  interest_r=$(echo "scale=9; $interest_r/100.0" | bc) # Convert to decimal.
  25  # "scale" determines how many decimal places.
  26   
  27 
  28  interest_rate=$(echo "scale=9; $interest_r/12 + 1.0" | bc)
  29  
  30 
  31  top=$(echo "scale=9; $principal*$interest_rate^$term" | bc)
  32 
  33  echo; echo "Please be patient. This may take a while."
  34 
  35  let "months = $term - 1"
  36  for ((x=$months; x > 0; x--))
  37  do
  38    bot=$(echo "scale=9; $interest_rate^$x" | bc)
  39    bottom=$(echo "scale=9; $bottom+$bot" | bc)
  40 #  bottom = $(($bottom + $bot"))
  41  done
  42 
  43 # let "payment = $top/$bottom"
  44  payment=$(echo "scale=2; $top/$bottom" | bc)
  45  # Use two decimal places for dollars and cents.
  46  
  47  echo
  48  echo "monthly payment = \$$payment"  # Echo a dollar sign in front of amount.
  49  echo
  50 
  51 
  52  exit 0
  53 
  54  # Exercises:
  55  #   1) Filter input to permit commas in principal amount.
  56  #   2) Filter input to permit interest to be entered as percent or decimal.
  57  #   3) If you are really ambitious,
  58  #      expand this script to print complete amortization tables.


Example 3-108. Base Conversion

   1 :
   2 ##########################################################################
   3 # Shellscript:	base.sh - print number to different bases (Bourne Shell)
   4 # Author     :	Heiner Steven (heiner.steven@odn.de)
   5 # Date       :	07-03-95
   6 # Category   :	Desktop
   7 # $Id: base.sh,v 1.2 2000/02/06 19:55:35 heiner Exp $
   8 ##########################################################################
   9 # Description
  10 #
  11 # Changes
  12 # 21-03-95 stv	fixed error occuring with 0xb as input (0.2)
  13 ##########################################################################
  14 
  15 # ==> Used in this document with the script author's permission.
  16 # ==> Comments added by document author.
  17 
  18 PN=`basename "$0"`			# Program name
  19 VER=`echo '$Revision: 1.2 $' | cut -d' ' -f2`  # ==> VER=1.2
  20 
  21 Usage () {
  22     echo "$PN - print number to different bases, $VER (stv '95)
  23 usage: $PN [number ...]
  24 
  25 If no number is given, the numbers are read from standard input.
  26 A number may be
  27     binary (base 2)		starting with 0b (i.e. 0b1100)
  28     octal (base 8)		starting with 0  (i.e. 014)
  29     hexadecimal (base 16)	starting with 0x (i.e. 0xc)
  30     decimal			otherwise (i.e. 12)" >&2
  31     exit 65
  32 }
  33 
  34 Msg () {
  35     for i   # ==> in [list] missing.
  36     do echo "$PN: $i" >&2
  37     done
  38 }
  39 
  40 Fatal () { Msg "$@"; exit 66; }
  41 
  42 PrintBases () {
  43     # Determine base of the number
  44     for i      # ==> in [list] missing...
  45     do         # ==> so operates on command line arg(s).
  46 	case "$i" in
  47 	    0b*)		ibase=2;;	# binary
  48 	    0x*|[a-f]*|[A-F]*)	ibase=16;;	# hexadecimal
  49 	    0*)			ibase=8;;	# octal
  50 	    [1-9]*)		ibase=10;;	# decimal
  51 	    *)
  52 		Msg "illegal number $i - ignored"
  53 		continue;;
  54 	esac
  55 
  56 	# Remove prefix, convert hex digits to uppercase (bc needs this)
  57 	number=`echo "$i" | sed -e 's:^0[bBxX]::' | tr '[a-f]' '[A-F]'`
  58 	# ==> Uses ":" as sed separator, rather than "/".
  59 
  60 	# Convert number to decimal
  61 	dec=`echo "ibase=$ibase; $number" | bc`  # ==> 'bc' is calculator utility.
  62 	case "$dec" in
  63 	    [0-9]*)	;;			 # number ok
  64 	    *)		continue;;		 # error: ignore
  65 	esac
  66 
  67 	# Print all conversions in one line.
  68 	# ==> 'here document' feeds command list to 'bc'.
  69 	echo `bc <<!
  70 	    obase=16; "hex="; $dec
  71 	    obase=10; "dec="; $dec
  72 	    obase=8;  "oct="; $dec
  73 	    obase=2;  "bin="; $dec
  74 !
  75     ` | sed -e 's: :	:g'
  76 
  77     done
  78 }
  79 
  80 while [ $# -gt 0 ]
  81 do
  82     case "$1" in
  83 	--)	shift; break;;
  84 	-h)	Usage;;  # ==> Help message.
  85 	-*)	Usage;;
  86 	*)	break;;			# first number
  87     esac   # ==> More error checking for illegal input would be useful.
  88     shift
  89 done
  90 
  91 if [ $# -gt 0 ]
  92 then
  93     PrintBases "$@"
  94 else					# read from stdin
  95     while read line
  96     do
  97 	PrintBases $line
  98     done
  99 fi

3.10.9. Miscellaneous Commands

jot, seq

These utilities emit a sequence of integers, with a user selected increment. This can be used to advantage in a for loop.


Example 3-109. Using seq to generate loop arguments

   1 #!/bin/bash
   2 
   3 for a in `seq 80`  # or   for a in $( seq 80 )
   4 # Same as   for a in 1 2 3 4 5 ... 80   (saves much typing!).
   5 # May also use 'jot' (if present on system).
   6 do
   7   echo -n "$a "
   8 done
   9 # Example of using the output of a command to generate 
  10 # the [list] in a "for" loop.
  11 
  12 echo; echo
  13 
  14 # Yes, "seq" may also take a replaceable parameter.
  15 
  16 COUNT=80
  17 
  18 for a in `seq $COUNT`  # or   for a in $( seq $COUNT )
  19 do
  20   echo -n "$a "
  21 done
  22 
  23 echo
  24 
  25 exit 0

yes

In its default behavior the yes command feeds a continuous string of the character y followed by a line feed to stdout. A control-c terminates the run. A different output string may be specified, as in yes different string, which would continually output different string to stdout. One might well ask the purpose of this. From the command line or in a script, the output of yes can be redirected or piped into a program expecting user input. In effect, this becomes a sort of poor man's version of expect.

yes | fsck /dev/hda1 runs fsck non-interactively (careful!).

yes | rm -r dirname has same effect as rm -rf dirname (careful!).

Warning

Be very cautious when piping yes to a potentially dangerous system command, such as fsck or fdisk.
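
A harmless way to observe what yes produces is to cut its endless output short with head, as in this sketch:

```shell
#!/bin/bash
# head exits after the requested number of lines, which also stops yes.
answers=$(yes | head -n 3)
custom=$(yes "different string" | head -n 2)

echo "$answers"
echo "$custom"
```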

printenv

Show all the environmental variables set for a particular user.

 bash$ printenv | grep HOME
 HOME=/home/bozo
 	      

lp

The lp and lpr commands send file(s) to the print queue, to be printed as hard copy. [3] These commands trace the origin of their names to the line printers of another era.

bash$ lp file1.txt or bash$ lp <file1.txt

It is often useful to pipe the formatted output from pr to lp.

bash$ pr -options file1.txt | lp

Formatting packages, such as groff and Ghostscript, may send their output directly to lp.

bash$ groff -Tascii file.tr | lp

bash$ gs -options | lp file.ps

Related commands are lpq, for viewing the print queue, and lprm, for removing jobs from the print queue.

tee

[UNIX borrows an idea here from the plumbing trade.]

This is a redirection operator, but with a difference. Like the plumber's tee, it permits "siphoning off" the output of a command or commands within a pipe, but without affecting the result. This is useful for printing an ongoing process to a file or paper, perhaps to keep track of it for debugging purposes.

                    tee
                  |------> to file
                  |
   ===============|===============
   command--->----|-operator-->---> result of command(s)
   ===============================
 	      

   1 cat listfile* | sort | tee check.file | uniq > result.file

(The file check.file contains the concatenated sorted "listfiles", before the duplicate lines are removed by uniq.)

mkfifo

This obscure command creates a named pipe, a temporary first-in-first-out buffer for transferring data between processes. [4] Typically, one process writes to the FIFO, and the other reads from it. See Example A-9.
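
A minimal sketch of one process writing to a FIFO while another reads from it. The FIFO name and the message are made up, and the writer is simply a background echo.

```shell
#!/bin/bash
workdir=$(mktemp -d)
fifo="$workdir/demo.fifo"
mkfifo "$fifo"

# Writer: runs in the background and blocks until a reader opens the FIFO.
echo "message through the pipe" > "$fifo" &

# Reader: opening the FIFO for reading releases the writer.
read -r received < "$fifo"
wait                                   # Let the background writer finish.

echo "Received: $received"
rm -rf "$workdir"
```

Unlike an ordinary pipe, the FIFO has a name in the filesystem, so the two processes need not be started by the same command line.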

pathchk

This command checks the validity of a filename. If the filename exceeds the maximum allowable length (255 characters) or one or more of the directories in its path is not searchable, then an error message results. Unfortunately, pathchk does not return a recognizable error code, and it is therefore pretty much useless in a script.

dd

This is the somewhat obscure and much feared "data duplicator" command. Originally a utility for exchanging data on magnetic tapes between UNIX minicomputers and IBM mainframes, this command still has its uses. The dd command simply copies a file (or stdin/stdout), but with conversions. Possible conversions are ASCII/EBCDIC, upper/lower case, swapping of byte pairs between input and output, and skipping and/or truncating the head or tail of the input file. A dd --help lists the conversion and other options that this powerful utility takes.

   # Exercising 'dd'.

   n=3
   p=5
   input_file=project.txt
   output_file=log.txt

   dd if="$input_file" of="$output_file" bs=1 skip=$((n-1)) count=$((p-n+1)) 2> /dev/null
   # Extracts characters n to p from file $input_file.

   echo -n "hello world" | dd cbs=1 conv=unblock 2> /dev/null
   # Echoes "hello world" vertically.

   # Thanks, S.C.
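
The extraction above can be tried on known data. This is a small sketch, assuming a scratch file with invented contents in place of project.txt, and capturing the result in a variable rather than writing it to log.txt. It also shows one of the case conversions mentioned earlier.

```shell
#!/bin/bash
# Sketch only: the scratch file and its contents are made up.

sample=$(mktemp)                     # Stands in for project.txt.
printf 'abcdefghij' > "$sample"

n=3
p=5

# With bs=1, skip and count are measured in single bytes:
# skip the first n-1 bytes, then copy p-n+1 bytes.
extracted=$(dd if="$sample" bs=1 skip=$((n-1)) count=$((p-n+1)) 2> /dev/null)
echo "$extracted"                    # cde

# conv=ucase converts lowercase letters to uppercase.
shouted=$(echo "hello world" | dd conv=ucase 2> /dev/null)
echo "$shouted"                      # HELLO WORLD

rm -f "$sample"
```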

To demonstrate just how versatile dd is, let's use it to capture keystrokes.


Example 3-110. Capturing Keystrokes

   #!/bin/bash
   # Capture keystrokes without needing to press ENTER.

   keypresses=4                      # Number of keypresses to capture.

   old_tty_setting=$(stty -g)        # Save old terminal settings.

   echo "Press $keypresses keys."
   stty -icanon -echo                # Disable canonical mode.
                                     # Disable local echo.
   keys=$(dd bs=1 count=$keypresses 2> /dev/null)
   # 'dd' uses stdin, if "if" not specified.

   stty "$old_tty_setting"           # Restore old terminal settings.

   echo "You pressed the \"$keys\" keys."

   # Thanks, S.C. for showing the way.
   exit 0

The dd command can do random access on a data stream.

   echo -n . | dd bs=1 seek=4 of=file conv=notrunc
   # The "conv=notrunc" option means that the output file will not be truncated.

   # Thanks, S.C.
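
To make the role of conv=notrunc concrete, the following sketch runs the same write twice against a scratch file with invented contents, once with the option and once without. Without it, dd keeps the bytes it seeks over but discards everything after the written data.

```shell
#!/bin/bash
# Sketch only: the scratch file and its contents are made up.

target=$(mktemp)

# With conv=notrunc: only the byte at offset 4 is replaced.
printf 'abcdefgh' > "$target"
printf '.' | dd bs=1 seek=4 of="$target" conv=notrunc 2> /dev/null
with_notrunc=$(cat "$target")
echo "$with_notrunc"                 # abcd.fgh

# Without conv=notrunc: the tail of the file is lost.
printf 'abcdefgh' > "$target"
printf '.' | dd bs=1 seek=4 of="$target" 2> /dev/null
without_notrunc=$(cat "$target")
echo "$without_notrunc"              # abcd.

rm -f "$target"
```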

The dd command can copy raw data and disk images to and from devices, such as floppies and tape drives (Example A-5). A common use is creating boot floppies.

   dd if=kernel-image of=/dev/fd0H1440

One important use for dd is initializing temporary swap files (Example 3-159). It can even do a low-level copy of an entire hard drive partition, although this is not necessarily recommended.

od

The od, or octal dump, command converts input (or files) to octal (base-8) or other bases. This is useful for viewing or processing binary data files or otherwise unreadable system device files, such as /dev/urandom, and as a filter for binary data. See Example 3-41 and Example 3-91.
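
A brief sketch of od used as a filter, assuming a system that provides /dev/urandom. The variable names are invented for illustration. The -A n option suppresses the offset column, -N limits the number of bytes read, and -t selects the output format.

```shell
#!/bin/bash
# Sketch only: variable names are made up; /dev/urandom is assumed to exist.

# Show the bytes of a short string as characters (with escapes).
printf 'Hi\n' | od -c

# The same bytes in hexadecimal, with offsets and spacing stripped.
hex=$(printf 'Hi\n' | od -An -tx1 | tr -d ' \n')
echo "$hex"                          # 48690a

# Read two bytes from /dev/urandom as one unsigned 16-bit integer,
# giving a random number between 0 and 65535.
random_num=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
echo "Random number: $random_num"
```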

Notes

[1]

These are files whose names begin with a dot, such as ~/.Xdefaults. Such filenames do not show up in a normal ls listing, and they cannot be deleted by an accidental rm -rf *. Dotfiles are generally used as setup and configuration files in a user's home directory.

[2]

This is a symmetric block cipher, used to encrypt files on a single system or local network, as opposed to the "public key" cipher class, of which pgp is a well-known example.

[3]

The print queue is the group of jobs "waiting in line" to be printed.

[4]

For an excellent overview of this topic, see Andy Vaught's article, Introduction to Named Pipes, in the September, 1997 issue of Linux Journal.