36.6. Optimizations

Most shell scripts are quick 'n dirty solutions to non-complex problems. As such, optimizing them for speed is not much of an issue. Consider the case, though, where a script carries out an important task, does it well, but runs too slowly. Rewriting it in a compiled language may not be a palatable option. The simplest fix would be to rewrite the parts of the script that slow it down. Is it possible to apply principles of code optimization even to a lowly shell script?

Check the loops in the script. Time consumed by repetitive operations adds up quickly. If at all possible, remove time-consuming operations from within loops.
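As a minimal sketch of moving a time-consuming operation out of a loop, the following compares a loop that runs an external command on every pass with one that runs it once beforehand. The variable names (count, stamp, slow_labels, fast_labels) are illustrative only and do not come from the original text.

```shell
#!/bin/bash
# Hoisting a loop-invariant command out of a loop.
# All variable names here are hypothetical, chosen for illustration.

count=200

# Slow: the external date command runs once per iteration.
slow_labels=()
for ((i = 0; i < count; i++)); do
  slow_labels+=("$(date +%Y)-item$i")
done

# Faster: the value does not change inside the loop, so compute it once.
stamp=$(date +%Y)
fast_labels=()
for ((i = 0; i < count; i++)); do
  fast_labels+=("$stamp-item$i")
done

echo "${#slow_labels[@]} labels built the slow way"
echo "${#fast_labels[@]} labels built the fast way"
```

Prefixing each loop with the time keyword should make the difference visible, although the exact figures will depend on the machine.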

Use builtin commands in preference to external (system) commands. Builtins execute faster because the shell runs them itself, usually without forking a new process.
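One way to illustrate this, assuming a simple path with no trailing slash, is to replace the external basename and dirname commands with the shell's own parameter expansion. The sample path and variable names below are made up for the example; for unusual inputs (no slash at all, trailing slashes) the two approaches can give different results.

```shell
#!/bin/bash
# External commands versus the shell's own parameter expansion.
# The path and variable names are hypothetical.

path="/usr/local/share/doc/readme.txt"

# External commands: each call starts a new process.
name_ext=$(basename "$path")
dir_ext=$(dirname "$path")

# Parameter expansion: handled inside the shell, no new process.
name_builtin=${path##*/}   # Strip the longest leading match of "*/".
dir_builtin=${path%/*}     # Strip the shortest trailing match of "/*".

echo "$name_ext $name_builtin"
echo "$dir_ext $dir_builtin"
```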

Avoid unnecessary commands, particularly in a pipe.
    cat "$file" | grep "$word"

    grep "$word" "$file"

    #  The above command-lines have an identical effect,
    #+ but the second runs faster since it launches one fewer subprocess.
The cat command seems especially prone to overuse in scripts.
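The same idea applies to other stages of a pipe. As a hedged sketch, counting matching lines needs neither cat nor wc, since grep can count by itself with its -c option. The temporary file and its contents are invented for the example.

```shell
#!/bin/bash
# Trimming unnecessary stages from a pipe.
# The sample data and variable names are hypothetical.

tmpfile=$(mktemp)
printf '%s\n' "alpha" "beta" "alphabet" "gamma" > "$tmpfile"

# Three processes: cat, grep, and wc.
count_pipe=$(cat "$tmpfile" | grep "alpha" | wc -l)

# One process: grep -c counts the matching lines itself.
count_direct=$(grep -c "alpha" "$tmpfile")

echo "pipe: $count_pipe   direct: $count_direct"

rm -f "$tmpfile"
```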

Note

Certain commands, notably expr, are very inefficient, since every call runs an external program. Double-parentheses arithmetic expansion is a much faster replacement. See Example A-59.

    Math tests

    math via $(( ))
    real          0m0.294s
    user          0m0.288s
    sys           0m0.008s

    math via expr:
    real          1m17.879s   # Much slower!
    user          0m3.600s
    sys           0m8.765s

    math via let:
    real          0m0.364s
    user          0m0.372s
    sys           0m0.000s
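The three methods timed above might look like the following in a script. This is only a sketch, not the code of Example A-59; the loop limit and variable names are assumptions made for the illustration.

```shell
#!/bin/bash
# Incrementing a counter three ways: expr, $(( )), and let.
# The limit and all variable names are hypothetical.

limit=500

# Slow: every iteration launches the external expr command.
i=0
while [ "$i" -lt "$limit" ]; do
  i=$(expr "$i" + 1)
done
result_expr=$i

# Faster: arithmetic expansion is evaluated by the shell itself.
j=0
while (( j < limit )); do
  j=$(( j + 1 ))
done
result_arith=$j

# let is likewise a builtin.
k=0
while (( k < limit )); do
  let "k += 1"
done
result_let=$k

echo "$result_expr $result_arith $result_let"
```

All three loops arrive at the same value. Only the expr version pays for a new process on each pass.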

Condition testing constructs in scripts deserve close scrutiny. Substitute case for if-then constructs and combine tests when possible, to minimize script execution time. Again, refer to Example A-59.

    Test using "case" construct:
    real          0m0.329s
    user          0m0.320s
    sys           0m0.000s

    Test with if [], no quotes:
    real          0m0.438s
    user          0m0.432s
    sys           0m0.008s

    Test with if [], quotes:
    real          0m0.476s
    user          0m0.452s
    sys           0m0.024s

    Test with if [], using -eq:
    real          0m0.457s
    user          0m0.456s
    sys           0m0.000s
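As an illustration of substituting case for a chain of if-then tests and combining alternatives into one pattern, consider the two functions below. Both are hypothetical helpers written for this sketch, not taken from Example A-59, and how much time case saves will vary with the Bash version and the number of branches.

```shell
#!/bin/bash
# The same decision written with if-then and with case.
# classify_if and classify_case are hypothetical helper functions.

classify_if() {
  if [ "$1" = "start" ]; then
    echo "starting"
  elif [ "$1" = "stop" ]; then
    echo "stopping"
  elif [ "$1" = "restart" ] || [ "$1" = "reload" ]; then
    echo "restarting"
  else
    echo "unknown"
  fi
}

classify_case() {
  case "$1" in
    start)          echo "starting"   ;;
    stop)           echo "stopping"   ;;
    restart|reload) echo "restarting" ;;  # Two tests combined in one pattern.
    *)              echo "unknown"    ;;
  esac
}

classify_if reload
classify_case reload
```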

Note

Erik Brandsberg recommends using associative arrays in preference to conventional numeric-indexed arrays in most cases. When overwriting values in a numeric array, there is a significant performance penalty vs. associative arrays. Running a test script confirms this. See Example A-60.

    Assignment tests

    Assigning a simple variable
    real          0m0.418s
    user          0m0.416s
    sys           0m0.004s

    Assigning a numeric index array entry
    real          0m0.582s
    user          0m0.564s
    sys           0m0.016s

    Overwriting a numeric index array entry
    real          0m21.931s
    user          0m21.913s
    sys           0m0.016s

    Linear reading of numeric index array
    real          0m0.422s
    user          0m0.416s
    sys           0m0.004s

    Assigning an associative array entry
    real          0m1.800s
    user          0m1.796s
    sys           0m0.004s

    Overwriting an associative array entry
    real          0m1.798s
    user          0m1.784s
    sys           0m0.012s

    Linear reading an associative array entry
    real          0m0.420s
    user          0m0.420s
    sys           0m0.000s

    Assigning a random number to a simple variable
    real          0m0.402s
    user          0m0.388s
    sys           0m0.016s

    Assigning a sparse numeric index array entry randomly into 64k cells
    real          0m12.678s
    user          0m12.649s
    sys           0m0.028s

    Reading sparse numeric index array entry
    real          0m0.087s
    user          0m0.084s
    sys           0m0.000s

    Assigning a sparse associative array entry randomly into 64k cells
    real          0m0.698s
    user          0m0.696s
    sys           0m0.004s

    Reading sparse associative index array entry
    real          0m0.083s
    user          0m0.084s
    sys           0m0.000s
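For readers who want to try the comparison themselves, here is a minimal sketch of assigning and then overwriting entries in both kinds of array. It is not the code of Example A-60; the array names and the element count are assumptions. Associative arrays require Bash 4.0 or later, and the size of any speed difference is likely to depend on the Bash version in use.

```shell
#!/bin/bash
# Assigning and overwriting entries in both array types.
# Requires Bash 4.0+ for declare -A. Names and sizes are hypothetical.

declare -a indexed   # Conventional numeric-indexed array.
declare -A assoc     # Associative array.

entries=1000

# First assignment of every entry.
for ((n = 0; n < entries; n++)); do
  indexed[n]=$n
  assoc[$n]=$n
done

# Overwriting every entry with a new value.
for ((n = 0; n < entries; n++)); do
  indexed[n]=$(( n * 2 ))
  assoc[$n]=$(( n * 2 ))
done

echo "${indexed[999]} ${assoc[999]} ${#indexed[@]} ${#assoc[@]}"
```

Putting each loop in its own function and running it under the time keyword gives figures that can be set against the ones listed above.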

Use the time and times tools to profile computation-intensive commands. Consider rewriting time-critical code sections in C, or even in assembler.
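A short sketch of both tools in use follows. The function busy_loop is a hypothetical stand-in for whatever section of a script needs profiling.

```shell
#!/bin/bash
# Profiling a section of a script with time and times.
# busy_loop is a hypothetical stand-in for the code under test.

busy_loop() {
  local n=0
  while (( n < 20000 )); do
    n=$(( n + 1 ))
  done
  echo "$n"
}

# The time keyword reports real, user, and sys times on stderr.
time busy_loop > /dev/null

# TIMEFORMAT controls the report. %R is elapsed time in seconds.
# Grouping the command lets the report be captured in a variable.
TIMEFORMAT='%R'
elapsed=$( { time busy_loop > /dev/null; } 2>&1 )
echo "elapsed seconds: $elapsed"

# times prints the accumulated user and system times
# for the shell and for its child processes.
times
```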

Try to minimize file I/O. Bash is not particularly efficient at handling files, so consider using more appropriate tools for this within the script, such as awk or Perl.
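The sketch below, built around an invented two-column data file, shows two ways of cutting down on file handling: redirecting the output of a whole loop once rather than appending line by line, and handing the file to awk instead of reading it in a Bash loop. The file contents and variable names are assumptions made for the example.

```shell
#!/bin/bash
# Reducing file I/O done by the shell itself.
# The data file and variable names are hypothetical.

datafile=$(mktemp)

# Open the output file once for the whole loop,
# instead of appending with >> on every iteration.
for ((n = 1; n <= 1000; n++)); do
  echo "item$n $n"
done > "$datafile"

# Pure Bash: the read builtin handles the file one line at a time.
total_bash=0
while read -r name value; do
  total_bash=$(( total_bash + value ))
done < "$datafile"

# awk: a single process reads and sums the whole file.
total_awk=$(awk '{ sum += $2 } END { print sum }' "$datafile")

echo "$total_bash $total_awk"

rm -f "$datafile"
```

On a file this small the difference is slight. It tends to grow with the size of the input.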

Write your scripts in a modular and coherent form, [1] so they can be reorganized and tightened up as necessary. Some of the optimization techniques applicable to high-level languages may work for scripts, but others, such as loop unrolling, are mostly irrelevant. Above all, use common sense.

For an excellent demonstration of how optimization can dramatically reduce the execution time of a script, see Example 16-47.

Notes

[1]

This usually means liberal use of functions.