Shell Programming
by da chronic

Shell Programming - WHY?

While it is very nice to have a shell at which you can issue commands, have you had the feeling that something is missing? Do you feel the urge to issue multiple commands by only typing one word? Do you feel the need for variables, logic conditions and loops? Do you strive for automation? If so, then welcome to shell programming. (If you answered no to any of the above then you are obviously in the wrong frame of mind to be reading this - please try again later :)

Shell programming allows system administrators (and users) to create small (and occasionally not-so-small) programs for various purposes including automation of system administration tasks, text processing and installation of software.

Shell Programming - WHAT?

A shell program (sometimes referred to as a shell script) is a text file containing shell and UNIX commands. Remember - a UNIX command is a physical program (like cat, cut and grep) whereas a shell command is an "interpreted" command - there isn't a physical file associated with the command; when the shell sees the command, the shell itself performs certain actions (for example, echo).

When a shell program is executed, the shell reads the contents of the file line by line. Each line is executed as if you were typing it at the shell prompt. There isn't anything that you can place in a shell program that you can't type at the shell prompt.

Shell programs contain most things you would expect to find in a simple programming language. Programs can contain services including:

* variables
* logic constructs (IF THEN AND OR etc.)
* looping constructs (WHILE FOR)
* functions
* comments (strangely the least used service)

The way in which these services are implemented is dependent on the shell that is being used (remember - there is more than one shell). While the variations are often not major, it does mean that a program written for the Bourne shell (sh/bash) will not run in the C shell (csh). All the examples in this chapter are written for the Bourne shell.

Shell Programming - HOW?

Shell programs are a little different from what you'd usually class as a program. They are plain text and they don't need to be compiled. The shell "interprets" shell programs - the shell reads the shell program line by line and executes the commands it encounters. If it encounters an error (syntax or execution), it is just as if you typed the command at the shell prompt - an error is displayed.

This is in contrast to C/C++, Pascal and Ada programs (to name but a few) which have source in plain text, but require compiling and linking to produce a final executable program.

So, what are the real differences between the two types of programs? At the most basic level, interpreted programs are typically quick to write/modify and execute (generally in that order and in a seemingly endless loop :). Compiled programs typically require writing, compiling, linking and executing, and thus are generally more time consuming to develop and test. However, when it comes to executing the finished programs, the execution speeds are often widely separated. A compiled/linked program is a binary file containing a collection of direct system calls. The interpreted program, on the other hand, must first be processed by the shell, which then converts the commands to system calls or calls other binaries - this makes shell programs slow in comparison. In other words, shell programs are not generally efficient on CPU time.

Is there a happy medium? Yes!
It is called Perl. Perl is an interpreted language but is interpreted by an extremely fast, optimised interpreter. It is worth noting that a Perl program will be executed inside one process, whereas a shell program will be interpreted from a parent process but may launch many child processes in the form of UNIX commands (i.e. each call to a UNIX command is executed in a new process). However, Perl is a far more difficult (but extremely powerful) tool to learn - and this chapter is called "Shell Programming"...

The Basics

A Basic Program

It is traditional at this stage to write the standard "Hello World" program. To do this in a shell program is so obscenely easy that we're going to examine something a bit more complex - a hello world program that knows who you are...

To create your shell program, you must first edit a file - name it something like "hello", "helloworld" or something equally imaginative - just don't call it "test" - we will explain why later. In the editor, type the following:

  #!/bin/bash
  # This is a program that says hello
  echo "Hello $LOGNAME, I hope you have a nice day!"

(You may change the text of line three to reflect your current mood if you wish.)

Now, at the prompt, type the name of your program - you should see something like:

  bash: ./helloworld: Permission denied

Why? The reason is that your shell program isn't executable because it doesn't have its execution permissions set. After setting these (Hint: something involving the chmod command), you may execute the program by again typing its name at the prompt.

An alternate way of executing shell programs is to issue a command at the shell prompt to the effect of:

  bash helloworld

This simply instructs the shell to take a list of commands from a given file (your shell script). This method does not require the shell script to have execute permissions. However, in general you will execute your shell scripts via the first method.

And yet you may still find your script won't execute - why? On some UNIX systems (Red Hat Linux included) the current directory (.) is not included in the PATH environment variable. This means that the shell can't find the script that you want to execute, even when it's sitting in the current directory! To get around this, either:

* Modify the PATH variable to include the "." directory:

  PATH=$PATH:.

* Or, execute the program with an explicit path:

  ./helloworld
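Putting it all together, a first session with the program might look something like this (a sketch only - the prompt and the username will differ on your system):

  Shell_Prompt: chmod u+x helloworld
  Shell_Prompt: ./helloworld
  Hello jamiesob, I hope you have a nice day!

chmod u+x adds the execute permission for the owner of the file; once set, the script can be run by name like any other program.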
An Explanation of the Program

Line one, #!/bin/bash, is used to indicate which shell the shell program is to be run in. If this program was written for the C shell, then you might have #!/bin/csh instead.

It is probably worth mentioning at this point that UNIX "executes" programs by first looking at the first two bytes of the file (this is similar to the way MS-DOS looks at the first two bytes of executable programs; all .EXE programs start with "MZ"). From these two characters, the system knows if the file is an interpreted script (#!) or some other file type (more information can be obtained about this by typing man file). If the file is an interpreted script, then the system looks for a following path indicating an interpreter. For example:

  #!/bin/bash
  #!/usr/bin/perl
  #!/bin/sh

are all valid interpreters.

Line two, # This is a program that says hello, is (you guessed it) a comment. The "#" in a shell script is interpreted as "anything to the right of this is a comment, go onto the next line". Note that it is similar to line one, except that line one has the "!" mark after the comment.

Comments are a very important part of any program - it is a really good idea to include some. The reasons why are standard to all languages - readability, maintenance and self congratulation. It is even more important for a system administrator, as system administrators very rarely remain at one site for their entire working career; they must work with other people's shell scripts (as other people must work with theirs).

Always have a comment header; it should include things like:

  # AUTHOR:   Who wrote it
  # DATE:     Date first written
  # PROGRAM:  Name of the program
  # USAGE:    How to run the script; include any parameters
  # PURPOSE:  Describe in more than three words what the
  #           program does
  #
  # FILES:    Files the shell script uses
  #
  # NOTES:    Optional but can include a list of "features"
  #           to be fixed
  #
  # HISTORY:  Revisions/Changes

This format isn't set in stone, but use common sense and write fairly self documenting programs.

Line three, echo "Hello $LOGNAME, I hope you have a nice day!", is actually a command. The echo command prints text to the screen. Normal shell rules for interpreting special characters apply for the echo statement, so you should generally enclose most text in double quotes ("").

The only tricky bit about this line is the $LOGNAME. What is this? $LOGNAME is a shell variable; you can see it and others by typing "set" at the shell prompt. In the context of our program, the shell substitutes the $LOGNAME value with the username of the person running the program, so the output looks something like:

  Hello jamiesob, I hope you have a nice day!

All variables are referenced for output by placing a "$" sign in front of them - we will examine this in the next section.

Exercises

8.1 Modify the helloworld program so its output is something similar to:

  Hello <username>, welcome to <machine name>

All You Ever Wanted to Know About Variables

You have previously encountered shell variables and the way in which they are set. To quickly revise, variables may be set at the shell prompt by typing:

  Shell_Prompt: variable="a string"

Since you can type this at the prompt, the same syntax applies within shell programs.

You can also set variables to the results of commands, for example:

  Shell_Prompt: variable=`ls -al`

(Remember - the ` is the execute quote.)

To print the contents of a variable, simply type:

  Shell_Prompt: echo $variable

Note that we've added the "$" to the variable name. Variables are always accessed for output with the "$" sign, but without it for input/set operations.

Returning to the previous example, what would you expect to be the output? You would probably expect the output from ls -al to be something like:

  drwxr-xr-x  2 jamiesob users  1024 Feb 27 19:05 ./
  drwxr-xr-x 45 jamiesob users  2048 Feb 25 20:32 ../
  -rw-r--r--  1 jamiesob users   851 Feb 25 19:37 conX
  -rw-r--r--  1 jamiesob users 12517 Feb 25 19:36 confile
  -rw-r--r--  1 jamiesob users     8 Feb 26 22:50 helloworld
  -rw-r--r--  1 jamiesob users 46604 Feb 25 19:34 net-acct

and therefore, printing a variable that contains the output from that command would produce something similar, yet you may be surprised to find that it looks something like:

  drwxr-xr-x 2 jamiesob users 1024 Feb 27 19:05 ./ drwxr-xr-x 45 jamiesob users 2048 Feb 25 20:32 ../ -rw-r--r-- 1 jamiesob users 851 Feb 25 19:37 conX -rw-r--r-- 1 jamiesob users 12517 Feb 25 19:36 confile -rw-r--r-- 1 jamiesob users 8 Feb 26 22:50 helloworld -rw-r--r-- 1 jamiesob users 46604 Feb 25 19:34 net-acct

Why?
When placing the output of a command into a shell variable, the shell removes all the end-of-line markers, leaving a string separated only by spaces. The use for this will become more obvious later, but for the moment, consider what the following script will do:

  #!/bin/bash
  filelist=`ls`
  cat $filelist

Exercise

8.2 Type in the above program and run it. Explain what is happening. Would the above program work if "ls -al" was used rather than "ls" - why/why not?

Predefined Variables

There are many predefined shell variables, most established during your login. Examples include $LOGNAME, $HOSTNAME and $TERM - these names are not always standard from system to system (for example, $LOGNAME can also be called $USER). There are, however, several standard predefined shell variables you should be familiar with. These include:

  $$   (The current process ID)
  $?   (The exit status of the last command)

How would these be useful?

$$

$$ is extremely useful in creating unique temporary files. You will often find the following in shell programs:

  some_command > /tmp/temp.$$
  .
  .  some commands using /tmp/temp.$$
  .
  rm /tmp/temp.$$

/tmp/temp.$$ would always be a unique file - this allows several people to run the same shell script simultaneously. Since one of the only unique things about a process is its PID (Process-Identifier), this is an ideal component in a temporary file name. It should be noted at this point that temporary files are generally located in the /tmp directory.

$?

$? becomes important when you need to know if the last command that was executed was successful. All programs have a numeric exit status - on UNIX systems 0 indicates that the program was successful; any other number indicates a failure. We will examine how to use this value at a later point in time.

Is there a way you can show if your programs succeeded or failed? Yes! This is done via the use of the exit command. If placed as the last command in your shell program, it will enable you to indicate, to the calling program, the exit status of your script. exit is used as follows:

  exit 0   # Exit the script, $? = 0 (success)
  exit 1   # Exit the script, $? = 1 (fail)

Another category of standard shell variables are shell parameters.

Parameters - Special Shell Variables

If you thought shell programming was the best thing since COBOL, then you haven't even begun to be awed - shell programs can actually take parameters. Table 8.1 lists each variable associated with parameters in shell programs:

  Variable     Purpose
  $0           the name of the shell program
  $1 thru $9   the first through to the ninth parameters
  $#           the number of parameters
  $*           all the parameters passed, represented as a single word
               with individual parameters separated by spaces
  $@           all the parameters passed, with each parameter as a
               separate word

  Table 8.1 Shell Parameter Variables

The following program demonstrates a very basic use of parameters:

  #!/bin/bash
  # FILE: parm1
  VAL=`expr ${1:-0} + ${2:-0} + ${3:-0}`
  echo "The answer is $VAL"

Pop Quiz: Why are we using ${1:-0} instead of $1? Hint: What would happen if any of the variables were not set?
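The ${variable:-default} form is worth a quick demonstration on its own - the following prompt session is a sketch, with the variable name invented for the example:

  Shell_Prompt: X=2
  Shell_Prompt: echo ${X:-5}
  2
  Shell_Prompt: unset X
  Shell_Prompt: echo ${X:-5}
  5

If the variable is set, its value is used; if it is unset or empty, the default value after the :- is used instead.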
A sample testing of the program looks like:

  Shell_Prompt: parm1 2 3 5
  The answer is 10

  Shell_Prompt: parm1 2 3
  The answer is 5

  Shell_Prompt: parm1
  The answer is 0

Consider the program below:

  #!/bin/bash
  # FILE: mywc
  FCOUNT=`ls $* 2> /dev/null | wc -w`
  echo "Performing word count on $*"
  echo
  wc -w $* 2> /dev/null
  echo
  echo "Attempted to count words on $# files, found $FCOUNT"

If the program was run in a directory containing:

  conX        net-acct     notes.txt   shellprog~    t1~
  confile     netnasties   notes.txt~  study.htm     ttt
  helloworld  netnasties~  scanit*     study.txt     tes/
  my_file     netwatch     scanit~     study_~1.htm
  mywc*       netwatch~    shellprog   parm1*

some sample testing would produce:

  Shell_Prompt: mywc mywc
  Performing word count on mywc

  34 mywc

  Attempted to count words on 1 files, found 1

  Shell_Prompt: mywc mywc anotherfile
  Performing word count on mywc anotherfile

  34 mywc
  34 total

  Attempted to count words on 2 files, found 1

Exercise

8.3 Explain line by line what this program is doing. What would happen if the user didn't enter any parameters? How could you fix this?

Only Nine Parameters?

Well, that's what it looks like, doesn't it? We have $1 to $9 - what happens if we try to access $10? Try the code below:

  #!/bin/bash
  # FILE: testparms
  echo "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10 $11 $12"
  echo $*
  echo $#

Run testparms as follows:

  Shell_Prompt: testparms a b c d e f g h i j k l

The output will look something like:

  a b c d e f g h i a0 a1 a2
  a b c d e f g h i j k l
  12

Why? The shell only has nine parameters defined at any one time, $1 to $9. When the shell sees "$10", it interprets this as "$1" followed by the literal character "0", resulting in the "a0" string. Yet $* still shows all the parameters you typed!

To our rescue comes the shift command. shift works by removing the first parameter from the parameter list and shuffling the parameters along. Thus $2 becomes $1, $3 becomes $2 etc. Finally, (what was originally) the tenth parameter becomes $9. However, beware! Once you've run shift, you have lost the original value of $1 forever - it is also removed from $* and $@. shift is executed by, well, placing the word "shift" in your shell script, for example:

  #!/bin/bash
  echo $1 $2 $3
  shift
  echo $1 $2 $3

Exercise

8.4 Modify the testparms program so the output looks something like:

  a b c d e f g h i a0 a1 a2
  a b c d e f g h i j k l
  12
  b c d e f g h i j b0 b1 b2
  b c d e f g h i j k l
  11
  c d e f g h i j k c0 c1 c2
  c d e f g h i j k l
  10

The difference between $* and $@

While the definitions of $* and $@ may seem subtle, it is important to distinguish between them. As you have seen, $* represents the complete list of parameters as one string. If you were to perform:

  echo $*

and

  echo $@

the results would appear the same. However, when using these variables within your programs you should be aware that the shell stores them in two different ways, for example:

  # $1 = x   $2 = "helo fred"   $3 = 345

  $* = $1 $2 $3 ...        eg. x helo fred 345
  $@ = "$1" "$2" "$3" ...  eg. "x" "helo fred" "345"

As we progress through this chapter, remember this, as we will encounter it again when we examine the repeated action commands (while/for loops).

The basics of input/output (IO)

We have already encountered the "echo" command, yet this is only the "O" part of IO - how can we get user input into our programs? We use the "read" command. For example:

  #!/bin/bash
  # FILE: testread
  read X
  echo "You said $X"

The purpose of this enormously exciting program should be obvious.
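read can also split a line of input across several variables - a minimal sketch (the variable names are invented for the example, and the default whitespace word-splitting is assumed):

  #!/bin/bash
  # FILE: readwords
  read FIRST SECOND REST        # input is split on whitespace
  echo "First word:      $FIRST"
  echo "Second word:     $SECOND"
  echo "Everything else: $REST"

Each variable receives one word of the input; any words left over are all placed in the last variable.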
Just in case you were bored with the echo command, Table 8.2 shows a few backslash characters that you can use to brighten your shell scripts:

  Character   Purpose
  \a          alert (bell)
  \b          backspace
  \c          don't display the trailing newline
  \n          new line
  \r          carriage return
  \t          horizontal tab
  \v          vertical tab
  \\          backslash
  \nnn        the character with ASCII number nnn (octal)

  Table 8.2 echo backslash options (type "man echo" to see this exact table :)

To enable echo to interpret these backslash characters within a string, you must issue the echo command with a "-e" switch. You may also add a "-n" switch to stop echo printing a new-line at the end of the string - this is a good thing if you want to output a prompting string. For example:

  #!/bin/bash
  # FILE: getname
  echo -n "Please enter your name: "
  read NAME
  echo "Your name is $NAME"

(This program would be useful for those with a very short memory.)

At the moment, we've only examined reading from STDIN (standard input, a.k.a. the keyboard) and STDOUT (standard output, a.k.a. the screen) - if we want to be really clever we can change this. What do you think the following does?

  read X < afile

or what about

  echo $X > anotherfile

If you said that the first reads the contents of afile into a variable $X and the second writes the value of $X to anotherfile, you'd almost be correct. The read operation will only read the first line (up to the end-of-line marker) from afile - it doesn't read the entire file. You can also use the ">>" and "<<" redirection operators.

Exercises

8.5 What would you expect:

  read X << END

would do? What do you think $X would hold if the input was:

  Dear Sir
  I have no idea why your computer blew up.
  Kind regards, me.
  END

And now for the hard bits

Scenario

So far we have been dealing with very simple examples - mainly due to the fact we've been dealing with very simple commands. Shell scripting was not invented so you could write programs that ask you your name then display it. For this reason, we are going to be developing a real program that has a useful purpose. We will do this section by section as we examine more shell programming concepts. While you are reading each section, you should consider how the information could assist in writing part of the program.

The actual problem is as follows:

You've been appointed as a system administrator to an academic department within a small (anonymous) regional university. The previous system administrator left in rather a hurry after it was found that the department's main server had been playing host to a plethora of pornography, warez (pirate software) and documentation regarding interesting alternative uses for various farm chemicals.

There is some concern that the previous sys admin wasn't the only individual within the department who had been availing themselves of such wonderful and diverse resources on the Internet. You have been instructed to identify those persons who have been visiting "undesirable" Internet sites and advise them of the department's policy on accessing inappropriate material (apparently there isn't one, but you've been advised to improvise). Ideally, you will produce a report of people accessing restricted sites, exactly which sites and the number of times they visited them.
To assist you, a network monitoring program produces a datafile containing a list of users and sites they have accessed, an example of which is listed below:

FILE: netwatch

  jamiesob mucus.slime.com
  tonsloye xboys.funnet.com.fr
  tonsloye sweet.dreams.com
  root     sniffer.gov.au
  jamiesob marvin.ls.tc.hk
  jamiesob never.land.nz
  jamiesob guppy.pond.cqu.edu.au
  tonsloye xboys.funnet.com.fr
  tonsloye www.sony.com
  janesk   horseland.org.uk
  root     www.nasa.gov
  tonsloye warez.under.gr
  tonsloye mucus.slime.com
  root     ftp.ns.gov.au
  tonsloye xboys.funnet.com.fr
  root     linx.fare.com
  root     crackz.city.bmr.au
  janesk   smurf.city.gov.au
  jamiesob mucus.slime.com
  jamiesob mucus.slime.com

After careful consideration (and many hours of painstaking research) a steering committee on the department's policy on accessing the Internet has produced a list of sites that they have deemed "prohibited" - these sites are contained in a data file, an example of which is listed below:

FILE: netnasties

  mucus.slime.com
  xboys.funnet.com.fr
  warez.under.gr
  crackz.city.bmr.au

It is your task to develop a shell script that will fulfil these requirements (at the same time ignoring the privacy, ethics and censorship issues at hand :)

(Oh, it might also be an idea to get Yahoo! to remove the link to your main server under the /Computers/Software/Hackz/Warez/Sites listing... ;)

if ... then ... maybe?

Shell programming provides the ability to test the exit status from commands and act on them. One way this is facilitated is:

  if command
  then
    do other commands
  fi

You may also provide an "alternate" action by using the "if" command in the following format:

  if command
  then
    do other commands
  else
    do other commands
  fi

And if you require even more complexity, you can issue the if command as:

  if command
  then
    do other commands
  elif anothercommand
  then
    do other commands
  fi

To test these structures, you may wish to use the true and false UNIX commands. true always sets $? to 0 and false sets $? to 1 after executing.

Remember: if tests the exit code of a command - it isn't used to compare values; to do this, you must use the test command in combination with the if structure - test will be discussed in the next section.

What if you wanted to test the exit status of two commands? In this case, you can use the shell's && and || operators. These are effectively "smart" AND and OR operators.

The && works as follows:

  command1 && command2

command2 will only be executed if command1 succeeds.

The || works as follows:

  command1 || command2

command2 will only be executed if command1 fails.

These are sometimes referred to as "short circuit" operators in other languages.

Given our problem, one of the first things we should do in our program is to check if our datafiles exist. How would we do this?

  #!/bin/bash
  # FILE: scanit
  if ls netwatch && ls netnasties
  then
    echo "Found netwatch and netnasties!"
  else
    echo "Can not find one of the data files - exiting"
    exit 1
  fi

Exercise

8.6 Enter the code above and run the program. Notice that the output from the ls commands (and the errors) appear on the screen - this isn't a very good thing. Modify the code so the only output to the screen is one of the echo messages.

Testing Testing...

Perhaps the most useful command available to shell programs is the test command. It is also the command that causes the most problems for first time shell programmers - the first program they ever write is usually (imaginatively) called test - they attempt to run it - and nothing happens - why?
(Hint: type which test, then type echo $PATH - why does the system command test run before the programmer's shell script?)

The test command allows you to:

* test the length of a string
* compare two strings
* compare two numbers
* check on a file's type
* check on a file's permissions
* combine conditions together

test actually comes in two flavours:

  test an_expression

and

  [ an_expression ]

They are both the same thing - it's just that [ is soft-linked to /usr/bin/test; test actually checks to see what name it is being called by; if it is [ then it expects a ] at the end of the expression.

What do we mean by "expression"? The expression is the string you want evaluated. A simple example would be:

  if [ "$1" = "hello" ]
  then
    echo "hello to you too!"
  else
    echo "hello anyway"
  fi

This simply tests if the first parameter was hello. Note that the first line could have been written as:

  if test "$1" = "hello"

Tip: Note that we surrounded the variable $1 in quotes. This is to take care of the case when $1 doesn't exist - in other words, there were no parameters passed. If we had simply put $1 and there wasn't any $1, then an error would have been displayed:

  test: =: unary operator expected

This is because you'd be effectively executing:

  test NOTHING = "hello"

= expects a string to its left and right - thus the error. However, when placed in double quotes, you'd be executing:

  test "" = "hello"

which is fine; you're testing an empty string against another string.

You can also use test to tell if a variable has a value in it by:

  test $var

This will return true if the variable has something in it, false if the variable doesn't exist OR it contains null ("").

We could use this in our program. If the user enters at least one username to check on, then we scan for that username, else we write an error to the screen and exit:

  if [ $1 ]
  then
    the_user_list=`echo $*`
  else
    echo "No users entered - exiting!"
    exit 2
  fi

Expressions, expressions!

So far we've only examined expressions containing string based comparisons. The following tables list all the different types of comparisons you can perform with the test command.

  Expression           True if
  -z string            length of string is 0
  -n string            length of string is not 0
  string1 = string2    the two strings are identical
  string1 != string2   the two strings are NOT identical
  string               string is not NULL

  Table 8.3 String based tests

  Expression      True if
  int1 -eq int2   first int is equal to second
  int1 -ne int2   first int is not equal to second
  int1 -gt int2   first int is greater than second
  int1 -ge int2   first int is greater than or equal to second
  int1 -lt int2   first int is less than second
  int1 -le int2   first int is less than or equal to second

  Table 8.4 Numeric tests

  Expression   True if
  -r file      file exists and is readable
  -w file      file exists and is writable
  -x file      file exists and is executable
  -f file      file exists and is a regular file
  -d file      file exists and is a directory
  -h file      file exists and is a symbolic link
  -c file      file exists and is a character special file
  -b file      file exists and is a block special file
  -p file      file exists and is a named pipe
  -u file      file exists and it is setuid
  -g file      file exists and it is setgid
  -k file      file exists and the sticky bit is set
  -s file      file exists and its size is greater than 0

  Table 8.5 File tests
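For example, a fragment like the following (a minimal sketch using our datafile) uses a file test rather than the earlier ls approach to check that a datafile exists and is readable:

  if [ -r netwatch ]
  then
    echo "netwatch exists and is readable"
  else
    echo "Can not read netwatch - exiting"
    exit 1
  fi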
  Expression   Purpose
  !            reverse the result of an expression
  -a           AND operator
  -o           OR operator
  ( expr )     group an expression; parentheses have special meaning
               to the shell, so to use them in the test command you
               must quote them

  Table 8.6 Logic operators with test

Remember: test uses different operators to compare strings and numbers - using -ne on a string comparison and != on a numeric comparison is incorrect and will give undesirable results.

Exercise

8.7 Modify the code for scanit so it uses the test command to see if the datafiles exist.

All about case

Ok, so we know how to conditionally perform operations based on the return status of a command. However, there is also the case statement, which acts like a combination of the if statement and a series of test string comparisons:

  case value in
    pattern1)
      command
      anothercommand ;;
    pattern2)
      command
      anothercommand ;;
  esac

case works by comparing value against the listed patterns. If a match is made, then the commands associated with that pattern are executed (up to the ";;" mark) and $? is set to 0. If a match isn't made by the end of the case statement (esac) then $? is set to 1.

The really useful thing is that wildcards can be used, as can the "|" symbol which acts as an OR operator. The following example gets a Yes/No response from a user, but will accept anything starting with "Y" or "y" as YES, "N" or "n" as NO and anything else as "MAYBE":

  echo -n "Your Answer: "
  read ANSWER
  case $ANSWER in
    Y* | y*) ANSWER="YES" ;;
    N* | n*) ANSWER="NO" ;;
    *)       ANSWER="MAYBE" ;;
  esac
  echo $ANSWER

Exercise

8.8 Write a shell script that inputs a date and converts it into a long date form. For example:

  $~ > mydate 12/3/97
  12th of March 1997

  $~ > mydate
  Enter the date: 1/11/74
  1st of November 1974

Loops and Repeated Action Commands

Looping - "the exciting process of doing something more than once" - and shell programming allows it. There are three constructs that implement looping:

  while - do - done
  for - do - done
  until - do - done

while

The format of the while construct is:

  while command
  do
    commands
  done

(while command is true, commands are executed)

Example:

  while [ $1 ]
  do
    echo $1
    shift
  done

What does this segment of code do? Try running a script containing this code with a b c d e on the command line.

while also allows the redirection of input. Consider the following:

  #!/bin/bash
  # FILE: linelist
  #
  count=0
  while read BUFFER
  do
    count=`expr $count + 1`   # Increment the count
    echo "$count $BUFFER"     # Echo it out
  done < $1                   # Take input from the file

This program reads a file line by line and echoes it to the screen with a line number.

Given our scanit program, the following could be used to read the netwatch datafile and compare the username with the entries in the datafile:

  while read buffer
  do
    user=`echo $buffer | cut -d" " -f1`
    site=`echo $buffer | cut -d" " -f2`
    if [ "$user" = "$1" ]
    then
      echo "$user visited $site"
    fi
  done < netwatch

Exercise

8.9 Modify the above code so that the site is compared with all sites in the prohibited sites file (netnasties). Do this by using another while loop. If the user has visited a prohibited site, then echo a message to the screen.

for

The format of the for construct is:

  for variable in list_of_variables
  do
    commands
  done

(for each value in list_of_variables, commands are executed)

Example:

  echo $#
  for VAR in $*
  do
    echo $VAR
  done

Herein lies the importance of the difference between $* and $@. Try the above program using:

  this is a sentence

as the input.
Now try it with:

  "this is" a sentence

Your output for the first run should look something like:

  4
  this
  is
  a
  sentence

and the second run:

  3
  this
  is
  a
  sentence

Remember that $* effectively is "$1 $2 $3 $4 $5 $6 $7 $8 $9 $10 ... $n" - one string, which the shell splits back into individual words.

Exercise

8.10 Modify the previous segment of code, changing $* to $@. What do you think the output will be? Try it.

Modifying scanit

Given our scanit program, we might wish to report on a number of users. The following modifications will allow us to accept and process multiple users from the command line:

  for checkuser in $*
  do
    while read buffer
    do
      while read checksite
      do
        user=`echo $buffer | cut -d" " -f1`
        site=`echo $buffer | cut -d" " -f2`
        if [ "$user" = "$checkuser" -a "$site" = "$checksite" ]
        then
          echo "$user visited the prohibited site $site"
        fi
      done < netnasties
    done < netwatch
  done

Exercise

8.11 The above code is very inefficient IO-wise - for every entry in the netwatch file, the entire netnasties file is read in. Modify the code so that the while loop reading the netnasties file is replaced by a for loop. (Hint: what does

  BADSITES=`cat netnasties`

do?)

EXTENSION: What other IO inefficiencies does the code have? Fix them.

until

The format of the until construct is:

  until command
  do
    commands
  done

(commands are executed until command is true)

Example:

  until [ "$1" = "" ]
  do
    echo $1
    shift
  done

break and continue

Occasionally you will want to jump out of a loop; to do this you need to use the break command. break is executed in the form:

  break

or

  break n

The first form simply stops the loop, for example:

  while true
  do
    read BUFFER
    if [ "$BUFFER" = "" ]
    then
      break
    fi
    echo $BUFFER
  done

This code takes a line from the user and prints it until the user enters a blank line. The second form of break, break n (where n is a number), effectively works by executing break "n" times. This can break you out of embedded loops, for example:

  for file in $*
  do
    while read BUFFER
    do
      if [ "$BUFFER" = "ABORT" ]
      then
        break 2
      fi
      echo $BUFFER
    done < $file
  done

This code prints the contents of multiple files, but if it encounters a line containing the word "ABORT" in any one of the files, it stops processing.

Like break, continue is used to alter the looping process. However, unlike break, continue keeps the looping process going; it just fails to finish the remainder of the current loop by returning to the top of the loop. For example:

  while read BUFFER
  do
    charcount=`echo $BUFFER | wc -c | cut -f1`
    if [ $charcount -gt 80 ]
    then
      continue
    fi
    echo $BUFFER
  done < $1

This code segment reads and echoes the contents of a file - however, it does not print lines that are over 80 characters long.

Redirection

It is not just the while - do - done loops that can have IO redirection; it is possible to perform piping, output to files and input from files on if, for and until as well. For example:

  if true
  then
    read x
    read y
    read z
  fi < afile

This code will read the first three lines from afile. Pipes can also be used:

  read BUFFER
  while [ "$BUFFER" != "" ]
  do
    echo $BUFFER
    read BUFFER
  done | todos > tmp.$$

This code uses a non-standard command called todos. todos converts UNIX text files to DOS text files by making the EOL (End-Of-Line) character equivalent to CR (Carriage-Return) LF (Line-Feed). This code takes STDIN (until the user enters a blank line) and pipes it into todos, which in turn converts it to a DOS style text file (tmp.$$). In all, a totally useless program, but it does demonstrate the possibilities of piping.
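Output redirection on a loop works the same way - the following sketch (the output file name is invented for the example) writes a numbered file listing to a file rather than to the screen:

  count=0
  for file in `ls`
  do
    count=`expr $count + 1`
    echo "$count $file"
  done > filelist.$$

Everything echoed inside the loop ends up in filelist.$$ - the redirection is applied once, to the loop as a whole, not to each echo individually.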
Now for the really hard bits

Functional Functions

A feature of most usable programming languages is the existence of functions. Theoretically, functions provide the ability to break your code into reusable, logical compartments that are the by-product of top-down design. In practice, they vastly improve the readability of shell programs, making it easier to modify and debug them.

An alternative to functions is the grouping of code into separate shell scripts and calling these from your program. This isn't as efficient as functions, as functions are executed in the same process that they were called from; other shell programs, however, are launched in a separate process space - this is inefficient on memory and CPU resources.

You may have noticed that our scanit program has grown to around 30 lines of code. While this is quite manageable, we will make some major changes later that really require the "modular" approach of functions.

Shell functions are declared as:

  function_name()
  {
    somecommands
  }

Functions are called by:

  function_name parameter_list

YES! Shell functions support parameters. $1 to $9 represent the first nine parameters passed to the function and $* represents the entire parameter list. The value of $0 isn't changed. For example:

  #!/bin/bash
  # FILE: catfiles

  catfile()
  {
    for file in $*
    do
      cat $file
    done
  }

  FILELIST=`ls $1`
  cd $1
  catfile $FILELIST

This is a highly useless example (cat * would do the same thing), but you can see how the "main" program calls the function.

local

Shell functions also support the concept of declaring "local" variables. The local command is used to do this. For example:

  #!/bin/bash

  testvars()
  {
    local localX="testvars localX"
    X="testvars X"
    local GlobalX="testvars GlobalX"
    echo "testvars: localX= $localX X= $X GlobalX= $GlobalX"
  }

  X="Main X"
  GlobalX="Main GlobalX"
  echo "Main 1: localX= $localX X= $X GlobalX= $GlobalX"
  testvars
  echo "Main 2: localX= $localX X= $X GlobalX= $GlobalX"

The output looks like:

  Main 1: localX=  X= Main X GlobalX= Main GlobalX
  testvars: localX= testvars localX X= testvars X GlobalX= testvars GlobalX
  Main 2: localX=  X= testvars X GlobalX= Main GlobalX

The return trip

After calling a shell function, the value of $? is set to the exit status of the last command executed in the shell function. If you want to explicitly set this, you can use the return command:

  return n

(where n is a number). This allows for code like:

  if function1
  then
    do_this
  else
    do_that
  fi

For example, we can introduce our first function into our scanit program by placing our datafile tests into a function:

  #!/bin/bash
  # FILE: scanit
  #
  check_data_files()
  {
    if [ -r netwatch -a -r netnasties ]
    then
      return 0
    else
      return 1
    fi
  }

  # Main Program
  if check_data_files
  then
    echo "Datafiles found"
  else
    echo "One of the datafiles missing - exiting"
    exit 1
  fi

  # our other work...

Recursion: (see "Recursion")

Shell programming even supports recursion. Typically, recursion is used to process tree-like data structures - the following example illustrates this:

  #!/bin/bash
  # FILE: wctree

  wcfiles()
  {
    local BASEDIR=$1          # Set the local base directory
    local LOCALDIR=`pwd`      # Where are we?
    cd $BASEDIR               # Go to this directory (down)
    local filelist=`ls`       # Get the files in this directory
    for file in $filelist
    do
      if [ -d $file ]         # If we are a directory, recurse
      then
        # we are a directory
        wcfiles "$BASEDIR/$file"
      else
        fc=`wc -w < $file`    # do word count and echo info
        echo "$BASEDIR/$file $fc words"
      fi
    done
    cd $LOCALDIR              # Go back up to the calling directory
  }
  if [ $1 ]                   # Default to "." if no parameters
  then
    wcfiles $1
  else
    wcfiles "."
  fi

Exercise

8.12 What does the wctree program do? Why are certain variables declared as local? What would happen if they were not? Modify the program so it will only recurse three times.

EXTENSION: There is actually a UNIX command that will do the same thing as this shell script - what is it? What would be the command line? (Hint: man find)

wait'ing and trap'ing

So far we have only examined linear, single process shell script examples. What if you want to have more than one action occurring at once? As you are aware, it is possible to launch programs to run in the background by placing an ampersand behind the command, for example:

  runcommand &

You can also do this in your shell programs. It is occasionally useful to send a time consuming task to the background and proceed with your processing. An example of this would be a sort on a large file:

  sort $largefile > $newfile &
  do_a_function
  do_another_function $newfile

The problem is, what if the sort hadn't finished by the time you wanted to use $newfile? The shell handles this by providing wait:

  sort $largefile > $newfile &
  do_a_function
  wait
  do_another_function $newfile

When wait is encountered, processing stops and "waits" until the child process returns, then proceeds on with the program. But what if you had launched several processes in the background? The shell provides the shell variable $! (the PID of the child process launched) which can be given as a parameter to wait - effectively saying "wait for this PID". For example:

  sort $largefile1 > $newfile1 &
  SortPID1=$!
  sort $largefile2 > $newfile2 &
  SortPID2=$!
  sort $largefile3 > $newfile3 &
  SortPID3=$!
  do_a_function
  wait $SortPID1
  do_another_function $newfile1
  wait $SortPID2
  do_another_function $newfile2
  wait $SortPID3
  do_another_function $newfile3

Another useful command is trap. trap works by associating a set of commands with a signal from the operating system. You will probably be familiar with:

  kill -9 PID

which is used to kill a process. This command is in fact sending the signal "9" to the process given by PID. Available signals are shown in Table 8.7:

  Signal   Meaning
  0        Exit from the shell
  1        Hangup
  2        Interrupt
  3        Quit
  4        Illegal Instruction
  5        Trace trap
  6        IOT instruction
  7        EMT instruction
  8        Floating point exception
  10       Bus error
  12       Bad argument
  13       Pipe write error
  14       Alarm
  15       Software termination signal

  Table 8.7 UNIX signals (taken from "UNIX Shell Programming", Kochan et al.)

While you can't actually trap signal 9, you can trap the others. This is useful in shell programs when you want to make sure your program exits gracefully in the event of a shutdown (or some such event) - often you will want to remove temporary files the program has created. The syntax of using trap is:

  trap commands signals

For example:

  trap "rm /tmp/temp.$$" 1 2

will trap signals 1 and 2 - whenever these signals occur, processing will be suspended and the rm command will be executed.

You can also list every trap'ed signal by issuing the command:

  trap

To "un-trap" a signal, you must issue the command:

  trap "" signals
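Combining wait and trap gives a script that can clean up after itself even when interrupted - a minimal sketch (the file names and usage are invented for the example):

  #!/bin/bash
  # Remove the temporary file if we are hung-up, interrupted
  # or terminated, then exit with a failure status
  trap "rm -f /tmp/temp.$$; exit 1" 1 2 15

  sort $1 > /tmp/temp.$$ &      # sort the input file in the background
  SortPID=$!
  echo "Sorting $1 in the background..."
  wait $SortPID                 # make sure the sort has finished
  head -5 /tmp/temp.$$          # ...before using the result
  rm -f /tmp/temp.$$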
The following is a somewhat clumsy form of IPC (Inter-Process Communication) that relies on trap and wait:

  #!/bin/bash
  # FILE:  saymsg
  # USAGE: saymsg [total number of children]

  readmsg()
  {
    read line < $$           # read a line from the file given by the PID
    echo "$ID - got $line!"  # of *this* process ($$)
    if [ $CHILD ]
    then
      writemsg $line         # if I have children, send them the message
    fi
  }

  writemsg()
  {
    echo $* > $CHILD         # Write the line to the file given by the PID
    kill -1 $CHILD           # of my child. Then signal the child.
  }

  stop()
  {
    kill -15 $CHILD          # tell my child to stop
    if [ $CHILD ]
    then
      wait $CHILD            # wait until they are dead
      rm $CHILD              # remove the message file
    fi
    exit 0
  }

  # Main Program
  if [ $# -eq 1 ]
  then
    NUMCHILD=`expr $1 - 1`
    saymsg $NUMCHILD $1 &    # Launch another child
    CHILD=$!
    ID=0
    touch $CHILD             # Create empty message file
    echo "I am the parent and have child $CHILD"
  else
    if [ $1 -ne 0 ]          # Must I create children?
    then
      NUMCHILD=`expr $1 - 1` # Yep, deduct one from the number
      saymsg $NUMCHILD $2 &  # to be created, then launch them
      CHILD=$!
      ID=`expr $2 - $1`
      touch $CHILD           # Create empty message file
      echo "I am $ID and have child $CHILD"
    else
      ID=`expr $2 - $1`      # I don't need to create children
      echo "I am $ID and am the last child"
    fi
  fi

  trap "readmsg" 1           # Trap the read signal
  trap "stop" 15             # Trap the drop-dead signal

  if [ $# -eq 1 ]            # If I have one parameter,
  then                       # then I am the parent - I just read
    read BUFFER              # STDIN and pass the message on
    while [ "$BUFFER" ]
    do
      writemsg $BUFFER
      read BUFFER
    done
    echo "Parent - Stopping"
    stop
  else                       # Else I am the child who does nothing -
    while true               # I am totally driven by signals.
    do
      true
    done
  fi

So what is happening here? It may help if you look at the output:

  psyche:~/sanotes$ saymsg 3
  I am the parent and have child 8090
  I am 1 and have child 8094
  I am 2 and have child 8109
  I am 3 and am the last child
  this is the first thing I type
  1 - got this is the first thing I type!
  2 - got this is the first thing I type!
  3 - got this is the first thing I type!

  Parent - Stopping
  psyche:~/sanotes$

Initially, the parent program starts, accepting a number of children to create. The parent then launches another copy of the program, passing it the remaining number of children to create and the total number of children. This happens on every launch of the program until there are no more children to launch.

From this point onwards the program works rather like Chinese whispers - the parent accepts a string from the user which it then passes to its child by sending a signal to the child - the signal is caught by the child and readmsg is executed. The child writes the message to the screen, then passes the message to its child (if it has one) by signalling it, and so on and so on. The messages are passed by being written to files - the parent writes the message into a file named by the PID of the child process.

When the user enters a blank line, the parent process sends a signal to its child - the signal is caught by the child and stop is executed. The child then sends a message to its child to stop, and so on and so on down the line. The parent process can't exit until all the children have exited.

This is a very contrived example - but it does show how processes (even at a shell programming level) can communicate. It also demonstrates how you can give a function name to trap (instead of a command set).

Exercise

8.13 saymsg is riddled with problems - there isn't any checking on the parent process command line parameters (what if there weren't any?) and it isn't very well commented or written - make modifications to fix these problems. What other problems can you see?

EXTENSION: Fundamentally, saymsg isn't implementing very safe inter-process communication - how could this be fixed?
Remember, one of the main problems concerning IPC is the race condition - could this happen?

Bugs and Debugging

If by now you have typed every example program in, completed every exercise and have not encountered one single error, then you are a truly amazing person. However, if you are like me, you would have made at least 70 billion mistakes/typos or TSEs (totally stupid errors) - and now I'll tell you the easy way to find them!

Method 1 - set

Issuing the truly inspired command of:

  set -x

within your program will do wonderful things. As your program executes, each code line will be printed to the screen - that way you can find your mistakes, err, well, a little bit quicker. Turning tracing off is a good idea once your program works - this is done by:

  set +x

Method 2 - echo

Placing a few echo statements in your code during your debugging is one of the easiest ways to find errors - for the most part this will be the quickest way of detecting if variables are being set correctly.

Very Common Mistakes

  $VAR=`ls`

This should be VAR=`ls`. When setting the value of a shell variable you don't use the $ sign.

  read $BUFFER

The same thing here. When setting the value of a variable you don't use the $ sign.

  VAR=`ls -al

The second ` is missing.

  if [ $VAR ]
  then
    echo $VAR
  fi

Haven't specified what is being tested here. You need to refer to the contents of Tables 8.3 through 8.6.

  if [ $VAR -eq $VAR2 ]
  then
    echo $VAR
  fi

If $VAR and $VAR2 are strings, then you can't use -eq to compare their values. You need to use =.

  if [ $VAR = $VAR2 ] then
    echo $VAR
  fi

The then must be on a separate line.

And now for the really really hard bits

Writing good shell programs

We have covered most of the theory involved with shell programming, but there is more to shell programming than syntax. In this section, we will complete the scanit program, examining efficiency and structure considerations.

scanit currently consists of one chunk of code with one small function. In its current form, it doesn't meet the requirements specified:

"...you will produce a report of people accessing restricted sites, exactly which sites and the number of times they visited them."

Our code, as it is, looks like:

  #!/bin/bash
  # FILE: scanit
  #
  check_data_files()
  {
    if [ -r netwatch -a -r netnasties ]
    then
      return 0
    else
      return 1
    fi
  }

  # Main Program
  if check_data_files
  then
    echo "Datafiles found"
  else
    echo "One of the datafiles missing - exiting"
    exit 1
  fi

  for checkuser in $*
  do
    while read buffer
    do
      while read checksite
      do
        user=`echo $buffer | cut -d" " -f1`
        site=`echo $buffer | cut -d" " -f2`
        if [ "$user" = "$checkuser" -a "$site" = "$checksite" ]
        then
          echo "$user visited the prohibited site $site"
        fi
      done < netnasties
    done < netwatch
  done

At the moment, we simply print out the user and site combination - no count is provided. To be really effective, we should parse the file containing the user/site combinations (netwatch), register any user/prohibited site combinations and then, when we have all the combinations and the count per combination, produce a report. Given our datafile checking function, the pseudo code might look like:

  if data_files_exist
    ...
  else
    exit 1
  fi

  check_netwatch_file
  produce_report
  exit

It might also be an idea to build in a "default" - if no username(s) are given on the command line, we go and get all the users from the /etc/passwd file:

  if [ $1 ]
  then
    the_user_list=$*
  else
    get_passwd_users
  fi

Exercise

8.14 Write the shell function get_passwd_users. This function goes through the /etc/passwd file and creates a list of usernames.
(Hint: username is field one of the password file; the delimiter is ":".)

eval the wonderful!

eval is perhaps one of the more difficult concepts in shell programming to grasp. eval effectively says "parse (or execute) the following twice". What this means is that any shell variables that appear in the string are "substituted" with their real value on the first parse, then used as-they-are for the second parse.

The use of this is difficult to explain without an example, so we'll refer back to our case study problem. The real challenge to this program is how to actually store a count of the user and site combination. The following is how I'd do it:

  checkfile()
  {
    # Goes through the netwatch file and saves user/site
    # combinations involving sites that are in the "restricted"
    # list
    while read buffer
    do
      username=`echo $buffer | cut -d" " -f1`   # Get the username
      # Remove "."'s from the string
      site=`echo $buffer | cut -d" " -f2 | sed s/\\\.//g`
      for checksite in $badsites
      do
        checksite=`echo $checksite | sed s/\\\.//g`
        # Do this for the compare sites
        if [ "$site" = "$checksite" ]
        then
          usersite="$username$checksite"
          # Does the VARIABLE called $usersite exist? Note use of eval
          if eval [ \$$usersite ]
          then
            eval $usersite=\`expr \$$usersite + 1\`
          else
            eval $usersite=1
          fi
        fi
      done
    done < netwatch
  }

There are only two really tricky lines in this function:

1.

  site=`echo $buffer | cut -d" " -f2 | sed s/\\\.//g`

This creates a variable site; if buffer (one line of netwatch) contained rabid.dog.com, then site would become:

  rabiddogcom

The reason for this is because of the variable usersite:

  usersite="$username$checksite"

What we are actually creating is a variable name, stored in the variable usersite - why (you still ask) did we remove the "."'s? This becomes clearer when we examine the second tricky line:

2.

  eval $usersite=\`expr \$$usersite + 1\`

Remember that eval "double" or "pre" parses a line - after eval has been run, you get a line which looks something like:

  # assuming $usersite contains "jamiesobrabiddogcom"
  jamiesobrabiddogcom=`expr $jamiesobrabiddogcom + 1`

What should become clearer is this: the function reads each line of the netwatch file. If the site in the netwatch file matches one of the sites stored in the netnasties file (which has been cat'ed into the variable badsites) then we store the user/site combination. We do this by first checking if there exists a variable by the name of the user/site combination - if one does exist, we add 1 to the value stored in the variable. If there wasn't a variable with the name of the user/site combination, then we create one by assigning it to "1".

At the end of the function, we should have variables in memory for all the user/prohibited site combinations found in the netwatch file, something like:

  jamiesobmucusslimecom=3
  tonsloyemucusslimecom=1
  tonsloyexboysfunnetcomfr=3
  tonsloyewarezundergr=1
  rootwarezundergr=4

Note that this would be the case even if we were only interested in the users root and jamiesob. So why didn't we check in the function if the user in the netwatch file was one of the users we were interested in? Why should we!? All that does is add an extra loop:

  for every line in the file
    for every site
      for every user
        check/create variable if the user and the site are in the
        userlist and badsitelist

whereas all we have now is:

  for every line in the file
    for every site
      create variable if the site is in the badsitelist

We are still going to have to go through every user/badsite combination anyway when we produce the report - why add the extra complexity?
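If the double-parse idea is still murky, here is a minimal stand-alone sketch you can try at the prompt (the variable name counter1 is invented for the demonstration):

  Shell_Prompt: varname="counter1"
  Shell_Prompt: eval $varname=5
  Shell_Prompt: eval echo \$$varname
  5
  Shell_Prompt: eval $varname=\`expr \$$varname + 1\`
  Shell_Prompt: echo $counter1
  6

On the first parse of eval $varname=5, the shell substitutes $varname, producing counter1=5; on the second parse (performed by eval) this assignment is executed. The escaped characters (\$ and \`) survive the first parse untouched and only take effect on the second.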
You might also note that there is minimal file IO - the datafiles are only read ONCE; lists (memory structures) may be read more than once.

Exercise

8.15 Given the checkfile function, write a function called produce_report that accepts a list of usernames and finds all user/badsite combinations stored by checkfile. This function should echo lines that look something like:

  jamiesob: mucus.slime.com 3
  tonsloye: mucus.slime.com 1
  tonsloye: xboys.funnet.com.fr 3
  tonsloye: warez.under.gr 1

Step-by-step

In this section, we will examine a complex shell programming problem and work our way through the solution.

The problem

This problem is an adaptation of the problem used in the 1997 shell programming assignment for systems administration:

Problem Definition

Your department's FTP server provides anonymous FTP access to the /pub area of the filesystem - this area contains subdirectories (given by unit code) which contain resource materials for the various subjects offered. You suspect that this service isn't being used any more with the advent of the WWW; however, before you close this service and use the file space for something more useful, you need to prove this.

What you require is a program that will parse the FTP logfile and produce usage statistics on a given subject. This should include:

* Number of accesses per user
* Number of bytes transferred
* The number of machines which have used the area

The program will probably be called from other scripts. It should accept (from the command line) the subject (given by the subject code) that it is to examine, followed by one or more commands. Valid commands will consist of:

* USERS - get a user and access count listing
* BYTES - bytes transmitted for the subject
* HOSTS - number of unique machines which have used the area

Background information

A cut down version of the FTP log will be examined by our program - it will consist of:

  remote host name
  file size in bytes
  name of file
  local username or, if guest, ID string given (anonymous FTP password)

For example:

  aardvark.com 2345 /pub/85349/lectures.tar.gz flipper@aardvark.com
  138.77.8.8 112 /pub/81120/cpu.gif sloth@topaz.cqu.edu.au

The FTP logfile will be called /var/log/ftp.log - we need not concern ourselves with how it is produced (for those that are interested - look at man ftpd for a description of the real log file). Anonymous FTP "usernames" are recorded as whatever the user types in as the password - while this may not be accurate, it is all we have to go on.

We can assume that all directories containing subject material branch off the /pub directory, e.g.:

  /pub/85321
  /pub/81120

Expected interaction

The following are examples of interaction with the program (scanlog):

  Shell_Prompt: scanlog 85321 USERS
  jamiesob@jasper.cqu.edu.au 1
  b.spice@sworld.cqu.edu.au 22
  jonesd 56

  Shell_Prompt: scanlog 85321 BYTES
  2322323

  Shell_Prompt: scanlog 85321 HOSTS
  5

  Shell_Prompt: scanlog 85321 BYTES USERS
  2322323
  jamiesob@jasper.cqu.edu.au 1
  b.spice@sworld.cqu.edu.au 22
  jonesd 56

Solving the problem

How would you solve this problem? What would you do first?

Break it up

What does the program have to do? What are its major parts?
Let's look at the functionality again - our program must:

* get a user and access count listing
* produce the byte count on files transmitted for the subject
* list the number of unique machines which have used the area and how many times

To do this, our program must first:

* read parameters from the command line, picking out the subject we are interested in
* go through the other parameters one by one, acting on each one, calling the appropriate function
* terminate/clean up

So, this looks like a program containing three functions. Or is it?

Look at the simple case first

It is often easier to break down a problem by walking through a simple case first. Let's imagine that we want to get information about a subject - let's use the code 85321. At this stage, we really don't care what the action is. What happens?

* The program starts.
* We extract the first parameter from the command line. This is our subject. We might want to check if there is a first parameter - is it blank?
* Since we are only interested in this subject, we might want to go through the FTP log file and extract those entries we're interested in and keep this information in a temporary file. Our other option is to do this for every different "action" - this would probably be inefficient.
* Now, we want to go through the remaining parameters on the command line and act on each one. Maybe we should signal an error if we don't understand the action?
* At the end of our program, we should remove any temporary files we use.

Pseudo Code

If we were to pseudo code the above steps, we'd get something like:

  # Check to see if the first parameter is blank
  if first_parameter = ""
  then
    echo "No unit specified"
    exit
  fi

  # Find all the entries we're interested in, place this in a TEMPFILE

  # Right - for every other parameter on the command line, we perform
  # some action
  for ACTION in other_parameters
  do
    # Decide if it is a valid action - act on it or give an error
  done

  # Remove Temp file
  rm TEMPFILE

Let's code this:

  if [ "$1" = "" ]
  then
    echo "No unit specified"
    exit 1
  fi

  # Remove $1 from the parm line
  UNIT=$1
  shift

  # Find all the entries we're interested in
  grep "/pub/$UNIT" $LOGFILE > $TEMPFILE

  # Right - for every other parameter on the command line, we perform some action
  for ACTION in $@
  do
    process_action "$ACTION"
  done

  # Remove Temp file
  rm $TEMPFILE

Ok, a few points to note:

* Notice the use of the variables LOGFILE and TEMPFILE? These would have to be defined somewhere above the code segment.
* We remove the first parameter from the command line and assign it to another variable. We do this using the shift command.
* We use grep to find all the entries in the original log file that refer to the subject we are interested in. We store these entries in a temporary file.
* The use of $@ in the loop to process the remaining parameters is important. Why did we use it? Why not $*? Hint: "1 2 3 4 5 6" isn't "1" "2" "3" "4" "5" "6", is it?
* We've invented a new function, process_action - we will use this to work out what to do with each action. Note that we are passing the function a parameter. Why are we enclosing it in quotes? Does it matter if we don't?

Actually, in this case, it doesn't matter if we call the function with the parameter in quotes or not, as our parameters are expected to be single words.
Expand function process_action
We have a function to work on - process_action. Again, we should pseudo code it, then implement it. Wait! We haven't first thought about what we want it to do - always a good idea to think before you code!

This function must take a parameter, determine if it is a valid action, then perform some action on it. If it is an invalid action, then we should signal an error.

Let's try the pseudo code first:

process_action()
{
  # Now, Check what we have
  case Action in
    BYTES then do a function to get bytes
    USERS then do a function to get a user list
    HOSTS then do a function to get an access count
    Something Else then echo "Unknown command $theAction"
  esac
}

Right - now try the code:

process_action()
{
  # Translate to upper case
  theAction=`echo $1 | tr '[a-z]' '[A-Z]'`

  # Now, Check what we have
  case $theAction in
    USERS) getUserList ;;
    HOSTS) getAccessCount ;;
    BYTES) getBytes ;;
    *) echo "Unknown command $theAction" ;;
  esac
}

Some comments on this code:

* Note that we translate the "action command" (for example "bytes", "users") into upper case. This is a nicety - it just means that we'll pick up every typing variation of the action.
* We use the case command to decide what to do with the action. We could have just as easily used a series of IF-THEN-ELSE-ELIF-FI statements - this becomes horrendous to code and read after about three conditions, so case is a better option.
* As you will see, we've introduced calls to functions for each command - this again breaks the code up into bite-size pieces (excuse the pun ;) to code. This follows the top-down design style.

We will now expand each function.

Expand Function getUserList
Now might be a good time to revise what was required of our program - in particular, this function.

We need to produce a listing of all the people who have accessed files relating to the subject of interest and how many times they've accessed files.

Because we've separated out the entries of interest from the log file, we need no longer concern ourselves with the actual files and whether they relate to the subject. We are now just interested in the users.

Reviewing the log file format:

aardvark.com 2345 /pub/85349/lectures.tar.gz flipper@aardvark.com
138.77.8.8 112 /pub/81120/cpu.gif sloth@topaz.cqu.edu.au

We see that user information is stored in the fourth field.
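You can confirm this at the prompt before writing any code - assuming the made-up ftp.log fixture from earlier, extracting field four should produce nothing but usernames:

cut -f4 ftp.log
# flipper@aardvark.com
# sloth@topaz.cqu.edu.au
# jonesd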
If we pseudo code what we want to do, it would look something like:

for every_user_in the file
do
  go_through_the_file_and_count_occurrences
  print this out
done

Expanding this a bit more, we get:

extract_users_from_file
for user in user_list
do
  count = 0
  while read log_file
  do
    if user = current_entry
    then
      count = count + 1
    fi
  done
  echo user count
done

Let's code this:

getUserList()
{
  cut -f4 $TEMPFILE | sort > $TEMPFILE.users
  userList=`uniq $TEMPFILE.users`
  for user in $userList
  do
    {
      count=0
      while read X
      do
        if echo $X | grep $user > /dev/null
        then
          count=`expr $count + 1`
        fi
      done
    } < $TEMPFILE
    echo $user $count
  done
  rm $TEMPFILE.users
}

Some points about this code:

* The first cut extracts a user list and places it in a temp file. A unique list of users is then created and placed into a variable.
* For every user in the list, the file is read through and each line searched for the user string. We pipe grep's output into /dev/null so it isn't displayed.
* If a match is made, count is incremented.
* Finally the user/count combination is printed.
* The temporary file is deleted.

Unfortunately, this code totally sucks. Why? There are several things wrong with the code, but the most outstanding problem is the massive and useless looping being performed - the while loop reads through the file for every user. This is bad. While loops within shell scripts are very time consuming and inefficient - they are generally avoided if, as in this case, they can be. More importantly, this script doesn't make use of UNIX commands which could simplify (and speed up!) our code. Remember: don't re-invent the wheel - use existing utilities where possible.

Let's try it again, this time without the while loop:

getUserList()
{
  cut -f4 $TEMPFILE | sort > $TEMPFILE.users   # Get user list
  userList=`uniq $TEMPFILE.users`
  for user in $userList                        # for every user...
  do
    count=`grep $user $TEMPFILE.users | wc -l` # count how many times they are
    echo $user $count                          # in the file
  done
  rm $TEMPFILE.users
}

Much better! We've replaced the while loop with a simple grep command - however, there are still problems:

* We don't need the temporary file
* Can we wipe out a few more steps?

Next cut:

getUserList()
{
  userList=`cut -f4 $TEMPFILE | sort | uniq`
  for user in $userList
  do
    echo $user `grep $user $TEMPFILE | wc -l`
  done
}

Beautiful! Or is it? What about:

echo `cut -f4 $TEMPFILE | sort | uniq -c`

This does the same thing... or does it? If we didn't care what our output looked like, then this'd be ok - find out what's wrong with this code by trying it and the previous segment - compare the results. Hint: uniq -c produces a count of every sequential occurrence of an item in a list. What would happen if we removed the sort? How could we fix our output "problem"?
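If you get stuck comparing the two, here is one possible repair - a sketch only. uniq -c prints the count before the item (with leading spaces), and its one-item-per-line output shouldn't be flattened by echo; a small awk stage can swap the fields back into the "user count" order the earlier version produced:

getUserList()
{
  # sort is still required - uniq only collapses adjacent duplicates
  cut -f4 $TEMPFILE | sort | uniq -c | awk '{ print $2, $1 }'
}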
Expand Function getAccessCount
This function requires the total number of unique hosts which have accessed the files. Again, as we've already separated out the entries of interest into a temporary file, we can just concentrate on the hosts field (field number one).

If we were to pseudo code this:

create_unique_host_list
count = 0
for host in host_list
do
  count = count + 1
done
echo count

From the previous function, we can see that a direct translation from pseudo code to shell isn't always efficient. Could we skip a few steps and try the efficient code first? Remember - we should try to use existing UNIX commands.

How do we create a unique list? The hint is in the word unique - the uniq command is useful in extracting unique listings. What are we going to use as the input to the uniq command? We want a list of all hosts that accessed the files - the host is stored in the first field of every line in the file. Next hint - when we see the word "field" we can immediately assume we're going to use the cut command. Do we have to give cut any parameters? In this case, no. cut assumes (by default) that fields are separated by tabs - in our case, this is true. However, if the delimiter was anything else, we'd have to use a "-d" switch, followed by the delimiter. (Remember too that uniq only removes adjacent duplicates, so the list must be sorted first.)

Next step - what about the output from uniq? Where does this go? We said that we wanted a count of the unique hosts - another hint - counting usually means using the wc command. The wc command (or word count command) counts characters, words and lines. If the output from the uniq command was one host per line, then a count of the lines would reveal the number of unique hosts.

So what do we have?

cut -f1
sort
uniq
wc -l

Right - how do we get input and save output for each command? A first cut approach might be:

cat $TEMPFILE | cut -f1 | sort > $TEMPFILE.cut
cat $TEMPFILE.cut | uniq > $TEMPFILE.uniq
COUNT=`cat $TEMPFILE.uniq | wc -l`
echo $COUNT

This is very inefficient; there are several reasons for this:

* We cat a file THREE times to get the count. We don't even have to use cat if we really try.
* We use temp files to store results - we could use a shell variable (as in the second last line) but is there any need for this? Remember, file IO is much slower than assignments to variables, which, depending on the situation, is slower again than using pipes.
* There are four lines of code - this can be completed in one!

So, removing these problems, we are left with:

getAccessCount()
{
  echo `cut -f1 $TEMPFILE | sort | uniq | wc -l`
}

How does this work?

* The shell executes what's between `` and the result is echoed.
* This command starts with the cut command - a common misconception is that cut requires input to be piped into it - however, cut works just as well by accepting the name of a file to work with. The output from cut (a list of hosts) is sorted, then piped into uniq.
* uniq then removes all duplicate hosts from the list - this is piped into wc.
* wc then counts the number of lines - the output is displayed.

Expand Function getBytes
The final function we have to write (Yes! We are nearly finished) counts the total byte count of the files that have been accessed. This is actually a fairly simple thing to do, but as you'll see, using shell scripting to do this can be very inefficient.

First, some pseudo code:

total = 0
while read line from file
do
  extract the byte field
  add this to the total
done
echo total

In shell, this looks something like:

getBytes()
{
  bytes=0
  while read X
  do
    bytefield=`echo $X | cut -f2`
    bytes=`expr $bytes + $bytefield`
  done < $TEMPFILE
  echo $bytes
}

...which is very inefficient (remember: looping is bad!). In this case, every iteration of the loop causes three new processes to be created, two for the first line, one for the second - creating processes takes time!

The following is a bit better:

getBytes()
{
  list=`cut -f2 $TEMPFILE`
  bytes=0
  for number in $list
  do
    bytes=`expr $bytes + $number`
  done
  echo $bytes
}

The above segment of code still has looping, but is more efficient with the use of a list of values which must be added up. However, we can get smarter:

getBytes()
{
  numstr=`cut -f2 $TEMPFILE | sed "s/$/ + /g"`
  expr $numstr 0
}

Do you see what we've done? The cut operation produces a list of numbers, one per line. When this is piped into sed, the end-of-line is substituted with " + " - note the spaces.
This is then combined into a single line string and stored in the variable numstr. We then get the expr of this string - why do we put the 0 on the end?

Two reasons. First, after the sed operation, there is an extra "+" on the end - for example, if the input was:

2
3
4

The output would be:

2 +
3 +
4 +

This, when placed in a shell variable, looks like:

2 + 3 + 4 +

...which, when evaluated, gives an error. Thus, placing a 0 at the end of the string matches the final "+" sign, and expr is happy.

Second, what if there wasn't a byte count? What if there were no entries? expr without parameters doesn't work - expr with 0 does.

So, is this the most efficient code? Within the shell, yes. Probably the most efficient code would be a call to awk and the use of some awk scripting; however, that is beyond the scope of this chapter and should be examined as a personal exercise. (A possible sketch appears after the final listing below.)

A final note about the variables
Throughout this exercise, we've referred to $TEMPFILE and $LOGFILE. These variables should be set at the top of the shell script. LOGFILE refers to the location of the FTP log. TEMPFILE is the actual file used to store the entries of interest. This must be a unique file and should be deleted at the end of the script. It'd be an excellent idea to store this file in the /tmp directory (just in case your script dies and you leave the temp file lying around - /tmp is regularly cleaned out by the system) - it would be an even better idea to guarantee its uniqueness by including the process ID ($$) somewhere within its name:

LOGFILE="/var/log/ftp.log"
TEMPFILE="/tmp/scanlog.$$"

The final program - a listing
The following is the completed shell script - notice how short the code is (think of what it would be like if we hadn't been pushing for efficiency!).

#!/bin/sh
#
# FILE:     scanlog
# PURPOSE:  Scan FTP log
# AUTHOR:   Bruce Jamieson
# HISTORY:  DEC 1997  Created
#
# To do : Truly astounding things.
# Apart from that, process a FTP log and produce stats

#--------------------------
# globals

LOGFILE="/var/log/ftp.log"
TEMPFILE="/tmp/scanlog.$$"

# functions

#----------------------------------------
# getAccessCount
# - display number of unique machines that have accessed the area

getAccessCount()
{
  echo `cut -f1 $TEMPFILE | sort | uniq | wc -l`
}

#-------------------------------------------------------
# getUserList
# - display the list of users who have accessed this area

getUserList()
{
  userList=`cut -f4 $TEMPFILE | sort | uniq`
  for user in $userList
  do
    echo $user `grep $user $TEMPFILE | wc -l`
  done
}

#-------------------------------------------------------
# getBytes
# - calculate the number of bytes transferred

getBytes()
{
  numstr=`cut -f2 $TEMPFILE | sed "s/$/ + /g"`
  expr $numstr 0
}

#------------------------------------------------------
# process_action
# Based on the passed string, calls one of three functions
#

process_action()
{
  # Translate to upper case
  theAction=`echo $1 | tr '[a-z]' '[A-Z]'`

  # Now, Check what we have
  case $theAction in
    BYTES) getBytes ;;
    USERS) getUserList ;;
    HOSTS) getAccessCount ;;
    *) echo "Unknown command $theAction" ;;
  esac
}

#---- Main
#

if [ "$1" = "" ]
then
  echo "No unit specified"
  exit 1
fi

UNIT=$1

# Remove $1 from the parm line
shift

# Find all the entries we're interested in
grep "/pub/$UNIT" $LOGFILE > $TEMPFILE

# Right - for every other parameter on the command line, we perform
# some action
for ACTION in "$@"
do
  process_action "$ACTION"
done

# Remove Temp file
rm $TEMPFILE

# We're finished!
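For the curious, the "personal exercise" mentioned above might look something like the sketch below - an awk version of getBytes that does all the work in a single process, plus a trap that guarantees the temporary file is removed even if the script is interrupted. Treat both as illustrative, not as part of the marked solution:

# getBytes in one process: awk sums field two and prints the total
# (the "+ 0" forces a numeric 0 if the file happens to be empty)
getBytes()
{
  awk -F'\t' '{ total += $2 } END { print total + 0 }' $TEMPFILE
}

# Near the top of the script: clean up on exit (0), hangup (1),
# interrupt (2) and terminate (15)
trap 'rm -f $TEMPFILE' 0 1 2 15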
Final notes
Throughout this chapter we have examined shell programming concepts including:

* variables
* comments
* condition statements
* repeated action commands
* functions
* recursion
* traps
* efficiency, and
* structure

Be aware that different shells support different syntax - this chapter has dealt with bourne shell programming only. As a final issue, you should at some time examine the Perl programming language, as it offers the full functionality of shell programming but with added, compiled-code-like features - it is often useful in some of the more complex system administration tasks.

Review Questions
8.1
Write a function that equates the username in the scanit program with the user's full name and contact details from the /etc/passwd file. Modify scanit so its output looks something like:

*** Restricted Site Report ***

The following is a list of prohibited sites, users who have visited them and on how many occasions

Bruce Jamieson x9999 mucus.slime.com 3
Elvira Tonsloy x1111 mucus.slime.com 1
Elvira Tonsloy x1111 xboys.funnet.com.fr 3
Elvira Tonsloy x1111 warez.under.gr 1

(Hint: the fifth field of the passwd file usually contains the full name and phone extension (sometimes))

8.2
Modify scanit so it produces a count of unique user/badsite combinations like the following:

*** Restricted Site Report ***

The following is a list of prohibited sites, users who have visited them and on how many occasions

Bruce Jamieson x9999 mucus.slime.com 3
Elvira Tonsloy x1111 mucus.slime.com 1
Elvira Tonsloy x1111 xboys.funnet.com.fr 3
Elvira Tonsloy x1111 warez.under.gr 1

4 User/Site combinations detected.

8.3
Modify scanit so it produces a message something like:

There were no users found accessing prohibited sites!

if there were no user/badsite combinations.

Source of scanit

#!/bin/bash
#
# AUTHOR:   Bruce Jamieson
# DATE:     Feb 1997
# PROGRAM:  scanit
# PURPOSE:  Program to analyse the output from a network
#           monitor. "scanit" accepts a list of users and
#           a list of "restricted" sites to compare with
#           the output from the network monitor.
#
# FILES:    scanit      shell script
#           netwatch    output from network monitor
#           netnasties  restricted site file
#
# NOTES:    This is a totally made up example - the names
#           of persons or sites used in data files are
#           not in any way related to reality - any
#           similarity is purely coincidental :)
#
# HISTORY:  bleak and troubled :)
#

checkfile()
{
  # Goes through the netwatch file and saves user/site
  # combinations involving sites that are in the "restricted"
  # list
  while read buffer
  do
    username=`echo $buffer | cut -d" " -f1`
    site=`echo $buffer | cut -d" " -f2 | sed s/\\\.//g`
    for checksite in $badsites
    do
      checksite=`echo $checksite | sed s/\\\.//g`
      # echo $checksite $site
      if [ "$site" = "$checksite" ]
      then
        # usersite names a dynamically-created counter variable
        usersite="$username$checksite"
        if eval [ \$$usersite ]
        then
          eval $usersite=\`expr \$$usersite + 1\`
        else
          eval $usersite=1
        fi
      fi
    done
  done < netwatch
}

produce_report()
{
  # Goes through all possible combinations of users and
  # restricted sites - if a variable exists with the combination,
  # it is reported
  for user in $*
  do
    for checksite in $badsites
    do
      writesite=`echo $checksite`
      checksite=`echo $checksite | sed s/\\\.//g`
      usersite="$user$checksite"
      if eval [ \$$usersite ]
      then
        eval echo "$user: $writesite \$$usersite"
        usercount=`expr $usercount + 1`
      fi
    done
  done
}

get_passwd_users()
{
  # Creates a user list based on the /etc/passwd file
  while read buffer
  do
    username=`echo $buffer | cut -d":" -f1`
    the_user_list=`echo $username $the_user_list`
  done < /etc/passwd
}

check_data_files()
{
  if [ -r netwatch -a -r netnasties ]
  then
    return 0
  else
    return 1
  fi
}

# Main Program

# Uncomment the next line for debug mode
#set -x

if check_data_files
then
  echo "Datafiles found"
else
  echo "One of the datafiles missing - exiting"
  exit 1
fi

usercount=0
badsites=`cat netnasties`

if [ "$1" ]
then
  the_user_list=$*
else
  get_passwd_users
fi

echo
echo "*** Restricted Site Report ***"
echo
echo The following is a list of prohibited sites, users who have
echo visited them and on how many occasions
echo

checkfile
produce_report $the_user_list
echo

if [ $usercount -eq 0 ]
then
  echo "There were no users found accessing prohibited sites!"
else
  echo "$usercount prohibited user/site combinations found."
fi

echo
echo

# END scanit

send questions to dachronic1@hotmail.com

happy shelling