Opening a file descriptor in a shell script is useful for two things: handling several inputs and outputs at the same time, and improving performance.
How to create a file descriptor
exec 6</tmp/foo opens the file /tmp/foo for input on file descriptor 6. This is equivalent to the system call open("/tmp/foo", O_RDONLY).
exec 7>/tmp/bar opens the file /tmp/bar for output on file descriptor 7. If the file already exists it is truncated; otherwise it is created. This is equivalent to the system call open("/tmp/bar", O_WRONLY|O_CREAT|O_TRUNC).
exec 7>>/tmp/bar opens the file /tmp/bar for appending on file descriptor 7. If the file does not exist it is created. This is equivalent to the system call open("/tmp/bar", O_WRONLY|O_APPEND|O_CREAT).
exec 6<&- closes file descriptor 6. For a descriptor opened for output, exec 7>&- does the same.
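The forms above can be tried together in a short sketch. The scratch directory and the file names in.txt and out.txt are assumptions made for illustration; they do not come from the scripts in this article.

```shell
#!/bin/bash
# Hypothetical scratch files; the names are illustrative only.
tmpdir=$(mktemp -d)
printf 'first\nsecond\n' > "$tmpdir/in.txt"

exec 6<"$tmpdir/in.txt"     # open for reading on fd 6
exec 7>"$tmpdir/out.txt"    # open for writing on fd 7 (truncates the file)

read -r line <&6            # consumes "first"
echo "$line" >&7
read -r line <&6            # same descriptor, so the offset has advanced: "second"
echo "$line" >&7

exec 6<&-                   # close the input descriptor
exec 7>&-                   # close the output descriptor

exec 7>>"$tmpdir/out.txt"   # reopen fd 7 in append mode
echo "third" >&7
exec 7>&-

result=$(cat "$tmpdir/out.txt")
echo "$result"
rm -rf "$tmpdir"
```

Because the descriptor stays open between the two read calls, the second read continues where the first one stopped instead of starting again at the top of the file.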
You can also manipulate file descriptors in other ways, such as duplicating them or redirecting standard I/O to files. The sh, ksh and bash man pages describe these operations in detail.
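As one illustration of duplicating a descriptor, the sketch below saves standard output on descriptor 3, points standard output at a file for a while, and then restores it. The choice of descriptor 3 and the temporary log file are assumptions for the example.

```shell
#!/bin/bash
logfile=$(mktemp)     # hypothetical log file, for illustration only

exec 3>&1             # duplicate stdout onto fd 3 to keep a copy
exec 1>"$logfile"     # from here on, stdout goes to the file
echo "written to the log"
exec 1>&3             # restore stdout from the saved copy
exec 3>&-             # close the spare descriptor

echo "back on the original stdout"
logged=$(cat "$logfile")
rm -f "$logfile"
```

Only the first echo ends up in the file; the second one goes wherever standard output pointed before the script started.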
Performance
When your shell script evaluates a line like "echo $x >>/tmp/file", the file is opened, the content of the variable is written, and the file is closed again. If this happens a handful of times in your script, that's fine. But if you have a loop with several thousand writes, keeping the file open on a file descriptor can dramatically improve the performance of your script.
Here is a quick test I ran with the following shell script.
    #!/bin/bash
    cat /tmp/foo | while read a
    do
        echo $a >>/tmp/bar
    done
The input file contains 2,000,000 lines. This means that the shell script is going to open, write, and close /tmp/bar 2 million times. After running the script, I get the following results.
    $ time ./bar.sh
    real 18m26.416s
    user 4m9.820s
    sys 8m42.876s
Here is the same shell script, this time using file descriptors.
    #!/bin/bash
    # open the files
    exec 6</tmp/foo
    exec 7>/tmp/bar
    # data "processing"
    cat <&6 | while read a
    do
        echo $a >&7
    done
I get these results. As you can see, the script now runs about 4 times faster in real time (18m26s versus 4m30s).
    $ time ./bar.sh
    real 4m30.501s
    user 2m22.004s
    sys 1m17.993s
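To reproduce the comparison on a smaller scale, something like the following sketch could be used. The 20000-line input, the scratch directory, and the function names append_each_line and write_via_fd are assumptions for the example, not the files measured above.

```shell
#!/bin/bash
# Scaled-down sketch: 20000 lines is an arbitrary size, not the original data set.
workdir=$(mktemp -d)
seq 1 20000 > "$workdir/foo"

# Variant 1: the output file is reopened and closed for every line.
append_each_line() {
    while read -r a
    do
        echo "$a" >>"$workdir/bar1"
    done <"$workdir/foo"
}

# Variant 2: the output file is opened once on fd 7.
write_via_fd() {
    exec 7>"$workdir/bar2"
    while read -r a
    do
        echo "$a" >&7
    done <"$workdir/foo"
    exec 7>&-
}

append_each_line
write_via_fd

# Both variants should produce identical files.
if cmp -s "$workdir/bar1" "$workdir/bar2"; then same=yes; else same=no; fi
lines=$(( $(wc -l < "$workdir/bar2") ))
echo "identical=$same lines=$lines"
rm -rf "$workdir"
```

In bash, prefixing each call with the time keyword (for example "time append_each_line") prints real, user and sys figures for that variant. The absolute numbers depend on the machine and the file system, so they will not match the timings shown above.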
Playing with several file descriptors
Here is a small example of a shell script that uses several file descriptors. It reads a file whose first three columns are "ip_address trafic_in trafic_out" and writes two files, /tmp/trafic_in.dat and /tmp/trafic_out.dat.
    #!/bin/bash
    # open the two output files
    exec 6>/tmp/trafic_in.dat
    exec 7>/tmp/trafic_out.dat
    # open the file containing the data for input
    exec 8</tmp/all_trafic.dat
    # data processing: skip comment lines, then split each line into fields
    grep -v '^#' <&8 | while read line
    do
        set -- $line
        echo "${1} ${2}" >&6
        echo "${1} ${3}" >&7
    done
    # close the file descriptors
    exec 6>&-
    exec 7>&-
    exec 8<&-
When executed, this script produces the following output:
    $ head -4 /tmp/all_trafic.dat
    # Host In (bytes) Out (bytes) Total (bytes)
    172.16.1.1 2728242803 17456323158 20184565961
    172.16.1.2 62238068877 146358768518 208596837395
    172.16.1.3 123056619706 150682371892 273738991598
    $ ./split_trafic.sh
    $
    $ head -4 /tmp/trafic_in.dat
    172.16.1.1 2728242803
    172.16.1.2 62238068877
    172.16.1.3 123056619706
    172.16.1.4 7684221078
    $ head -4 /tmp/trafic_out.dat
    172.16.1.1 17456323158
    172.16.1.2 146358768518
    172.16.1.3 150682371892
    172.16.1.4 1931647367
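As a variation on the script above, read can split the columns itself and read straight from descriptor 8, which avoids both the grep pipeline and the call to set. This is only a sketch: the scratch directory, the file names all.dat, in.dat and out.dat, and the variable names are assumptions, and the sample rows are copied from the listing above.

```shell
#!/bin/bash
# Hypothetical scratch directory and file names, for illustration only.
dir=$(mktemp -d)
cat > "$dir/all.dat" <<'EOF'
# Host In (bytes) Out (bytes) Total (bytes)
172.16.1.1 2728242803 17456323158 20184565961
172.16.1.2 62238068877 146358768518 208596837395
EOF

exec 6>"$dir/in.dat"     # traffic in
exec 7>"$dir/out.dat"    # traffic out
exec 8<"$dir/all.dat"    # input data

# read splits each line into fields; "rest" soaks up any remaining columns.
while read -r host bytes_in bytes_out rest <&8
do
    case $host in
        '#'*|'') continue ;;    # skip comment and empty lines
    esac
    echo "$host $bytes_in" >&6
    echo "$host $bytes_out" >&7
done

exec 6>&- 7>&- 8<&-      # close all three descriptors

in_result=$(cat "$dir/in.dat")
out_result=$(cat "$dir/out.dat")
rm -rf "$dir"
```

Since the loop is no longer on the right-hand side of a pipe, it runs in the current shell rather than in a subshell, so variables set inside it remain visible after the loop ends.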