Named Pipes In Bash
Here’s something that somebody just recently told me. This is a useful and completely non-obvious tip, so I’m documenting it here in hopes that others will find it useful.
My specific problem was that I wanted to diff the output of two processes. You can’t do this with normal shell redirection; the best you can do is:
- send the output of one process to a tempfile
- diff the tempfile and the output of the second process, i.e.:
# /sbin/lsmod >/tmp/lsmod.tmp
# ssh sys1 /sbin/lsmod | diff /tmp/lsmod.tmp -
That works, but it requires a temporary file.
The answer to this problem is process substitution. Bash has two
process substitution operators, <(foo) and >(foo), which are very
poorly explained in the bash man page (and trust me, I’ve read that
man page a lot). For <(foo), foo can be any command that produces
output on stdout. Bash runs the command, attaches its output to a
named pipe, and replaces the operator with the name of that pipe. You
can then read the command’s stdout from that name as you would from a
regular file. In this case, the output of foo might be fed through the
file /dev/fd/64.
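To see the substitution happen, it can help to print what the operator expands to. The snippet below is a minimal sketch, assuming a Linux-like system where bash uses /dev/fd; the variable names are only for illustration, and the descriptor number varies from run to run.

```shell
#!/usr/bin/env bash
# echo never reads the pipe; it just prints the name that bash
# substituted for <(true), typically something like /dev/fd/63.
name=$(echo <(true))
printf 'bash replaced <(true) with: %s\n' "$name"

# Whatever the name looks like, the thing behind it is a pipe (-p).
if [ -p <(true) ]; then
    kind="pipe"
else
    kind="not a pipe"
fi
printf 'it is a %s\n' "$kind"
```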
Now our diff example can be written like this:
# diff <(/sbin/lsmod) <(ssh sys1 /sbin/lsmod)
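If you don't have a second machine handy, the same pattern can be tried locally. This is only a sketch: the two functions are hypothetical stand-ins for /sbin/lsmod and ssh sys1 /sbin/lsmod, and the module names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the local and remote lsmod output.
local_modules()  { printf '%s\n' ext4 nfs usbcore; }
remote_modules() { printf '%s\n' ext4 usbcore xfs; }

# Each <(...) becomes a file name that diff opens and reads.
# diff exits with status 1 when the inputs differ, so "|| true"
# keeps that from being treated as a failure.
result=$(diff <(local_modules) <(remote_modules) || true)
printf '%s\n' "$result"
```

The output should list nfs as present only on the first side and xfs as present only on the second; no temporary file is created.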
Note that you can also stuff stdout into a named pipe with the >(foo) operator, so the output of one process can be sent into another process that you invoke on the command line. This is more flexible than the standard pipe mechanism.
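Here is a small sketch of the >(foo) direction. It assumes bash 4.4 or later, where $! refers to the most recent process substitution so that wait can block on it; the temporary directory and file name are illustrative only.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# stdout of printf is redirected to the name that >(sort ...) expands
# to, so the three lines arrive on the stdin of sort.
printf '%s\n' gamma alpha beta > >(sort > "$workdir/sorted.txt")

# The substituted process runs asynchronously. In bash 4.4 and later,
# $! holds its PID, so wait blocks until sort has written its file.
wait $!

sorted=$(cat "$workdir/sorted.txt")
printf '%s\n' "$sorted"
rm -rf "$workdir"
```

Without the wait, the script could read sorted.txt before sort has finished writing it. The same operator lets tee write to several >(cmd) names at once, which a single plain pipe cannot do.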
Comments
yeah.. still don't get it
This is amazing! Thanks!
You know what a pipe is, but don't forget, this is about NAMED pipes. A name is something found in the filesystem. Normally a name is attached to a file, but it could be attached to a device like /dev/null, or it could be attached to a pipe. So if you have a command that takes input from two names (you thought they were files, but now you realize that files are a special case of the more general "names"), then the <(cmd) notation simply creates names that represent the pipe from stdout attached to the command (cmd). Simple.
This is very useful if you need to merge sort some large files:
sort -m <(zcat file.1.gz) <(zcat file.2.gz) <(zcat file.3.gz) ... | gzip -c > merged_file.gz
I've always just known this as "process substitution"... but maybe that is not the correct terminology for it.
Or, zcat *.gz | sort -m | gzip -c > mergedfile.gz
This is useful but it's not a named pipe. A named pipe is useful for other things (for example, simple interprocess communication)
http://en.wikipedia.org/wik...
This is process substitution, not named pipes. Named pipes are the result of "mkfifo" or "mknod foo p".
The title should be ProcessSubstitutionInBash, as this is not about named pipes per se.
I guess the confusion comes from the bash(1) manual page, which mentions named pipes in the context of explaining how and when process substitution will work. It works either by using unnamed pipes in /dev/fd, or by using named pipes, depending on what's supported by the host OS. So, here on my Linux system (and probably on yours, too), it actually uses unnamed pipes:
$ ls -l <(/sbin/lsmod)
lr-x------ 1 random users 64 нов 11 14:14 /dev/fd/63 -> pipe:[14989610]
Here's a real named pipe example for the same task:
# create a named pipe and name it, unimaginatively, "/tmp/named.pipe"
$ mkfifo /tmp/named.pipe
# check out our new named FIFO
$ ls -l /tmp/named.pipe
prw-r--r-- 1 random users 0 нов 11 14:06 /tmp/named.pipe
$ /sbin/lsmod > /tmp/named.pipe &
$ ssh sys1 /sbin/lsmod | diff - /tmp/named.pipe
P.S. Also note that on most Linux systems, you don't need superuser privileges to run lsmod(8).
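The named-pipe example above leaves /tmp/named.pipe behind once it finishes. A variation that cleans up after itself might look like the sketch below; the printf commands are hypothetical stand-ins for the two lsmod invocations, and the directory comes from mktemp rather than a fixed path.

```shell
#!/usr/bin/env bash
# Create the FIFO in a private directory and remove it on exit.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
fifo="$workdir/named.pipe"
mkfifo "$fifo"

# The writer blocks until diff opens the FIFO for reading, so it
# has to run in the background.
printf '%s\n' ext4 nfs usbcore > "$fifo" &

# Stand-in for: ssh sys1 /sbin/lsmod | diff - /tmp/named.pipe
# diff exits 1 when the inputs differ, hence "|| true".
result=$(printf '%s\n' ext4 usbcore xfs | diff - "$fifo" || true)
wait
printf '%s\n' "$result"
```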
The point is not to run zcat with a wildcard for all .gz files, but to hand pick different files and only merge the ones that were handpicked.
hi there,
here's how I do it. I use diffmerge. I have an alias for diffmerge(sdm). diffmerge takes 2 arguments.
sdm `run process` `run process 2`