xargs examples

While Unix provides many ways to accomplish the same task, not every command generates the same data and not every command that generates the same data does so with the same speed and efficiency.

In response to recently presented tips on how to use the find command most effectively, so many readers have written to suggest xargs that I feel compelled to summarize how and why this command is a better choice than the find command's -exec option for most routine tasks.

What is xargs?

Before we slide into the xargs examples and describe the efficiencies at work when this command is used, let's briefly examine the xargs command and how it is typically used.

First, the xargs command is unlike most other Unix commands in that its function is not so much to do something on its own as to construct other commands. This peculiar characteristic leads some to refer to xargs as a metacommand. To create commands, xargs takes whatever is passed to it on standard input and appends it, in order, to whatever arguments it was given.

For example, if you wanted to view next month's calendar, you might do this:

% echo 6 2001 | xargs cal

This command would have the same effect as typing cal 6 2001. While nothing is gained in this example by using xargs instead of simply typing cal 6 2001, we can see that xargs creates the command cal 6 2001 by issuing the cal command with the numbers passed to it on standard input.

If you want to see the command that xargs is creating, add the -t argument following xargs. The command echo 6 2001 | xargs -t cal would display cal 6 2001 before displaying the calendar.
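
As a minimal sketch of how xargs assembles a command line, the snippet below substitutes echo for cal so that the constructed command is printed rather than executed. The variable names built and traced are illustrative only, and the exact formatting of the -t trace can vary between xargs implementations.

```shell
# Let xargs build "echo cal 6 2001"; echo then prints the arguments it was given.
built=$(echo 6 2001 | xargs echo cal)
echo "$built"    # prints: cal 6 2001

# -t writes the constructed command to standard error before running it.
# Capture only standard error so the trace can be inspected on its own.
traced=$(echo 6 2001 | xargs -t echo cal 2>&1 >/dev/null)
echo "$traced"
```

Swapping echo in for the real command is a handy way to preview what xargs would run before letting it loose on real files.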

A more practical use of xargs would be to search for recently created large files when the used space in a file system has suddenly increased. For this, you might use a command such as this:

% find . -type f -mtime -1 -size +10000 | xargs du -sk
116576  ./websvr/access/log1.tar.gz
67536   ./websvr/access/log2.tar.gz
69136   ./websvr/access/log3.tar.gz
126296  ./websvr/access/log4.tar.gz
118672  ./websvr/access/log5.tar.gz
118344  ./websvr/access/log6.tar.gz
115040  ./websvr/access/log7.tar.gz
19424   ./websvr/access/log8.tar.gz
102816  ./websvr/access/log11.tar.gz
109520  ./websvr/access/log12.tar.gz
102832  ./websvr/access/log13.tar.gz
99824   ./websvr/access/log14.tar.gz
100112  ./websvr/access/log15.tar.gz
100752  ./websvr/access/log16.tar.gz
109872  ./websvr/access/log17.tar.gz

In this example, we have used the find command to locate files larger than 10,000 blocks that were modified within the last day. The du -sk command reports the file sizes in kilobytes. Apparently, one of our users has collected a large number of log files.

xargs versus -exec

The comments sent in by readers followed one or both of two tracks. Some people prefer xargs to -exec simply because the syntax seems a little more natural. If you have ever typed a long find command with an -exec clause only to forget the final "\;", you're no doubt familiar with the "find: incomplete statement" error. If you have trouble remembering "\;", you can replace a command of this variety:

% find . -type f -exec wc -l {} \;

with one that looks like this:

% find . -type f | xargs wc -l

The second command has a cleaner look and is also likely to run faster.

Almost everyone who wrote preferred xargs because of a fundamental efficiency that often makes the command run ten times faster, or perhaps even a thousand times faster when a huge number of files is being searched.

When find is used with -exec and "\;", the wc command is invoked separately for each located file. When find is used with xargs, the file names are collected into groups, each holding as many names as fit on one command line, and wc is called once for each group.
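
The difference can be observed with a small experiment. This is only a sketch: it assumes mktemp -d is available, and the scratch directory and file names are made up for illustration. A tiny sh -c command prints one line each time it is started, so counting lines counts invocations.

```shell
# Scratch directory with three files (names are hypothetical).
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# -exec ... \; starts the command once per file found.
per_file=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} \; | grep -c run)

# xargs hands all three names to a single invocation.
batched=$(find "$tmp" -type f | xargs sh -c 'echo run' sh | grep -c run)

echo "-exec invocations: $per_file"   # 3
echo "xargs invocations: $batched"    # 1

rm -rf "$tmp"
```

With thousands of files, the first count grows with the number of files, while the second grows only when the list of names no longer fits on one command line.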

The xargs command also makes it quite a bit easier to print the names of files along with the line contents when looking for particular strings. Let's compare three commands intended to do the same thing.

% find . -exec grep bug {} \; -print
A bug in NetBackup that will manifest itself by numerous ghost jobs
./kb/ET2155673
% find . -exec grep bug {} /dev/null \;
./kb/ET2155673:A bug in NetBackup that will manifest itself by numerous ghost jobs
% find . -print | xargs grep bug
./kb/ET2155673:A bug in NetBackup that will manifest itself by numerous ghost jobs

In the first example, we added a -print command at the end of the find command, causing the file name to be printed on the line following the located text.

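In the second, we've used an interesting trick to print the file name on the same line as the located text. The syntax of the command is anything but intuitive, and so is the reason that including /dev/null works: grep prefixes each matching line with the file name only when it is given more than one file to search, and /dev/null serves as a second file that can never contain a match.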

In the third example, we again get the file name followed by the text, but with a command that is considerably less intimidating.
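
The grep behavior behind the /dev/null trick can be checked in isolation. A sketch, assuming mktemp -d is available and using a made-up file named note:

```shell
tmp=$(mktemp -d)
printf 'a bug here\n' > "$tmp/note"

# One file argument: grep prints only the matching line.
one=$(grep bug "$tmp/note")

# Two file arguments (the second is the always-empty /dev/null):
# grep prefixes the match with the name of the file it came from.
two=$(grep bug "$tmp/note" /dev/null)

echo "$one"    # a bug here
echo "$two"    # <scratch dir>/note:a bug here

rm -rf "$tmp"
```

The xargs form shows the file name for the same reason, since it normally passes many files to each grep. If a group ever contains a single file, the prefix disappears, so adding /dev/null to the xargs command as well is a reasonable safeguard.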

If you want to print only the names of the files found, you can use a command like one of the following:

% find . -exec grep -l bug {} \;
./kb/ET2155673
% find . | xargs grep -l bug
./kb/ET2155673
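
One caveat applies to the find | xargs pipelines shown so far: by default, xargs splits its input on blanks and newlines, so a file name containing a space is treated as two arguments. Many implementations, including the GNU and BSD versions, offer find -print0 and xargs -0 to separate names with a null character instead. The sketch below assumes those options are available; the file name is invented for illustration.

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/two words.txt"

# Null-separated names survive embedded spaces intact.
lines=$(find "$tmp" -type f -print0 | xargs -0 wc -l | awk '{print $1}')
echo "$lines"    # 2

rm -rf "$tmp"
```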

More on xargs

The examples we've looked at so far have all shown xargs taking what is passed to it on standard input and appending it to the end of the command that is being created. In the first xargs command shown in this column, echo 6 2001 | xargs cal demonstrated how the fields passed on standard input were tacked on to the end of the command being created (cal 6 2001). If xargs were always this restrictive, it would still be a very useful command, but there would be many things that we couldn't easily do. We would have trouble, for example, finding a number of files and copying them to /var/tmp, because the variable part of the command would have to come last.

Fortunately, xargs has numerous options, and one of them provides a way to control where the data passed on standard input are inserted into the command being created. Using the -i option and the same kind of placeholder that -exec uses, we can take care of such tasks with a command such as the one below. (POSIX specifies this option as -I {}, and GNU xargs documents -i as deprecated in its favor.)

% find . -nouser | xargs -i cp {} /var/tmp
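
A minimal sketch of placeholder substitution, written here with the -I {} form of the option. The scratch directories and file names are assumptions for illustration, standing in for the current directory and /var/tmp.

```shell
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/one" "$src/two"

# {} marks where each input line is inserted; the destination comes last.
find "$src" -type f | xargs -I {} cp {} "$dest"

copied=$(ls "$dest" | grep -c .)
echo "$copied"    # 2

rm -rf "$src" "$dest"
```

With a placeholder, xargs runs the command once per input line, so the grouping advantage described earlier does not apply to commands built this way.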