How to execute system commands in Perl and the possible dangers¶
There are various ways to run a system subprocess in Perl. I will cover only seven: a few native ones (exec(), system(), qx{}/``) and a few that use additional modules (open("|"), IPC::Open2, IPC::Open3, IPC::Cmd). All of these modules ship with the standard Perl distribution, so they can be used without worries.
Introduction¶
Most people think that running a system command from Perl is done only with system() or exec(), but there are many ways to achieve this task - some better, some worse.
Each of them performs differently, and even the specific way a function is called can increase or decrease performance.
This post is meant to help the programmer choose the right solution for the task: one that is secure, flexible and has the best performance.
Hint
Note: quite a lot of the text in this article is copied from PerlDoc - such fragments are marked as quotations (the <cite> tag).
Executing system command - possible ways¶
IPC::Run (see its PerlDoc page) is not covered in this article - AFAIK it is not part of the standard Perl distribution on Unix/Linux.
If you don't want to scroll through everything, jump straight to the summary or the conclusion using the hyperlinks.
Running sub-process via exec()¶
exec LIST
exec PROGRAM LIST
The "exec" function executes a system command and never returns to the script that called it! It fails and returns false only if the command does not exist and it is executed directly instead of via the system command shell.
#!/usr/bin/perl
use strict;
use warnings;
my @args = ( "echo", "Hello world" );
# Example 1
exec join(" ", @args); # Very insecure! The single string is checked for
                       # shell metacharacters and, if any are found, it is
                       # run via "sh -c"; otherwise it is split on
                       # whitespace and passed to execvp.
                       # PLEASE DON'T USE THIS!!!
# Example 2
exec @args or die "No echo"; # More secure - the escape to the shell only
                             # happens when scalar @args == 1
                             # Better NOT to use this!
# Example 3 - The most secure!
exec { $args[0] } @args or die "No echo"; # The most secure example
                                          # It is safe even with a one-element list
                                          # I recommend using this!
# Example 4 - Fail!
my @secTest = ( join(" ", @args) );
exec { $secTest[0] } @secTest; # The system will try to run a program literally
                               # named "echo Hello world"; it does not exist,
                               # so exec fails and sets $!.
print "After exec"; # Never reached if any exec above succeeds (and with
                    # "use warnings" Perl warns about code after exec).
If there is more than one argument in LIST, or if LIST is an array with more than one value, calls execvp(3) with the arguments in LIST. If there is only one scalar argument or an array with one element in it, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system’s command shell for parsing (this is „/bin/sh -c” on Unix platforms, but varies on other platforms).
This means that if you are not using shell redirects (>&, >>, <, >, |), it is better to pass a LIST to exec(); passing a single string can make your application vulnerable to a shell metacharacter attack. Using an indirect object (like this: exec {'/bin/csh'} '-sh';) with "exec" or "system" is also more secure. This usage also works fine with system() - it forces interpretation of the arguments as a multivalued list.
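As a short sketch of that last point (reusing the @args array from the examples above), the indirect-object form of system() never involves the shell:
my @args = ( "echo", "Hello world" );
system { $args[0] } @args;          # no shell, even for a one-element list
die "echo failed: $?" if $? != 0;   # unlike exec(), system() returns, so $? can be checked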
Note
Perl will attempt to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the „autoflush()” method of „IO::Handle” on any open handles in order to avoid lost output.
Note that „exec” will NOT call your „END” blocks, nor will it call any „DESTROY” methods in your objects.
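A minimal sketch of flushing output before calling exec, combining $| and IO::Handle::autoflush as mentioned above (it assumes /bin/echo exists):
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;               # provides the autoflush() method

$| = 1;                       # $AUTOFLUSH for the currently selected handle (STDOUT)
STDOUT->autoflush(1);         # the equivalent IO::Handle call

print "this line will not be lost\n";
exec { '/bin/echo' } '/bin/echo', 'Hello world'
    or die "exec failed: $!";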
Be very careful with "exec()": if there is any code after the exec call, Perl (with "use warnings;") will print this message:
Warning
Statement unlikely to be reached at script.pl line XX. (Maybe you meant system() when you said exec()?)
For information on how to get rid of this warning, read perldoc -f exec.
How the execution of exec() is seen by other users
Before exec() is called in the Perl script:
|-gnome-terminal,7000
|   |-bash,7002
|   |   `-perl,7588 test.pl
|   |-bash,7378
|   |   `-pstree,7626 -a -c -p
|   |-gnome-pty-helpe,7001
|   `-{gnome-terminal},7003
After exec() is called in the Perl script:
|-gnome-terminal,7000
|   |-bash,7002
|   |   `-bash,7588
|   |-bash,7378
|   |   `-pstree,7650 -a -c -p
|   |-gnome-pty-helpe,7001
|   `-{gnome-terminal},7003
Running sub-process via system()¶
system LIST
system PROGRAM LIST
Does exactly the same thing as „exec LIST”, except that a fork is done first, and the parent process waits for the child process to complete (it is blocked for execution time of command run in system()). Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system’s command shell for parsing (this is „/bin/sh -c” on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to „execvp”, which is more efficient.
The return value is the exit status of the program as returned by the "wait" call.
To get the actual exit value, shift right by eight (e.g. $? >> 8).
A return value of -1 indicates a failure to start the program or an error of the wait(2) system call (inspect $! for the reason).
How to run an external command using system():
@args = ("command", "arg1", "arg2");
system(@args) == 0 or die "system @args failed: $?"
if ($? == -1) {
print "failed to execute: $!\n";
} elsif ($? & 127) {
printf "child died with signal %d, %s coredump\n", ($? & 127), ($? & 128) ? 'with' : 'without';
} else {
printf "child exited with value %d\n", $? >> 8;
}
Note
Perl will attempt to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles in order to avoid lost output.
SIGINT and SIGQUIT are ignored during the execution of system().
Running sub-process via qx{}/``¶
qx/STRING/
`STRING`
qx{} is a string which is (possibly) interpolated and then executed as a system command with "/bin/sh" or its equivalent, so shell wildcards, pipes, and redirections will be honored (be very careful). The collected standard output of the command is returned; standard error is unaffected. In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, it returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR), or an empty list if the command failed.
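For example, a small sketch of the context difference and of checking $? afterwards (assuming an ls command is available):
my $all_output = qx{ls -l /tmp};          # scalar context: one multi-line string
die "ls failed: $?" if $? != 0;

my @lines = qx{ls -l /tmp};               # list context: one element per line ($/)
print scalar(@lines), " lines captured\n";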
capture a command’s STDERR and STDOUT together:
$output = `cmd 2>&1`;
capture only a command’s STDOUT (discard STDERR):
$output = `cmd 2>/dev/null`;
capture only a command’s STDERR (discard STDOUT):
$output = `cmd 2>&1 1>/dev/null`;
read both a command’s STDOUT and its STDERR separately:
system("program args 1>program.stdout 2>program.stderr"); open(CMD_STDOUT, '<', program.stdout) or die("..."); open(CMD_STDERR, '<', program.stderr) or die("..."); # do sth with streams - for example slurp close(CMD_STDOUT); close(CMD_STDERR);Using single-quote as a delimiter protects the command from Perl’s double-quote interpolation, passing it on to the shell instead:
$perl_info = qx(ps $$); # that's Perl's $$ $shell_info = qx'ps $$'; # that's the new shell's $$
Note
On most platforms, you will have to protect shell metacharacters if you want them treated literally.
On some platforms the shell may not be capable of dealing with multiline commands.
There is a way to evaluate many commands on a single line (; on Unix, & in Windows CMD) - this can potentially be harmful.
Perl will attempt to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles in order to avoid lost output.
Beware that some command shells may place restrictions on the length of the command line (with no warning).
Using this operator can lead to programs that are difficult to port, because the shell commands called vary between systems.
For more information, please refer to perldoc perlop.
Running sub-process via open("|")¶
open FILEHANDLE,EXPR
open FILEHANDLE,MODE,EXPR
open FILEHANDLE,MODE,EXPR,LIST
open FILEHANDLE,MODE,REFERENCE
open FILEHANDLE
If the filename begins with |, it is interpreted as a command to which output is to be piped (whatever you write to this filehandle is passed to the standard input of the command), and if the filename ends with a |, the filename is interpreted as a command which pipes output to us.
For three or more arguments, if MODE is |-, the filename is interpreted as a command to which output is to be piped, and if MODE is -|, the filename is interpreted as a command which pipes output to us. In the 2-argument (and 1-argument) form, one should replace the dash ("-") with the command. In the three-or-more-argument form of pipe opens, if LIST is specified (extra arguments after the command name), then LIST becomes the arguments to the invoked command, if the platform supports it.
If you open a pipe on the command "-", i.e., either |- or -| with the
2-argument (or 1-argument) form of open(), then there is an
implicit fork done, and the return value of open is the pid of the
child within the parent process, and 0 within the child process. (Use
„defined($pid)” to determine whether the open was successful.) The
filehandle behaves normally for the parent, but i/o to that filehandle
is piped from/to the STDOUT/STDIN of the child process. In the child
process the filehandle isn’t opened–i/o happens from/to the new
STDOUT or STDIN. Typically this is used like the normal piped open
when you want to exercise more control over just how the pipe command
gets executed, such as when you are running setuid, and don’t want to
have to scan shell commands for metacharacters. The following triples
are more or less equivalent:
# Writing
open(SPOOLER, "| cat -v | lpr -h 2>/dev/null") || die "can't fork: $!";
local $SIG{PIPE} = sub { die "spooler pipe broke" };
print SPOOLER "stuff\n";
close SPOOLER || die "bad spool: $! $?";
# more writing
open(FOO, "|tr '[a-z]' '[A-Z]'");
open(FOO, '|-', "tr '[a-z]' '[A-Z]'");
open(FOO, '|-') || exec 'tr', '[a-z]', '[A-Z]';
open(FOO, '|-', "tr", '[a-z]', '[A-Z]');
# Reading
open(STATUS, "netstat -an 2>&1 |") || die "can't fork: $!";
while (<STATUS>) { print; }
close STATUS || die "bad netstat: $! $?";
# More reading
open(FOO, "cat -n '$file'|");
open(FOO, '-|', "cat -n '$file'");
open(FOO, '-|') || exec 'cat', '-n', $file;
open(FOO, '-|', "cat", '-n', $file);
Note
On most platforms, you will have to protect shell metacharacters if you want them treated literally. Think about this example:
$filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
open(FH, $filename) or die "Can't open $filename: $!";
On some platforms the shell may not be capable of dealing with multiline commands.
There is a way to evaluate many commands on a single line (; on Unix, & in Windows CMD) - this can potentially be harmful.
Perl will attempt to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles in order to avoid lost output.
Beware that some command shells may place restrictions on the length of the command line (with no warning).
Using this type of open() can lead to programs that are difficult to port, because the shell commands called vary between systems.
On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor as determined by the value of $^F (see $^F in perlvar).
Closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $? and ${^CHILD_ERROR_NATIVE}.
Be careful to check both the open() and the close() return values.
If you're writing to a pipe, you should also trap SIGPIPE; otherwise, think of what happens when you start up a pipe to a command that doesn't exist - your program will fail! Perl can't know whether the command worked, because your command is actually running in a separate process whose exec() might have failed. Therefore, while readers of bogus commands get just a quick end-of-file, writers to bogus commands will trigger a signal they'd better be prepared to handle.
For more examples, please refer to perldoc -f open.
Remember that the three-or-more-argument form of open(), with the command and its arguments passed as a list, is more secure, because the shell is never involved and metacharacters are not interpreted (AFAIR).
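A short sketch of that list form (assuming a Unix-like system where list-form piped opens are supported and an ls command exists): the command and its arguments are passed separately, no shell is involved, and both open() and close() are checked:
#!/usr/bin/perl
use strict;
use warnings;

open(my $pipe, '-|', 'ls', '-l', '/tmp')
    or die "Can't start ls: $!";
while (my $line = <$pipe>) {
    print $line;
}
close($pipe) or die "ls failed: $! $?";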
Running sub-process via IPC::Open2¶
use IPC::Open2;
Warning
The open2()
and open3()
functions are unlikely to work
anywhere except on a Unix system or some other one purporting to be
POSIX compliant.
IPC::Open2 is a module which allows you to open a process for both reading and writing. The open2() function runs the given $cmd and connects $chld_out for reading and $chld_in for writing. open2() is really just a wrapper around open3(), so read the information about open3(), which can handle stderr as well.
It's what you think should work when you try $pid = open(HANDLE, "|cmd args|");
Usage of the open2 function is quite easy.
use IPC::Open2;
# Using a sub-shell - be careful about shell expansions
$pid = open2(\*CHLD_OUT, \*CHLD_IN, 'some cmd and args');
# or without using the shell
$pid = open2(\*CHLD_OUT, \*CHLD_IN, 'some', 'cmd', 'and', 'args');
# or with handle autovivification
my($chld_out, $chld_in);
# Using a sub-shell - be careful about shell expansions
$pid = open2($chld_out, $chld_in, 'some cmd and args');
# or without using the shell
$pid = open2($chld_out, $chld_in, 'some', 'cmd', 'and', 'args');
Notes:
The write filehandle will have autoflush turned on
If $chld_out is a string (that is, a bareword filehandle rather than a glob or a reference) and it begins with „>&”, then the child will send output directly to that file handle
If $chld_in is a string that begins with „<&”, then $chld_in will be closed in the parent, and the child will read from it directly. In both cases, there will be a dup(2) instead of a pipe(2) made
If either reader or writer is the null string, this will be replaced by an auto generated filehandle. If so, you must pass a valid lvalue in the parameter slot so it can be overwritten in the caller, or an exception will be raised
open2() returns the process ID of the child process. It doesn’t return on failure: it just raises an exception matching „/^open2:/”. However, „exec” failures in the child are not detected. You’ll have to trap SIGPIPE yourself
open2() does not wait for and reap the child process after it exits. Except for short programs where it’s acceptable to let the operating system take care of this, you need to do this yourself. This is normally as simple as calling „waitpid $pid, 0” when you’re done with the process. Failing to do this can result in an accumulation of defunct or „zombie” processes
Using open2() can be dangerous and cause a deadlock if the program being run has to read its whole input before producing any output! Check the manual for more information. Use the Comm library and two other CPAN modules, IO::Pty and IO::Stty, to work around it (more in the manual).
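A minimal sketch of a complete open2() conversation (assuming a Unix-like system with a sort command): write a few lines, close the write handle so the child sees EOF, read the sorted result back, and reap the child with waitpid:
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;

my ($chld_out, $chld_in);
my $pid = open2($chld_out, $chld_in, 'sort');

print {$chld_in} "banana\napple\ncherry\n";
close $chld_in;                    # sort needs EOF before it can write anything

while (my $line = <$chld_out>) {
    print $line;
}
close $chld_out;

waitpid $pid, 0;                   # reap the child to avoid a zombie process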
Running sub-process via IPC::Open3¶
use IPC::Open3;
IPC::Open3's open3() is designed to open a process for reading, writing, and error handling. The effect of invoking open3() is very similar to open2(): open3() spawns the given $cmd and connects CHLD_OUT for reading from the child, CHLD_IN for writing to the child, and CHLD_ERR for errors. If CHLD_ERR is false, or the same file descriptor as CHLD_OUT, then STDOUT and STDERR of the child are on the same filehandle.
use IPC::Open3;
$pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'some cmd and args', 'optarg', ...);
my($wtr, $rdr, $err);
$pid = open3($wtr, $rdr, $err, 'some cmd and args', 'optarg', ...);
Note
The CHLD_IN filehandle will have autoflush turned on.
If CHLD_IN begins with "<&", then CHLD_IN will be closed in the parent, and the child will read from it directly.
If CHLD_OUT or CHLD_ERR begins with ">&", then the child will send output directly to that filehandle. In both cases, there will be a dup(2) instead of a pipe(2) made.
If either reader or writer is the null string, it will be replaced by an autogenerated filehandle. If so, you must pass a valid lvalue in the parameter slot so it can be overwritten in the caller, or an exception will be raised.
The filehandles may also be integers, in which case they are understood as file descriptors.
open3() returns the process ID of the child process. It doesn't return on failure: it just raises an exception matching /^open3:/. However, "exec" failures in the child (such as no such file or permission denied) are just reported to CHLD_ERR, as it is not possible to trap them.
If the child process dies for any reason, the next write to CHLD_IN is likely to generate a SIGPIPE in the parent, which is fatal by default, so you may wish to handle this signal.
open3() does not wait for and reap the child process after it exits. Except for short programs where it's acceptable to let the operating system take care of this, you need to do this yourself. This is normally as simple as calling waitpid $pid, 0 when you're done with the process. Failing to do this can result in an accumulation of defunct or "zombie" processes.
Running sub-process via IPC::Cmd¶
use IPC::Cmd qw[can_run run];
IPC::Cmd is a module which helps find and run system commands, even interactively if desired, and it is almost platform independent (as far as running system commands can be platform independent ;)).
The "can_run" function tells you whether a certain binary is installed and, if so, where - exactly like `which` in a shell. The "run" function actually executes any command you give it and returns a clear result, as well as adhering to your verbosity settings.
use IPC::Cmd qw[can_run run];
my $full_path = can_run('wget') or warn 'wget is not installed!';
### commands can be arrayrefs or strings ###
my $cmd = "$full_path -b theregister.co.uk";
my $cmd = [$full_path, '-b', 'theregister.co.uk'];
### in scalar context ###
my $buffer;
if( scalar run( command => $cmd,
verbose => 0,
buffer => \$buffer )
) {
print "fetched webpage successfully: $buffer\n";
}
### in list context ###
my( $success, $error_code, $full_buf, $stdout_buf, $stderr_buf ) =
run( command => $cmd, verbose => 0 );
if( $success ) {
print "this is what the command printed:\n";
print join "", @$full_buf;
}
### check for features
print "IPC::Open3 available: " . IPC::Cmd->can_use_ipc_open3;
# this will probably be false (IPC::Run is not in the core distribution)
print "IPC::Run available: " . IPC::Cmd->can_use_ipc_run;
print "Can capture buffer: " . IPC::Cmd->can_capture_buffer;
Note
If IPC::Run is available and the variable $IPC::Cmd::USE_IPC_RUN is set to true, IPC::Run will be used to run the command (full output available in buffers, interactive commands are sure to work, and you are guaranteed to have your verbosity settings honored cleanly).
Otherwise, if the variable $IPC::Cmd::USE_IPC_OPEN3 is set to true, the command will be executed using IPC::Open3. Buffers will be available on all platforms except Win32, interactive commands will still execute cleanly, and your verbosity settings will be adhered to nicely.
Otherwise, if the verbose argument is set to true, the module will fall back to a simple system() call; buffers cannot be captured, but interactive commands should still work.
Otherwise IPC::Cmd will try to temporarily redirect STDERR and STDOUT, do a system() call with your command and then re-open STDERR and STDOUT. This is the method of last resort and will still allow you to execute your commands cleanly. However, no buffers will be available.
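For example, the preferred backend can be selected explicitly through those package variables (a sketch; IPC::Run must actually be installed for the first flag to have any effect):
use IPC::Cmd qw[run];

$IPC::Cmd::USE_IPC_RUN   = 1;   # prefer IPC::Run if it is installed
# $IPC::Cmd::USE_IPC_OPEN3 = 1; # ...or force the IPC::Open3 backend instead

my( $success, $error_message, $full_buf, $stdout_buf, $stderr_buf ) =
    run( command => [ 'echo', 'hello' ], verbose => 0 );
print "command succeeded\n" if $success;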
Warning
Whitespace - When you provide a string as this argument, the string will be split on whitespace to determine the individual elements of your command. Although this will usually just Do What You Mean, it may break if you have files or commands with whitespace in them. If you do not wish this to happen, you should provide an array reference, where all parts of your command are already separated out. Note, however, that if there is extra or spurious whitespace in these parts, the parser or underlying code may not interpret it correctly and may cause an error.
# Bash command: gzip -cdf foo.tar.gz | tar -xf -
# should be passed as
my $cmd = "gzip -cdf foo.tar.gz | tar -xf -";
# or as
$cmd = ['gzip', '-cdf', 'foo.tar.gz', '|', 'tar', '-xf', '-'];
# but not as:
# $cmd = ['gzip -cdf foo.tar.gz', '|', 'tar -xf -']; # WRONG!
Summary¶
| Action | exec() | system() | qx{}/`` | open (pipe) | IPC::Open2 | IPC::Open3 | IPC::Cmd |
|---|---|---|---|---|---|---|---|
| Capture STDOUT | NO | LIMITED* | YES | YES | YES | YES | YES |
| Capture STDERR | NO | LIMITED* | LIMITED* | LIMITED* | NO | YES | YES |
| Run interactive commands | YES | YES | NO | LIMITED** | LIMITED** | LIMITED** | YES |
| Get return value | NO | YES | YES | YES | YES | YES | YES |
| Run a sub-shell (sh -c) | YES | YES | YES | YES | YES | YES | YES |
| Present in STD Perl dist | YES | YES | YES | YES | YES | YES | YES |

LIMITED* - via shell redirect
LIMITED** - only writing/reading to/from command
Conclusion¶
In this article I have presented many ways of running a subprocess from Perl. In my opinion the best one is IPC::Cmd, because it is the most portable - as far as running system commands can be portable at all. IPC::Cmd provides an easy-to-understand interface for interacting with the host system, and if you want to capture STDOUT and STDERR it is the right choice, although it is probably not as fast as IPC::Open3.
If you don't like this approach, you can use the system() function with additional shell redirects - but it will be much slower.
For more interesting information about how Perl interacts with your system, you should read PerlFaq.