I am calling different processes with the
subprocess module. However, I have a question.
In the following code:
callProcess = subprocess.Popen(['ls', '-l'], shell=True)
callProcess = subprocess.Popen(['ls', '-l']) # without shell
Both work. After reading the docs, I came to know that
shell=True means executing the code through the shell. So that means that in its absence, the process is started directly.
So what should I prefer for my case? I need to run a process and get its output. What benefit do I have from calling it from within the shell or outside of it?
The benefit of not calling via the shell is that you are not invoking a 'mystery program.' On POSIX, shell=True runs /bin/sh (unless you pass executable=), and which shell that actually is (bash, dash, or something else) depends on the system. On Windows, there is no Bourne shell descendant, only cmd.exe, which is located through the COMSPEC environment variable.
So invoking the shell invokes a program that your code does not choose and that is platform-dependent. Generally speaking, avoid invocations via the shell.
Invoking via the shell does allow you to expand environment variables and file globs according to the shell's usual mechanism. On POSIX systems, the shell expands file globs to a list of files. On Windows, a file glob (e.g., "*.*") is not expanded by the shell, anyway (but environment variables on a command line are expanded by cmd.exe).
If you think you want environment variable expansions and file globs, research the
ILS attacks of 1992-ish on network services which performed subprogram invocations via the shell. Examples include the various
sendmail backdoors involving
In summary, use shell=False.
    >>> import subprocess
    >>> subprocess.call('echo $HOME')
    Traceback (most recent call last):
    ...
    OSError: [Errno 2] No such file or directory
    >>> subprocess.call('echo $HOME', shell=True)
    /user/khong
    0
Setting the shell argument to a true value causes subprocess to spawn an intermediate shell process and tell it to run the command. In other words, using an intermediate shell means that variables, glob patterns, and other special shell features in the command string are processed before the command is run. Without the shell, the whole string 'echo $HOME' is taken as the name of a program, which does not exist, hence the OSError. With the shell, $HOME was expanded before echo ran. That is a command that relies on shell expansion, whereas ls -l is a simple command that needs none.
source: Subprocess Module
An example where things could go wrong with shell=True is shown here
    >>> from subprocess import call
    >>> filename = input("What file would you like to display?\n")
    What file would you like to display?
    non_existent; rm -rf / # THIS WILL DELETE EVERYTHING IN ROOT PARTITION!!!
    >>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...
Check the doc here: subprocess.call()
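For contrast, here is a minimal sketch of the same situation without the shell. The hostile input is made harmless for the demo (it only tries to echo a marker), and the child is a throwaway Python one-liner that reports the arguments it receives; both are assumptions of mine, not part of the answer above.

```python
import subprocess
import sys

# Hypothetical hostile input, made harmless: the injected command would
# only echo a marker instead of deleting anything.
filename = "non_existent; echo INJECTED"

# A tiny child program that just prints the arguments it was given.
show_args = "import sys; print(sys.argv[1:])"

# With a list and no shell, the whole string arrives as ONE argument.
# The ';' has no special meaning because no shell ever parses it.
result = subprocess.run(
    [sys.executable, "-c", show_args, filename],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

The child sees a single argument containing the semicolon, so nothing after it is ever executed as a command.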
Executing programs through the shell means that all user input passed to the program is interpreted according to the syntax and semantic rules of the invoked shell. At best, this only causes inconvenience to the user, because the user has to obey these rules. For instance, paths containing special shell characters like quotation marks or blanks must be escaped. At worst, it causes security leaks, because the user can execute arbitrary programs.
shell=True is sometimes convenient to make use of specific shell features like word splitting or parameter expansion. However, if such a feature is required, use the other modules that are available to you (e.g.
os.path.expandvars() for parameter expansion or
shlex for word splitting). This means more work, but avoids other problems.
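A rough sketch of what that can look like, assuming a made-up variable DEMO_DIR and a made-up command string. Splitting first and expanding afterwards keeps an expanded value that contains spaces in a single argument.

```python
import os
import shlex

# Hypothetical variable, set here only so the example is self-contained.
os.environ["DEMO_DIR"] = "/tmp/demo dir"

# A command line as a user might type it for a shell.
raw = 'cat "$DEMO_DIR/notes.txt" --number'

# Word splitting with shlex, then parameter expansion per argument.
args = [os.path.expandvars(arg) for arg in shlex.split(raw)]
print(args)
```

The resulting list can be handed to subprocess without shell=True.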
In short: Avoid
shell=True by all means.
The other answers here adequately explain the security caveats which are also mentioned in the
subprocess documentation. But in addition to that, the overhead of starting a shell to start the program you want to run is often unnecessary and definitely silly for situations where you don't actually use any of the shell's functionality. Moreover, the additional hidden complexity should scare you, especially if you are not very familiar with the shell or the services it provides.
Where the interactions with the shell are nontrivial, you now require the reader and maintainer of the Python script (which may or may not be your future self) to understand both Python and shell script. Remember the Python motto "explicit is better than implicit"; even when the Python code is going to be somewhat more complex than the equivalent (and often very terse) shell script, you might be better off removing the shell and replacing the functionality with native Python constructs. Minimizing the work done in an external process and keeping control within your own code as far as possible is often a good idea simply because it improves visibility and reduces the risks of -- wanted or unwanted -- side effects.
Wildcard expansion, variable interpolation, and redirection are all simple to replace with native Python constructs. A complex shell pipeline where parts or all cannot be reasonably rewritten in Python would be the one situation where perhaps you could consider using the shell. You should still make sure you understand the performance and security implications.
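As a sketch of those three replacements, assuming a scratch directory with a few invented files and a throwaway Python one-liner standing in for cat:

```python
import glob
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create a few sample files (names are invented for the demo).
    for name in ("a.txt", "b.txt", "c.log"):
        with open(os.path.join(workdir, name), "w") as handle:
            handle.write(name + "\n")

    # Wildcard expansion: glob.glob() instead of letting the shell expand *.txt.
    txt_files = sorted(glob.glob(os.path.join(workdir, "*.txt")))

    # Variable interpolation: read os.environ instead of writing $HOME.
    home = os.environ.get("HOME", "")

    # Redirection: pass an open file as stdout instead of "> out.dat".
    out_path = os.path.join(workdir, "out.dat")
    concat = "import sys\nfor p in sys.argv[1:]: print(open(p).read(), end='')"
    with open(out_path, "w") as out_file:
        subprocess.run([sys.executable, "-c", concat, *txt_files],
                       stdout=out_file, check=True)

    # Read back what the child wrote into the redirected file.
    with open(out_path) as handle:
        redirected = handle.read()

print(redirected, end="")
```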
In the trivial case, to avoid shell=True, simply replace
    subprocess.Popen("command -with -options 'like this' and\\ an\\ argument", shell=True)
with
    subprocess.Popen(['command', '-with', '-options', 'like this', 'and an argument'])
Notice how the first argument is a list of strings to pass to
execvp(), and how quoting strings and backslash-escaping shell metacharacters is generally not necessary (or useful, or correct).
Maybe see also When to wrap quotes around a shell variable?
If you don't want to figure this out yourself, the
shlex.split() function can do this for you. It's part of the Python standard library, but of course, if your shell command string is static, you can just run it once, during development, and paste the result into your script.
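For instance, feeding the command string from the example above to shlex.split() gives the list form directly. This only splits the string; it does not run anything.

```python
import shlex

# The same shell command string as in the shell=True example.
shell_string = "command -with -options 'like this' and\\ an\\ argument"

# shlex.split() applies POSIX shell quoting rules: the single quotes and
# the backslash-escaped spaces each produce one argument.
argv = shlex.split(shell_string)
print(argv)
```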
As an aside, you very often want to avoid
Popen if one of the simpler wrappers in the
subprocess package does what you want. If you have a recent enough Python, you should probably use subprocess.run.
With check=True it will fail if the command you ran failed.
With stdout=subprocess.PIPE it will capture the command's output.
With text=True (or somewhat obscurely, with the synonym
universal_newlines=True) it will decode output into a proper Unicode string (it's just
bytes in the system encoding otherwise, on Python 3).
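A small sketch combining these three arguments. The child commands are throwaway Python one-liners chosen so the example runs anywhere; they are not from the question.

```python
import subprocess
import sys

# Capture decoded output and raise if the command fails.
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    check=True,
    stdout=subprocess.PIPE,
    text=True,
)
output = result.stdout  # a str, because text=True

# With check=True a non-zero exit status raises CalledProcessError.
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as error:
    failed_code = error.returncode

print(repr(output), failed_code)
```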
I'll close with a quote from David Korn: "It's easier to write a portable shell than a portable shell script." Even
subprocess.run('echo "$HOME"', shell=True) is not portable to Windows.
The answers above explain it correctly, but not directly enough.
Let's use the ps command to see what happens.
    import time
    import subprocess

    s = subprocess.Popen(["sleep 100"], shell=True)
    print("start")
    print(s.pid)
    time.sleep(5)
    s.kill()
    print("finish")
Run it, and it shows
    start
    832758
    finish
You can then use
ps -auxf > 1 before
finish, and then
ps -auxf > 2 after
finish. Here is the output
    cy   71209  0.0  0.0   9184  4580 pts/6  Ss   Oct20  0:00  |   \_ /bin/bash
    cy  832757  0.2  0.0  13324  9600 pts/6  S+   19:31  0:00  |   |   \_ python /home/cy/Desktop/test.py
    cy  832758  0.0  0.0   2616   612 pts/6  S+   19:31  0:00  |   |       \_ /bin/sh -c sleep 100
    cy  832759  0.0  0.0   5448   532 pts/6  S+   19:31  0:00  |   |           \_ sleep 100
See? Instead of directly running
sleep 100, it actually runs
/bin/sh, and the
pid it prints out is actually that of
/bin/sh. If you then call
s.kill(), it kills
/bin/sh, but sleep is still there.
    cy   69369  0.0  0.0 533764  8160 ?      Ssl  Oct20  0:12  \_ /usr/libexec/xdg-desktop-portal
    cy   69411  0.0  0.0 491652 14856 ?      Ssl  Oct20  0:04  \_ /usr/libexec/xdg-desktop-portal-gtk
    cy  832646  0.0  0.0   5448   596 pts/6  S    19:30  0:00  \_ sleep 100
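One way around the orphaned process, sketched under the assumption that you do not need any shell features: start the program directly with a list, so the pid you hold is the program itself. A Python one-liner stands in for sleep 100 here to keep the example portable.

```python
import subprocess
import sys
import time

# Stand-in for "sleep 100", started directly rather than via /bin/sh.
sleeper = [sys.executable, "-c", "import time; time.sleep(100)"]

proc = subprocess.Popen(sleeper)
time.sleep(0.5)

# proc.pid is the sleeping process itself, so kill() reaches it.
proc.kill()
proc.wait(timeout=10)

# Non-zero: the process was killed instead of finishing normally.
print(proc.returncode)
```

Since there is no intermediate shell, nothing is left running after the kill.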
So the next question is: what can
/bin/sh do? Every Linux user knows it, has heard of it, and uses it. But I bet many people don't really understand what a
shell actually is. Maybe you have also heard of
/bin/bash; they are similar.
One obvious function of the shell is to make it convenient for users to run Linux applications. Because of a shell program like
bash, you can directly use a command like
ls rather than
/usr/bin/ls. It will search for where
ls is and run it for you.
Another function is that it interprets the string after
$ as an environment variable. You can compare these two Python scripts to find out for yourself.
    subprocess.call(["echo $PATH"], shell=True)
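Here is one way to run that comparison and capture both results, assuming a POSIX system with an echo binary on the PATH. DEMO_VAR is a made-up variable, used instead of $PATH so the output is predictable.

```python
import os
import subprocess

# Copy of the current environment plus one hypothetical variable.
env = dict(os.environ, DEMO_VAR="expanded by the shell")

# Through the shell: /bin/sh expands $DEMO_VAR before echo runs.
with_shell = subprocess.run(
    "echo $DEMO_VAR", shell=True, env=env,
    stdout=subprocess.PIPE, text=True,
)

# Without the shell: echo receives the literal text "$DEMO_VAR".
without_shell = subprocess.run(
    ["echo", "$DEMO_VAR"], env=env,
    stdout=subprocess.PIPE, text=True,
)

print(with_shell.stdout.strip())
print(without_shell.stdout.strip())
```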
And most important, it makes it possible to run Linux commands as a script. Constructs such as
if and else are introduced by the shell; they are not native Linux commands.
Let's assume you are using shell=False and providing the command as a list, and some malicious user tries to inject an 'rm' command. You will see that 'rm' is interpreted as an argument, and 'ls' will simply try to find a file called 'rm'.
    >>> subprocess.run(['ls','-ld','/home','rm','/etc/passwd'])
    ls: rm: No such file or directory
    -rw-r--r-- 1 root root 1172 May 28 2020 /etc/passwd
    drwxr-xr-x 2 root root 4096 May 29 2020 /home
    CompletedProcess(args=['ls', '-ld', '/home', 'rm', '/etc/passwd'], returncode=1)
shell=False is not secure by default if you don't control the input properly. You can still execute dangerous commands.
    >>> subprocess.run(['rm','-rf','/home'])
    CompletedProcess(args=['rm', '-rf', '/home'], returncode=0)
    >>> subprocess.run(['ls','-ld','/home'])
    ls: /home: No such file or directory
    CompletedProcess(args=['ls', '-ld', '/home'], returncode=1)
    >>>
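One possible way to control the input, as a rough sketch only: check that a user-supplied path stays inside a directory you allow before passing it to any command. The helper names is_inside and list_entry and the example directory are hypothetical, and a real application may need stricter rules.

```python
import os
import subprocess


def is_inside(base_dir, user_path):
    """Return True if user_path resolves to a location inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    return os.path.commonpath([base, target]) == base


def list_entry(base_dir, user_path):
    """Run ls -ld on user_path, but only if it stays inside base_dir."""
    if not is_inside(base_dir, user_path):
        raise ValueError("path escapes the allowed directory")
    # "--" marks the end of options, so a name starting with "-" is
    # treated as a file name rather than a flag.
    return subprocess.run(
        ["ls", "-ld", "--", os.path.join(base_dir, user_path)],
        stdout=subprocess.PIPE, text=True,
    )


print(is_inside("/srv/data", "reports/a.txt"))
print(is_inside("/srv/data", "../../etc/passwd"))
```

The list form keeps the shell out of the picture; the check decides whether the argument is acceptable at all.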
I write most of my applications in container environments, I know which shell is being invoked, and I am not taking any user input.
So in my use case, I see no security risk, and it is much easier to create a long string of commands. I hope I am not wrong.
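If a variable value ever does end up inside such a command string, shlex.quote() from the standard library can keep it from being parsed as shell syntax. A minimal sketch, assuming a POSIX shell and a harmless made-up payload:

```python
import shlex
import subprocess

# Hypothetical value that would inject a second command if left unquoted.
untrusted = "x; echo INJECTED"

# shlex.quote() wraps the value so the shell treats it as one word.
command = "echo " + shlex.quote(untrusted)

result = subprocess.run(command, shell=True,
                        stdout=subprocess.PIPE, text=True)
print(command)
print(result.stdout, end="")
```

The shell prints the value as plain text instead of running the part after the semicolon.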