I've been incredibly impressed with FreeBSD's shquote(3) function as a singular way to prevent command injection in a server process. I'm currently working on a RESTful API (aren't we all), backed in part by a server daemon written in C. The largest threat to any networked API is user input. A server developer must assume that all input is malicious until proven valid. There are many types of injection attacks against web services; I've been focused on Unix command injection today.
Let's say you're building a service which takes a directory path as input and creates this directory path on the server's file system. In C we can just call:
void mkdir_recursive(char* path) {
    char* cmd;
    /* UNSAFE: path is pasted into the shell command as-is */
    asprintf(&cmd, "mkdir -p %s", path);
    system(cmd);
    free(cmd);
}
Now wait! Using system() is bad! It's prone to command injection attacks, and there's usually a programmatic API to do the same thing in a safer, cleaner way.
I agree, but I am interested in exploring command injection, and mkdir -p is so convenient for recursive directory creation. There's no easy libc alternative, and writing it myself would take 10x as much code. Yes, I'm complaining about 3 lines vs 30, but Larry Wall says laziness is one of a programmer's greatest virtues.
So normally to protect this simple code from command injection we'd validate the input for shell meta-characters. We must check for ; & ( ) ` | > < $. These could allow an attacker to append an arbitrary command onto our simple mkdir, read or write files, and access environment variables. For instance we could receive a path which would make the above command look like this:
mkdir -p a/new/dir ; cat /etc/passwd > /usr/local/apache/htdocs/passwd
Left unfiltered, this would publish the server's local user list in a publicly accessible directory. To make matters worse, in my particular use case all of the above characters are valid in Unix directory paths, so simply removing them from the user's input would be incorrect behavior.
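To make the trade-off concrete, a deny-list check along these lines might look like the sketch below. The function name has_shell_metachar is hypothetical, and the character set is just the list given above; it is not exhaustive (quotes, backslashes, newlines and wildcards matter to a shell too).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if s contains any of the shell
 * meta-characters listed in the text ( ; & ( ) ` | > < $ ). */
static bool has_shell_metachar(const char *s) {
    assert(s != NULL);
    return strpbrk(s, ";&()`|><$") != NULL;
}
```

A check like this flags the malicious path from the example, but it also flags a perfectly legal directory name such as "R&D", which is why rejecting or stripping these characters does not fit this use case.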
In comes shquote(). It protects against all of these command injections and allows these characters to be used in path names, because it puts everything in strong quotes. With a refactor of our above code snippet:
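void mkdir_recursive(char* path) {
    char *cmd;
    /* With a NULL buffer, shquote() only reports the quoted length */
    size_t len = shquote(path, NULL, 0) + 1;
    char *safe_path = malloc(len);
    if (safe_path == NULL)
        return;
    shquote(path, safe_path, len);
    /* Interpolate the quoted copy, not the raw input */
    if (asprintf(&cmd, "mkdir -p %s", safe_path) >= 0) {
        system(cmd);
        free(cmd);
    }
    free(safe_path);
}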
The executed command now looks like this:
mkdir -p 'a/new/dir ; cat /etc/passwd > /usr/local/apache/htdocs/passwd'
This safely creates 10 nested directories instead of running 2 commands.
# while [ "`ls`" ]; do ls -F && cd "`ls`"; done
a/
new/
dir ; cat /
etc/
passwd > /
usr/
local/
apache/
htdocs/
passwd/
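For readers without shquote() at hand, the strong-quoting rule itself can be sketched in portable C. This is an illustration of the rule, not the libc implementation: the name sq_quote and its malloc-based interface are hypothetical. Everything is wrapped in single quotes, and the only character needing special treatment is an embedded single quote, which is written as '\'' (close the quote, emit an escaped quote, reopen the quote).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not the libc shquote(): returns a malloc'd copy
 * of arg wrapped in single quotes, with each embedded ' written as '\''.
 * The caller frees the result. Returns NULL if allocation fails. */
static char *sq_quote(const char *arg) {
    assert(arg != NULL);

    size_t n = 2;                       /* opening and closing quote */
    for (const char *p = arg; *p != '\0'; p++)
        n += (*p == '\'') ? 4 : 1;      /* '\'' is four bytes */

    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    *q++ = '\'';
    for (const char *p = arg; *p != '\0'; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);      /* close, escaped quote, reopen */
            q += 4;
        } else {
            *q++ = *p;                  /* everything else is literal */
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```

Inside single quotes the shell treats every byte literally, including backslashes, so no other character needs escaping.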
Surely we can still trick shquote() with embedded \ or ' characters, right? Let's try adding intermediate quotes to break up the two commands:
a/new/dir' ; 'cat /etc/passwd > /usr/local/apache/htdocs/passwd
becomes
mkdir -p 'a/new/dir'\'' ; '\''cat /etc/passwd > /usr/local/apache/htdocs/passwd'
# while [ "`ls`" ]; do ls -F && cd "`ls`"; done
a/
new/
dir' ; 'cat /
etc/
...
That didn't give us an exploit. So let's try escaping those intermediate quotes:
a/new/dir\' ; \'cat /etc/passwd > /usr/local/apache/htdocs/passwd
becomes
mkdir -p 'a/new/dir\'\'' ; \'\''cat /etc/passwd > /usr/local/apache/htdocs/passwd'
# while [ "`ls`" ]; do ls -F && cd "`ls`"; done
a/
new/
dir\' ; \'cat /
etc/
...
My simple function handles them both correctly with successive strong quote wrappings. It still isn't protected from other attacks like path traversal trickery using embedded .. but shquote() goes a long way in sanitizing input. I couldn't think of a way to break it. Please tell me if you can.
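The path traversal gap needs its own check, since quoting leaves .. untouched. One possible sketch, assuming the policy is simply to refuse any path with a .. component (the name has_dotdot_component is hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if any '/'-separated component of path is
 * exactly "..". Names that merely contain dots (e.g. "a..b") pass. */
static bool has_dotdot_component(const char *path) {
    assert(path != NULL);

    const char *p = path;
    while (*p != '\0') {
        size_t n = strcspn(p, "/");     /* length of this component */
        if (n == 2 && p[0] == '.' && p[1] == '.')
            return true;
        p += n;
        if (*p == '/')
            p++;                        /* step over the separator */
    }
    return false;
}
```

Quoting also does not stop a path that begins with - from being read by mkdir as an option; ending the option list with -- (mkdir -p -- followed by the quoted path) is the usual guard against that.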