Bug Blog: execing a shell

Posted: 2010-04-01 22:30:50, Updated: 2010-04-25 05:35:45

I just spent an hour tracking down a bug involving the exec syscall. I was trying to implement a "shell" function for my programming language project. Something like this:

exitcode = shell("ls -l");

On UNIX-like systems, this involves forking off a child process that will exec the command. The parent process then waits for the child to finish and returns the exit status. Simple.
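
For illustration, here is a minimal sketch of that fork/exec/wait pattern in Python, whose os module wraps the underlying syscalls fairly directly. The helper name run is made up for this example, and it assumes the command has already been split into an argument list.

```python
import os

def run(argv):
    """Fork, exec argv in the child, and return the child's exit code.

    'run' is a hypothetical helper name used only for this sketch.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. execvp searches PATH for argv[0].
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed; 127 is the shell's convention for "command not found".
            os._exit(127)
    # Parent: block until the child terminates, then decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Child was killed by a signal; report it as a negative number.
    return -os.WTERMSIG(status)
```

Calling run(["ls", "-l"]) would list the current directory and return the exit status of ls. Note that the caller has to supply the argument list already split up, which is exactly the catch described next.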

However, unlike CreateProcess under Windows, which takes the command line as a single string, the UNIX-style exec syscall requires the caller to split the command line into arguments for the new image. Since I want the user of my shell function to be able to pass anything they could type on the command line, parsing the command line arguments is not a trivial matter.
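
To give a feel for why the parsing is hard, here is a small Python sketch. The command string is an invented example. Splitting on whitespace breaks quoted arguments, and even a quote-aware splitter such as the standard library's shlex.split leaves the redirection as ordinary words that only a real shell would act on.

```python
import shlex

# An invented command line with a quoted argument and a redirection.
command = "echo 'hello world' > out.txt"

# Naive whitespace splitting tears the quoted argument apart.
naive = command.split()
# naive == ["echo", "'hello", "world'", ">", "out.txt"]

# shlex.split understands quoting, but ">" and "out.txt" are still just
# plain arguments: nothing here performs the redirection.
quoted = shlex.split(command)
# quoted == ["echo", "hello world", ">", "out.txt"]
```

Pipes, globs, and variable expansion have the same problem, so handling "anything the user could type" really means reimplementing a shell.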

The bug

Not to worry though, I can use the sh command to do the parsing for me. I simply transform the argument to the shell function into "/bin/sh -c '<command>'". Now the problem is reduced to execing /bin/sh with two arguments.

But it didn't work. I got mysterious errors of the form "-c: can't open <command>".

Why?

The execve syscall (the low-level exec syscall on Linux) takes as one of its parameters an array of arguments for the new image. I was passing the array ["-c", "<command>"]. Wrong!

The command line arguments array on UNIX always starts with the name under which the binary was invoked, as any C programmer will tell you. So the shell was being invoked under the name "-c" with the single argument "<command>", which it tried to interpret as the name of a shell script. Hence the error "-c: can't open <command>". The "-c" shell couldn't run the "shell script" called "<command>".
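
Here is a sketch of the broken and the corrected argument arrays side by side, again in Python rather than my own language. The names run_sh, shell_buggy and shell are made up for this example, and it assumes /bin/sh exists. The only difference between the two versions is the extra first element of the array.

```python
import os

def run_sh(argv):
    """Fork, exec /bin/sh with exactly this argv, and return the exit code."""
    pid = os.fork()
    if pid == 0:
        try:
            # The first parameter is the file to execute; argv is passed
            # to the new image verbatim, including argv[0].
            os.execv("/bin/sh", argv)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)

def shell_buggy(command):
    # Wrong: "-c" lands in argv[0] and is taken as the program name, so sh
    # sees command as argv[1] and tries to open it as a script file.
    return run_sh(["-c", command])

def shell(command):
    # Right: argv[0] is the program name; the real arguments follow it.
    return run_sh(["sh", "-c", command])
```

With this, shell("exit 7") returns 7, while shell_buggy("exit 7") should fail with an error along the lines of the one above, since there is no script file named "exit 7" to open.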