Running Perl on Unix

Unix permissions for programs, solutions to common problems running Perl programs on Unix systems (such as "no such file or directory" errors for files that do exist), and ruminations on why I dislike /usr/bin/env perl.

Unix Basics

First, ensure that the file is executable, and that it can be read by the necessary user and group (or by everyone on the system). parsepath is a good example: it can help debug permission problems, though it must itself be marked executable (and readable, as perl must be able to read the contents of the file):

$ ls -l parsepath
-rw-rw-r-- 1 jmates jmates 13949 Apr 26 2009 parsepath
$ chmod +rx parsepath
$ ls -l parsepath
-rwxrwxr-x 1 jmates jmates 13949 Apr 26 2009 parsepath

chmod(1) also accepts formats such as 0755. Never set modes of 0777 or o+w. Software may (rightfully) refuse to run should it detect global write permissions.

Executables should be grouped into various directories, and these directories added to the PATH environment variable. Consult the documentation for the shell or software being used to see how this is done. Never add . to the PATH, as this can lead to security problems. Instead, always use ./parsepath when running the program in a local directory, or parsepath when running it from a PATH directory:

$ which parsepath
/Users/jmates/bin/parsepath
$ parsepath .
- /Users/jmates/bin
d 0755 root:admin /
d 0755 root:admin /Users
d 0750 jmates:jmates /Users/jmates
d 0750 jmates:jmates /Users/jmates/bin
$ cd ~/bin
$ ./parsepath user=nobody +rx parsepath
! unix-other +rx fails: d 0750 jmates:jmates /Users/jmates
! unix-other +rx fails: d 0750 jmates:jmates /Users/jmates/bin

The second invocation of parsepath illustrates why the nobody user would not be able to run /Users/jmates/bin/parsepath: directory permissions deny access to parsepath, even if parsepath itself allows access. This is why each and every parent directory must also be checked when debugging permissions problems (and why I wrote parsepath to help automate these tedious checks).
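
The parent-directory walk can be approximated by hand with a small shell function. This is only a minimal sketch and not how parsepath itself works: walk_parents is a name invented here, and it merely lists permissions with ls -ld, leaving the interpretation to the reader.

```shell
# walk_parents (hypothetical helper): show the permissions of a path and
# of every parent directory up to /.
walk_parents() {
    case $1 in
        /*) p=$1 ;;          # already absolute
        *)  p=$PWD/$1 ;;     # make a relative path absolute
    esac
    while :; do
        ls -ld "$p"
        if [ "$p" = / ]; then
            break
        fi
        p=$(dirname "$p")
    done
}
```

Running walk_parents ~/bin/parsepath prints one ls -ld line per directory level; any level lacking the needed r or x bit for the user in question is a candidate for the failure.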

Bad Interpreter

When a file is executed, Unix reads its first line and, if that line begins with #! (the shebang), runs the interpreter named there and hands it the file. If this line is broken due to a typo or invisible characters, the program will not run. Note that the error below mentions the file being run, not the typo of the bin directory name: this is a common source of confusion, as the error actually stems from the file path listed on the shebang line:

$ ls -l broken
-rwxrwxr-x 1 jmates jmates 40 15 Oct 11:12 broken
$ cat broken
#!/usr/bni/perl -l
print "Hello World";
$ ./broken
zsh: no such file or directory: ./broken

To check the interpreter path, copy and paste the path listed on the shebang line to see if the file can be found. Using copy and paste is important, as humans can unconsciously correct the problem when manually typing out what they think the shebang line shows.

$ head -1 broken
#!/usr/bni/perl -l
$ file /usr/bni/perl
/usr/bni/perl: Can't stat `/usr/bni/perl' (No such file or directory)
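
The same check can be scripted, so that no human retyping is involved. The following is a minimal sketch; shebang_interp and check_shebang are names invented here, and the check only tests that the path exists and is executable, not that it is a suitable perl.

```shell
# shebang_interp (hypothetical helper): print the interpreter path named
# on the first line of a file, or fail if there is no "#!" line.
shebang_interp() {
    line=$(head -n 1 "$1")
    case $line in
        '#!'*) ;;
        *) return 1 ;;
    esac
    line=${line#??}          # drop the leading "#!"
    set -f                   # no globbing while splitting into words
    set -- $line             # the first word is the interpreter path
    set +f
    printf '%s\n' "$1"
}

# check_shebang (hypothetical helper): report whether that path can be
# executed. Returns 0 if so, 1 if not, 2 if there is no shebang line.
check_shebang() {
    if ! interp=$(shebang_interp "$1"); then
        echo "$1: no shebang line"
        return 2
    fi
    if [ -x "$interp" ]; then
        echo "$1: $interp ok"
    else
        echo "$1: $interp is missing or not executable"
        return 1
    fi
}
```

For the broken file above, check_shebang broken reports that /usr/bni/perl is missing or not executable. A trailing carriage return is kept as part of the path, so piping the output through cat -v will reveal it as ^M.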

Invisible Characters

If the path to perl listed in the shebang line exists, next check for invisible characters. These characters could be added by accident, or will exist if the file uses a linefeed format different from the one Unix requires. Learn the hexdump(1), od(1), or xxd(1) commands, as they are handy for debugging. Using a hex viewer is critical, as it will show the actual bytes in the file. Also consult ascii(7) for a list of character codes, and learn whether the hex viewer displays values in hex, decimal, or octal.

Text files on Unix may also need to be converted to ASCII or, if they are UTF-8, have any Byte Order Mark (BOM, invisible characters again) removed, as otherwise Unix may not be able to parse the shebang line.
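
A UTF-8 BOM is the three bytes EF BB BF at the very start of the file, ahead of the #!. As a minimal sketch (has_bom and strip_bom are names invented here), od(1) can test for it and tail(1) can skip it:

```shell
# has_bom (hypothetical helper): succeed if the file starts with the
# three-byte UTF-8 Byte Order Mark, EF BB BF.
has_bom() {
    [ "$(od -An -tx1 -N 3 "$1" | tr -d ' \t\n')" = efbbbf ]
}

# strip_bom (hypothetical helper): write the file to standard output
# without its first three bytes. Only call this when has_bom succeeds.
strip_bom() {
    tail -c +4 "$1"
}
```

A cautious use would be has_bom file.pl && strip_bom file.pl > file.pl.new, followed by inspecting the new file before replacing the original.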

DOS (Internet) Linefeeds

Internet linefeeds are \r\n, as used by DOS or Windows systems. However, Unix requires \n linefeeds.

$ head -1 dos.pl | od -c
0000000 # ! / u s r / b i n / p e r l \r
0000020 \n
0000021
$ cat -v dos.pl
#!/usr/bin/perl^M
print "Hello World\n";^M
$ xxd dos.pl
0000000: 2321 2f75 7372 2f62 696e 2f70 6572 6c0d #!/usr/bin/perl.
0000010: 0a70 7269 6e74 2022 4865 6c6c 6f20 576f .print "Hello Wo
0000020: 726c 645c 6e22 3b0d 0a rld\n";..
$ perl dos.pl
Hello World
$ chmod +rx dos.pl
$ ./dos.pl
zsh: ./dos.pl: bad interpreter: /usr/bin/perl^M: no such file or directory

Unix reads every character until a \n is found. If the program uses \r\n for linefeeds, Unix will include the \r, and search for /usr/bin/perl\r, which does not exist. The fix is to extirpate any invisible characters from the shebang line, or convert the file to use Unix linefeeds.

Legacy Mac OS Linefeeds

Legacy Mac OS (Mac OS 9 and earlier) used \r for the linefeed, instead of the Internet standard \r\n or the Unix \n. Unix will read the entire file, find no newline, and either consider the program an executable, or treat the entire program as the shebang line. Both cases fail, one silently, the other with an unusual error message:

$ perl macos.pl
$ chmod +rx macos.pl
$ ./macos.pl
zsh: exec format error: ./macos.pl
$ xxd macos.pl
0000000: 2321 2f75 7372 2f62 696e 2f70 6572 6c0d #!/usr/bin/perl.
0000010: 7072 696e 7420 2248 656c 6c6f 2057 6f72 print "Hello Wor
0000020: 6c64 5c6e 223b 0d ld\n";.

perl macos.pl is silent, as perl encounters only a single line in the file, and that line is a comment. Adding the execute and read bits and then executing ./macos.pl will reveal the problem, as will inspecting the program with a hex viewer. Hex Fiend is a nice graphical hex editor for Mac OS X.
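
Counting the \r and \n bytes in a file gives a quick guess at which linefeed format it uses, without reading a hex dump. This is a heuristic sketch only: linefeed_style is a name invented here, and equal counts of \r and \n are taken to mean \r\n pairs without checking that each \r actually precedes a \n.

```shell
# linefeed_style (hypothetical helper): guess the linefeed format of a
# file from the number of CR (\r) and LF (\n) bytes it contains.
linefeed_style() {
    # wc -c may pad its output with spaces on some systems; strip them
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    if [ "$cr" -eq 0 ]; then
        echo unix            # no CR at all (also reported for empty files)
    elif [ "$lf" -eq 0 ]; then
        echo mac             # CR only: legacy Mac OS
    elif [ "$cr" -eq "$lf" ]; then
        echo dos             # as many CR as LF: most likely \r\n pairs
    else
        echo mixed
    fi
}
```

Given the hex dumps shown earlier, linefeed_style dos.pl should report dos and linefeed_style macos.pl should report mac.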

Converting Text Formats

If the file uses a non-Unix linefeed format, tools such as dos2unix or BBEdit can convert line breaks between the different formats used by Unix, Windows, and legacy Mac OS applications. perl can also do the conversion in place; the first command below converts legacy Mac OS linefeeds, the second DOS linefeeds:

$ perl -i -pe 's/\r/\n/g' macos.pl
$ perl -i -pe 'tr/\r//d' dos.pl

Security & Sanity

If possible, ownership of the software should be changed to root, and the write bits removed via chmod ugo-w. However, denying write access by no means prevents a human from making mistakes: “oh, I thought I was root on the development box, not production.” Larger sites should therefore enforce permissions on executables via configuration management, so that if someone accidentally removes execute permissions, the incorrect permissions will eventually be detected, reported, and restored to the correct ones. Changing the ownership to root also adds security, as a malicious user or attacker who has gained control of a role account will be unable to directly and easily change the software.
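
Short of full configuration management, the detection half can be sketched with find(1). The function below is an illustration under stated assumptions rather than a replacement for such a system: loose_write_bits is a name invented here, and it only reports regular files that group or other may write to, without checking ownership or the owner write bit.

```shell
# loose_write_bits (hypothetical helper): list regular files under a
# directory that are writable by group or by other.
loose_write_bits() {
    find "$1" -type f \( -perm -020 -o -perm -002 \)
}
```

Anything listed by, for example, loose_write_bits ~/bin deserves a second look.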

Ruminations on the Benefits and Pitfalls of /usr/bin/env

I do not consider #!/usr/bin/env perl a better way to find perl than #!/usr/bin/perl and other direct invocations, just another sometimes useful, sometimes flawed method. /usr/bin/env is both optimistic and lazy, a hope that some perl exists, somewhere, without confirmation that the perl found is actually suitable for the program.

Optimistic, as it hopes that a correct version of perl (and related libraries and modules) always appears first in the PATH environment variable, that no user who will ever run the program on any system will ever have an incompatible version of perl installed (or, again, related libraries and modules), that no user has tweaked their PATH to list an incorrect version of perl first, and that env works at all (it did not on Mac OS X 10.2).

Lazy, as due diligence ensuring that a perl with the proper libraries and modules and configuration is available in a consistent and reproducible manner has perhaps not been performed: and how would one know, when a random perl is being pulled out of PATH? Any user need only, at any time, change their PATH slightly, or install a custom perl somewhere, and the program will at best fail noisily, or at worst silently use some different module version to generate data incompatible with other invocations of the program, to who knows what ill effect.
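
The PATH dependence is easy to demonstrate. The sketch below plants a fake perl (a /bin/sh script, invented purely for this demonstration) in a temporary directory and puts that directory first in PATH for a single command:

```shell
# Build a throwaway directory holding a fake "perl" (really a sh script).
demo_dir=$(mktemp -d)
cat > "$demo_dir/perl" <<'EOF'
#!/bin/sh
echo "not the perl you wanted"
EOF
chmod +x "$demo_dir/perl"

# env searches PATH in order, so the fake perl wins over any real one.
PATH="$demo_dir:$PATH" env perl -e 'print "the real perl\n"'
```

The last command prints "not the perl you wanted": the command line is unchanged, yet a different program answered. Remove the directory afterwards with rm -r "$demo_dir".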

I simply set #!/usr/bin/perl, as this suits the systems I currently administer. If a different site required a different approach, I would use configuration management to adjust the shebang as appropriate, or a software depot. I would not throw caution to the wind and hope that the vendor env finds something, as this strikes me as a great way to create hard-to-debug problems. On the other hand, App::perlbrew and a standardized way of managing the various Perl installations and various environment variables could make good use of env to select between different versions of Perl.

Software depots (see The Practice of System and Network Administration for a discussion of these) organize sets of known versions of software into unique environments, where a depot version of env (most definitely not the vendor /usr/bin/env) can determine the proper path, LD_LIBRARY_PATH, @INC, and so forth. This allows multiple versions of perl and other software, with versions appropriate to the program, to exist on the same system, with lessened risk that the wrong environment and therefore the wrong perl would be used to run the wrong program. A user could still manually run program A from environment A with perl B from environment B, but that would be a manual action. The custom software depot env wrapper, without any deliberate misdirection by the user, would only run program A with perl A under environment A, or fail if unsure as to what environment must be used.
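
To make the "fail if unsure" behavior concrete, here is a purely hypothetical sketch of the lookup such a wrapper might perform. Everything in it is an assumption invented for illustration and not taken from any real depot system: the depot_perl name, the DEPOT_ROOT variable, the /depot default, and the .depot-env marker file.

```shell
# depot_perl (hypothetical helper): print the perl a program should run
# under, or fail when no environment is recorded for it.
# Assumed layout, invented for this sketch:
#   <program dir>/.depot-env     names the environment, e.g. "A"
#   $DEPOT_ROOT/<env>/bin/perl   the perl belonging to that environment
depot_perl() {
    envfile=$(dirname "$1")/.depot-env
    if [ ! -r "$envfile" ]; then
        echo "depot: no environment recorded for $1" >&2
        return 1
    fi
    envname=$(head -n 1 "$envfile")
    perl=${DEPOT_ROOT:-/depot}/$envname/bin/perl
    if [ ! -x "$perl" ]; then
        echo "depot: environment $envname has no usable perl" >&2
        return 1
    fi
    printf '%s\n' "$perl"
}
```

The point of the design is that the wrapper never falls back to whatever perl happens to be in PATH: an unknown program or an incomplete environment is an error, not a guess.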