Debugging File::Find Problems

The Perl File::Find module is commonly used to traverse directories. Its default behavior of changing into each directory it visits may cause unexpected results, notably when a relative rather than an absolute starting path is supplied. Debugging these problems can be difficult for the inexperienced. The buggy-find script illustrates this case:

$ cd /tmp
$ mktemp -d test.XXXXXXX
test.BRQnCls
$ perl buggy-find test.BRQnCls
DBG: dir=test.BRQnCls, file=test.BRQnCls (.)
could not stat test.BRQnCls: No such file or directory
$ perl buggy-find `pwd`/test.BRQnCls
DBG: dir=/tmp/test.BRQnCls, file=/tmp/test.BRQnCls (.)

Passing the relative directory name test.BRQnCls to the script fails, while the same name fully qualified works. Note that "fails" and "works" are poor terms: they are vague, and communicate nothing about the problem. When asking for help, provide full source code and a description of the problem, such as "passing a relative path causes the script to not be able to stat a directory that does exist, while using an absolute path succeeds". A full description may provide clues, or at least eliminate things to check: test.BRQnCls certainly exists, yet for some reason cannot be found when a relative path name is supplied.

The (invisible) problem in this script is due to the default behavior of File::Find, coupled with a misuse of the variables File::Find offers. The script's debug output also fails to reveal the problem, because it never prints the current working directory; this makes the issue hard to understand for someone who does not know, or has forgotten, that the working directory changes during the traversal. In both runs, File::Find has issued a chdir call into the test.BRQnCls directory before calling the script's code. Starting with an absolute path, $File::Find::name holds /tmp/test.BRQnCls, which is valid from any working directory. With a relative starting path, $File::Find::name instead holds test.BRQnCls, a name relative to the directory the script started in, not the one it has already moved into. Therefore, the script cannot find that directory, because it is already in that directory. The stat would only succeed if that directory contained a subdirectory with an identical name:

$ mkdir -p test/test/test
$ perl buggy-find test
debug: dir=test, file=test (.)
debug: dir=test, file=test/test (test)
debug: dir=test/test, file=test/test/test (test)
could not stat test/test/test: No such file or directory

At each level, the relative name is resolved against the directory File::Find has moved into, so the script, when considering test, is actually performing the stat on test/test, and so forth. This could be revealed by also having the script print the current working directory via the Cwd module as part of the debug information.
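
The source of buggy-find is not reproduced here, so the following is only a hypothetical sketch of the failing pattern, with the current working directory added to the debug line. The name trace_find and the exact format of the debug line are assumptions made for illustration.

```perl
#!/usr/bin/env perl
# Hypothetical reconstruction of the failing pattern; the real buggy-find
# source is not shown in this article.
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find qw(find);

# Walk the given starting directories and return one debug line per
# directory visited, including the current working directory.
sub trace_find {
    my @start = @_;
    my @report;
    find(
        sub {
            return unless -d $_;    # $_ is relative to the current directory
            # Deliberately repeat the mistake: stat the name relative to the
            # starting point even though File::Find has already called chdir.
            my @st     = stat($File::Find::name);
            my $status = @st ? 'ok' : "failed: $!";
            push @report,
                sprintf "debug: cwd=%s dir=%s file=%s (%s) stat %s",
                getcwd(), $File::Find::dir, $File::Find::name, $_, $status;
        },
        @start
    );
    return @report;
}

# Print the trace for any directories named on the command line.
if (@ARGV) {
    print "$_\n" for trace_find(@ARGV);
}
```

With the working directory in the output, a relative file= value can be read against cwd= directly, which shows which path the stat call was really applied to.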

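The solution is to stat $_ instead of $File::Find::name, as $_ is relative to the current working directory: it is set to . when considering the directory the script has just issued a chdir into, and to the bare file name otherwise. $File::Find::name can still be used, but only for display purposes. Alternatively, File::Find has a no_chdir option that disables the chdir calls; with it set, $_ holds the same value as $File::Find::name, and that value stays valid relative to the directory the script started in.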
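
Both approaches can be sketched as follows. The function names dirs_via_underscore and dirs_via_no_chdir are hypothetical, and collecting directory names is only a stand-in for whatever the real script does with each entry.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Default mode: File::Find has already changed directory, so test $_ and
# keep $File::Find::name for messages and results only.
sub dirs_via_underscore {
    my @start = @_;
    my @found;
    find(
        sub {
            stat($_) or die "could not stat $File::Find::name: $!\n";
            push @found, $File::Find::name if -d _;    # reuse the stat result
        },
        @start
    );
    return @found;
}

# no_chdir mode: the working directory never changes, so the path in
# $File::Find::name (and in $_, which holds the same value) can be used.
sub dirs_via_no_chdir {
    my @start = @_;
    my @found;
    find(
        {
            no_chdir => 1,
            wanted   => sub {
                stat($File::Find::name)
                    or die "could not stat $File::Find::name: $!\n";
                push @found, $File::Find::name if -d _;
            },
        },
        @start
    );
    return @found;
}
```

Run against the test/test/test tree from the earlier example with the relative starting path test, both functions should report all three directories without a stat failure.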

Another Gotcha & Alternative

Note that symbolic links may also complicate matters, depending on when they are encountered, and what they point to:

$ readlink /tmp
private/tmp
$ ls -ld /tmp
lrwxr-xr-x@ 1 root wheel 11 Aug 14 2009 /tmp -> private/tmp
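
One way to avoid surprises from a symbolic link in the starting path is to resolve it before the traversal begins. The sketch below assumes that working with the resolved path is acceptable for the script in question; find_resolved is a hypothetical helper, not part of File::Find.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Find qw(find);

# Hypothetical helper: resolve symbolic links in the starting path, then
# traverse from the resolved location. Returns the resolved path followed
# by every name File::Find reported.
sub find_resolved {
    my ($start) = @_;
    my $real = abs_path($start);
    die "could not resolve $start\n" unless defined $real;
    my @names;
    find( sub { push @names, $File::Find::name }, $real );
    return ( $real, @names );
}
```

The names returned are based on the resolved path, so on the system shown above a traversal started at /tmp would report names under the directory the link points to rather than under /tmp.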

As an alternative to File::Find, consider File::Find::Rule.