Welcome

My name is Jason and I am a software developer in the Bay Area. For fun I like to hike, run, shoot photos, and take advantage of the many other activities California has to offer.

Sunday, April 21, 2013

Quick Fix: File count of a directory

I recently put together a DOS batch script to monitor the number of PDFs in a directory. Once the count reached 500 or more, the script would turn on some services to process the PDFs. There are probably numerous ways to determine the number of files in a given path, but I wanted to do it in a single line and, ideally, use existing DOS functionality where possible. That's how I came up with the following single-line command chain.

for /f "tokens=1" %%A in ('dir %pdfDir%\*.pdf 2^>nul ^| find /i "File(s)" ^|^| echo 0') do (set fileCount=%%A)

First, let me explain why all the carets “^” are present. If you didn't know, the caret is how you escape special characters, like pipes “|” and redirects “>”, in DOS. The FOR command takes the entire string between the single quotes and runs it internally so it can process the output. Before that happens, though, DOS reads the whole line and parses any variables and special characters, such as pipes and redirects. Without the carets, the parser would treat the pipes and redirects as applying to the FOR line itself rather than passing them through as part of the command FOR should run. The same reasoning explains why you have to use double percent signs “%%” for the FOR variable in a script, but not on the command line. Doubling the percent sign escapes it so the batch file doesn't treat it as a variable to expand, and lets the FOR command process it instead.

C:\TechBlog>echo Redirect = ^>  Pipe = ^|  Double pipe = ^|^|
Redirect = >  Pipe = |  Double pipe = ||
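
To see both escapes working together, here is a minimal sketch of a script that pipes one command into another inside FOR. The text “hello” is just a placeholder for illustration.

```batch
@echo off
rem Inside a batch file the loop variable is written with doubled percent signs.
rem The caret keeps the pipe from being applied to the FOR line itself.
for /f %%A in ('echo hello ^| find "hello"') do echo Found: %%A
```

Typed directly at the prompt, the same loop would use a single percent sign (%A) but keep the caret in front of the pipe.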

If you don't understand the “2>nul” syntax, please see my earlier post “Redirection of the output streams in DOS scripts”.

So let's break down what's happening here. To keep the explanation simple, I'll start with the shortened command below and describe why the “echo 0” is present later.

for /f "tokens=1" %%A in ('dir %pdfDir%\*.pdf 2^>nul ^| find /i "File(s)"') do (set fileCount=%%A)

When you do a DIR on a directory you get output like the following.

C:\TechBlog>dir
 Volume in drive C has no label.
 Volume Serial Number is A48F-FE2B

 Directory of C:\TechBlog

04/21/2013  12:00 PM    <DIR>          .
04/21/2013  12:00 PM    <DIR>          ..
04/21/2013  12:00 PM               391 PdfFile1.pdf
04/21/2013  12:00 PM               444 PdfFile2.pdf
04/21/2013  12:00 PM               338 TextFile.txt
               3 File(s)          1,173 bytes
               2 Dir(s)  16,957,591,552 bytes free

Most of this information we don't care about, except for the line stating the file count. We want to isolate this line so that the FOR command can parse it. To do this we pipe the output of DIR into the FIND command. FIND is set up to return only lines that contain the text “File(s)”. That single line is returned and the FOR command parses it. By default FOR splits a line on whitespace (spaces and tabs) and skips any leading whitespace, so with “tokens=1” we're saying we only want the first token found. That token is the number of files. So when the FOR command runs (set fileCount=%%A), it sets fileCount to the number of files found in the directory. We have our file count in a variable and ready to use!
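
The token splitting can be tried on its own by handing FOR a literal copy of the summary line. This is only a sketch: the line is pasted from the sample DIR output above, and the double quotes tell FOR /F to treat it as a string rather than a command.

```batch
@echo off
rem Parse a literal copy of the summary line instead of running DIR.
for /f "tokens=1,2" %%A in ("               3 File(s)          1,173 bytes") do (
    echo First token: %%A
    echo Second token: %%B
)
```

The first token is 3 and the second is File(s), which is why “tokens=1” is enough to pick out the count.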

Ah, but there's a catch. There's always a catch. Sadly, when you do a DIR on something specific, like “dir *.doc”, and no DOCs are present, it doesn't simply return “0 File(s)”. Instead it prints the message “File Not Found”, which isn't useful for what we're trying to accomplish. We need some error handling.

C:\TechBlog>dir *.doc
 Volume in drive C has no label.
 Volume Serial Number is A48F-FE2B

 Directory of C:\TechBlog

File Not Found

This is where the “echo 0” comes in. Going back to the original full command, when DIR doesn't pipe any text containing “File(s)” into FIND, the FIND command returns a failed (non-zero) exit code. Putting “||” after a command means the next command runs only if the previous one failed. Since FIND has failed, the “echo 0” command is run, returning the text “0”, which FOR then parses. This resolves our “File Not Found” scenario and we're good to go!
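
For context, here is a rough sketch of how the count might be used in a monitoring script. The directory and the threshold variable are assumptions for illustration, the step that starts the services is left as a comment, and the path is quoted so a directory containing spaces would still work.

```batch
@echo off
setlocal

rem Hypothetical values for illustration only.
set "pdfDir=C:\TechBlog"
set "threshold=500"

rem Same command chain as above, with the path quoted.
for /f "tokens=1" %%A in ('dir "%pdfDir%\*.pdf" 2^>nul ^| find /i "File(s)" ^|^| echo 0') do set "fileCount=%%A"

if %fileCount% GEQ %threshold% (
    echo %fileCount% PDFs found, time to start processing.
    rem Start the processing services here.
) else (
    echo %fileCount% PDFs found, still below %threshold%.
)
```

Because both sides of GEQ are plain digits, IF compares them as numbers rather than as strings.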

Do you know of another single line command/command chain to determine the number of files in a path? If so, share it! I'd love to see how others have accomplished this.
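
One candidate, offered as a sketch and not as a drop-in replacement, counts the lines of a bare listing instead of reading the summary line. It doesn't depend on the “File(s)” text, but it is a different technique from the one above, and the /a-d switch also counts hidden and system files, which a plain DIR leaves out. The directory is again a placeholder.

```batch
@echo off
rem Hypothetical directory for illustration.
set "pdfDir=C:\TechBlog"

rem The bare listing prints one file name per line and leaves out directories.
rem FIND with /c /v "" counts every line it receives and prints 0 when it receives none.
for /f %%A in ('dir /b /a-d "%pdfDir%\*.pdf" 2^>nul ^| find /c /v ""') do set "fileCount=%%A"

echo %fileCount%
```

No “echo 0” fallback is needed here, because FIND prints a count even when DIR sends it nothing.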
