Using DOS Batch Files to Run Experiments on Windows
By Matthew MacLeod (2006)
When developing research code, one often wants to test experimental
code on a series of problems, without necessarily building a full-fledged
interactive interface. Especially for those familiar with Linux or Unix, script
files seem like a natural solution for automating test runs, as performance of
the test script is generally far less important than the tests themselves. As I'm
doing my thesis work in Windows, I delved into the world of .bat
files to perform this scripting. I discovered that batch files are both more and
less powerful than I expected, and have some weird quirks. This is a hopefully
useful catalog of my findings. And before you ask, yes, I am aware of Cygwin, and use it for some tasks as well, but
the native scripting interface is easier to work with in many cases.
The Basics
First off, you'll need to create a batch file with the extension .bat. I gave
mine the rather uninspired name script.bat. You can run this file either by
double-clicking it in Windows Explorer, or by typing script at the command line
in the directory in which it is stored. The second is often desirable when
you're debugging, as the command window will automatically disappear when the
batch file exits if you run it from Explorer, taking any output with it. Within
the file you can place any commands that you would normally execute from the
command line, essentially as you would normally use them. Read on for the
exceptions.
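A minimal sketch of what such a file might contain. The echo messages are placeholders for illustration, not part of the original script. The pause command at the end waits for a keypress, which should keep the window open long enough to read the output when the file is launched from Explorer:

```batch
@echo off
rem script.bat: minimal skeleton with placeholder contents
echo Starting tests...
rem Your own commands go here, written as you would type them at the prompt
echo Done.
rem Wait for a keypress so the window does not close immediately
pause
```

The @echo off line only stops each command from being printed before it runs; leave it out while debugging if you want to see exactly what is executed.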
Running a Series of Tests
If you had to type out every test individually, you probably wouldn't gain much
from using a script. What you likely want to do is pass problems to your code
one at a time. For example, I work on problems in the AMPL modelling language,
which stores individual problems in .mod files. To run my program once on every
problem in a directory, I use a FOR statement like this:
FOR %%s IN (*.mod) DO thesis.exe %%s
As you might guess from the statement syntax, the %%s gets expanded into the
name of each .mod file in the directory. When you run the batch file it will
execute, for instance:
thesis.exe problem1.mod
thesis.exe problem2.mod
thesis.exe problem3.mod
And so on. Parsing the
filename passed to your program into something useful is highly dependent on the
language you use, and you'll need to work that one out yourself.
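If you want to run more than one command per file, FOR also accepts a parenthesized block. The following is a sketch, assuming the same thesis.exe program as above; the echo lines are additions for illustration only. The ~n modifier on the loop variable is a standard FOR feature that expands to the file name without its extension:

```batch
@echo off
rem Run thesis.exe on every .mod file, printing a progress line before each run
FOR %%s IN (*.mod) DO (
    echo Running %%s
    rem With the ~n modifier, problem1.mod becomes problem1
    echo Base name: %%~ns
    thesis.exe %%s
)
```

Note that the opening parenthesis has to sit on the same line as DO; the command cannot simply be moved to the next line without it.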
Saving the Output of Your Program to a File
If you're running a lot of tests, any output from your program will likely
scroll off the top of your screen while you're off having a coffee or sleeping
or any number of other useful things. If you ran the script from Windows
Explorer this is even worse, as the fatal error message your program helpfully
printed at 3:23 AM disappeared when the script aborted. So it's a good idea to
redirect the output to a file.
The simplest way to direct output to a file is with the > filename and
the >> filename operators. The major difference between the two is that a single
angled bracket will overwrite filename if it exists, whereas the double bracket
version will append to the end of the file. If you're running a batch of tests
you probably want to append to a log file, but if you reuse the same script
remember to change the filename or move the original log out of the way if you
want to separate your results.
Extending the example from the previous section, you can use a statement like
this to capture all the output from every run of your program into one
file:
FOR %%s IN (*.mod) DO thesis.exe %%s >> scriptout.txt
.txt is the standard extension for text files recognized by Windows Notepad and
the like, so it's easiest to use that, although any extension will do. If you
want to add some other information to the file that your program doesn't output,
you can use the ECHO statement. Also useful are the date and time commands.
To write the date and time at the top of your output file, you can use the
following:
echo Experiment Time: >> scriptout.txt
date /t >> scriptout.txt
time /t >> scriptout.txt
Which will produce something like the following in scriptout.txt:
Experiment Time:
30/01/2006
05:21 PM
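A variation on the above, offered as a sketch rather than a tested recipe for any particular program. The built-in DATE and TIME variables put the timestamp on a single line, and adding 2>&1 to the redirection also captures anything written to the error stream, which a plain >> leaves on the screen. It assumes that thesis.exe reports its errors on standard error:

```batch
@echo off
rem One-line header built from the DATE and TIME dynamic variables
echo Experiment Time: %DATE% %TIME% >> scriptout.txt
rem 2>&1 sends error messages to the same log file as normal output
FOR %%s IN (*.mod) DO thesis.exe %%s >> scriptout.txt 2>&1
```

The exact date and time format depends on the regional settings of the machine, so the header will not look the same everywhere.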
Compiling Your Output Data
Probably the easiest spreadsheet-readable format your program can output data
to is Comma Separated Value format. Each line of the file is a row of the
spreadsheet, with columns on each row separated by commas (other separators are
also possible). Originally I had my program append directly to a shared results
file, with each run adding a row. This was problematic for several reasons:
- One bad run could corrupt the rest of the file
- It made it hard to restart half-finished runs
- Examining results partway through a run was perilous, as my program may be
trying to access the one shared file
Instead, I opted to have each run output to its own result file, and then
concatenate the individual files together using the script. This only really
addresses the third point and part of the second one, but it's a start. For ease
of scripting, the results for each problem1.mod file are output to a
corresponding problem1.mod.csv file. Yes, Windows has sort of caught on to the
multiple file extensions thing. They'll open (basically) fine in Excel or
OpenOffice. To accomplish this, I use something like the following:
copy ..\results.csv .\results.csv
FOR %%s IN (*.mod) DO thesis.exe %%s
FOR %%s IN (*.mod.csv) DO copy results.csv+%%s results.csv
The first line copies a file with the headers for each column from the
directory above it (..\ is the representation for going up a directory level). I
keep a fresh copy there so it doesn't get clobbered. The second line you'll
recognize from above; it invokes my code on each problem. Somewhere in there the
results of the test get output to a .mod.csv file, which again you'll have to
figure out for yourself.
The last line goes through all the output result files and appends them to
results.csv. Note that if you have any stray result files from previous runs in
the directory they'll get appended as well, so make sure to clear those out.
There may be a cleaner way to do this inside a single FOR loop, but I haven't
put the effort in to figure that syntax out yet. Note that appending them
results in an "unknown character" square showing up at the start of each line in
Excel and Notepad, although this does not show up in OpenOffice. I haven't
tracked down the culprit, but as my first column is just the model name it's
easy to ignore.
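One possible single-loop version is sketched below. It has not been checked against the setup described here, and it assumes file names without spaces. A plausible (but unconfirmed) culprit for the stray square is the end-of-file character (Ctrl+Z) that copy adds when it combines files in its default text mode; the /b switch asks for a plain binary copy instead, which should avoid it:

```batch
@echo off
rem Start from a fresh copy of the header row
copy ..\results.csv .\results.csv
FOR %%s IN (*.mod) DO (
    thesis.exe %%s
    rem Append this run's rows right away, if the run produced a result file
    IF EXIST %%s.csv copy /b results.csv+%%s.csv results.csv
)
```

Because each result file is appended immediately after its own run, stray .mod.csv files without a matching .mod problem in the directory are not picked up, although leftovers from an earlier run of the same problem still would be if the new run failed to overwrite them.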
Sending Yourself the Output
As alluded to above, while running these large batches of
tests you probably have other things to do. Like sleeping. Especially sleeping.
Which you generally do at your apartment, not your lab, so a little "ding" when
your program finishes may not help you much. If you're like me what you'd really
like is to get your data without having to go out in the blizzard or blazing
heat or whatever Ottawa has decided to hit you with this month. The IT
departments at most labs are clever enough to not let you indiscriminately
remotely access your university workstation, so other solutions are in order.
One of the simplest is to just e-mail your results out as an attachment. To that
end I've discovered this handy tool called Blat. The setup is not too
complicated, although
you may have to poke at the registry a bit to get your default server set up
properly. Once that's done, you can finish off your batch file with something as
simple as:
"C:\Program Files\Blat250\full\blat.exe" - -body results -subject results -to your.name@dept.university.ca -attach results.csv
And you're done. Nothing like waking up to a steaming hot pot of
results in the morning. Well, maybe.
Command Line Options
Hard coding everything into your script isn't that flexible, as you'll probably
want to try your program with a series of different options. Any options you
pass to the batch file are accessed using the % operator. The first option is
%1, the second %2, etc.
My particular program has 7 options, so my final script looks like this:
copy ..\results.csv .\results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
FOR %%s IN (*.mod) DO thesis.exe %%s %1 %2 %3 %4 %5 %6 %7 >> scriptout.txt
FOR %%s IN (*.mod.csv) DO copy results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv+%%s results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
"C:\Program Files\Blat250\full\blat.exe" - -body results -subject results -to your.name@dept.university.ca -attach results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
As you can see, all 7 options are passed to the program, as well as being
incorporated into the name of the result file for easier sorting later.
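The long file name is repeated several times, which invites typos. One way to tidy this up is to build the name once with SET and to stop early when an option is missing. This is a sketch under assumptions: the variable name RESULTS is made up for illustration, and the option names in the usage message are simply read off the file name pattern above:

```batch
@echo off
rem Stop early if the seventh option is missing
IF "%~7"=="" (
    echo Usage: script.bat a b minp maxp n w pb
    exit /b 1
)
rem Build the result file name once and reuse it below
set RESULTS=results_a%1_b%2_minp%3_maxp%4_n%5_w%6_pb%7.csv
copy ..\results.csv .\%RESULTS%
FOR %%s IN (*.mod) DO thesis.exe %%s %1 %2 %3 %4 %5 %6 %7 >> scriptout.txt
FOR %%s IN (*.mod.csv) DO copy %RESULTS%+%%s %RESULTS%
```

Using exit /b rather than a bare exit ends only the batch file, so the command window you launched it from stays open.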
Scripting Your Script
Now that we've got all these options set up, it would be nice to run the
script repeatedly with different options without having to come in and restart
it manually. To do this, we'll want to write a script that calls our script.
This was the major "gotcha" I encountered with batch files - if you are
invoking another batch file from within a batch file, you should use the CALL
statement. Otherwise the calling script will exit when the called script exits.
This is obviously not what you want if you're planning on calling it several
times. One of my "meta-scripts" (or script of scripts) looks like this:
call script.bat 1 0.001 0 0.75 20 50 1
call script.bat 1 0.001 0 0.75 20 50 2
call script.bat 1 0.001 0 0.75 20 50 3
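When only one option changes between runs, the same FOR statement used earlier can generate the calls. A sketch equivalent to the three lines above, with %%p as an arbitrary loop variable name chosen for illustration:

```batch
@echo off
rem Vary only the last option, keeping the other six fixed
FOR %%p IN (1 2 3) DO call script.bat 1 0.001 0 0.75 20 50 %%p
```

The CALL is still needed inside the loop; without it the first invocation of script.bat would never return to run the remaining values.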