Installing and Running FDS on a Linux Cluster
A common platform for running FDS simulations is a Linux cluster, which consists of multiple computers referred to as "slave nodes" that are controlled by a single "master node." You typically log in to the master node and launch jobs on the various slave nodes using a batch queuing system.
There are two ways that you can use FDS on a Linux cluster. You can either download and install the pre-compiled FDS and Smokeview binaries, or you can clone the FDS-SMV repository following these instructions. If you are just interested in running FDS, you probably want to do the former. If, however, you are interested in doing research, or working with the FDS developers, you should do the latter.
- Open a terminal session.
- "cd" to the directory where the downloaded bundle is located, typically your home directory.
- Run the installer script using the bash shell:
$ bash FDS_6.3.1-SMV_6.3.2_linux64.sh
Note that the version number of the file that you downloaded might be different. When you execute this command, the installer presents some installation options.
In addition to copying release files to the user-specified location, the installer adds the following line
source .bashrc_fds mpi_path
to the .bashrc file, where mpi_path is the location of the MPI distribution (if it is present on your system). This source line updates the PATH and LD_LIBRARY_PATH environment variables and allows FDS and Smokeview to be run from the command line.
To make sure that FDS installed properly, just type
fds
at the command prompt. You should see information about the version and date of compilation. If you are working at a single computer that is running Linux, you can now use FDS as you would have on a Windows PC. The FDS User's Guide provides some more details.
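If you would rather check from a script whether the shell can find the executable, a minimal sketch follows. The function name have_command is made up for this illustration and is not part of the FDS bundle; command -v is a standard shell builtin.

```shell
#!/bin/bash
# Hypothetical helper (not part of the FDS bundle): succeeds if the
# named command can be found by the current shell.
have_command() {
    command -v "$1" > /dev/null 2>&1
}

if have_command fds; then
    echo "fds found at: $(command -v fds)"
else
    # .bashrc is only read when a shell starts, so a freshly modified
    # .bashrc has no effect on a terminal that was already open.
    echo "fds not found; open a new terminal or re-source your .bashrc"
fi
```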
It is more than likely, however, that you are working on a Linux cluster, and if you just type fds at the command line, you will only launch a single process on the master node. That is not how you want to use the cluster, unless you have a short run that you want to debug or you are developing an input file. Once you are ready to start longer jobs, you need to invoke the MPI (Message Passing Interface) functionality, which is taken up in the next section.
We assume that if you have downloaded the pre-compiled binaries of FDS and Smokeview, your cluster has Open MPI installed. If not, you or the system administrator should install it following the instructions given in the wiki page Running FDS MPI on Linux. The important point is that your version of Open MPI needs to be in the same series as the version we used to compile FDS. To find out what your version is, type
mpirun --version
at the command prompt. If the command is not recognized, then you do not have Open MPI installed (or perhaps your PATH variable needs to be adjusted). If the command is recognized and the version of Open MPI is in the 1.8.x series, then you should be good to go. FDS 6.3.0 was compiled with Open MPI 1.8.4.
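The series check can also be scripted. The sketch below assumes that the first line of the version output looks like "mpirun (Open MPI) 1.8.4"; the helper name mpi_series and the variable required_series are invented for this example, and the banner format may differ on your system.

```shell
#!/bin/bash
# Hypothetical helper: extract the "major.minor" series from a version
# banner such as "mpirun (Open MPI) 1.8.4" (banner format is an assumption).
mpi_series() {
    # Take the first dotted number in the banner and keep its first two fields.
    echo "$1" | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1 | cut -d. -f1,2
}

required_series="1.8"   # the series named in the text above

if command -v mpirun > /dev/null 2>&1; then
    # Merge stderr into stdout in case the banner is printed there.
    banner=$(mpirun --version 2>&1 | head -n 1)
    if [ "$(mpi_series "$banner")" = "$required_series" ]; then
        echo "Open MPI series matches $required_series: $banner"
    else
        echo "Open MPI series differs from $required_series: $banner"
    fi
else
    echo "mpirun not found; Open MPI may be missing or PATH may need adjusting"
fi
```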
At NIST, we have two Linux clusters running CentOS Linux. We use PBS (Portable Batch System) to schedule jobs for execution on the slave nodes. A typical job is launched using a bash script (call it script.pbs, for example) like the following:
#!/bin/bash
#PBS -N job_name
#PBS -e /home/userid/.../job_name.err
#PBS -o /home/userid/.../job_name.log
#PBS -l nodes=2:ppn=2
#PBS -l walltime=2:0:0
export OMP_NUM_THREADS=1
cd /home/userid/.../
mpirun --report-bindings --bind-to socket --map-by socket -np 4 /home/userid/FDS/FDS6/bin/fds job_name.fds
Here job_name is the base name of the input file. The .err and .log files contain what is usually written to the screen when you run FDS. These files are typically created when the job is done. You can direct them to any directory you want, which is useful because some Linux clusters have specific work spaces that are separate from the user directories. The parameter nodes indicates the number of nodes you want to use, and ppn is the number of processes per node. Our practice at NIST is to assign 2 MPI processes to each node because the nodes of our cluster have 2 sockets (i.e., chips), and jobs run fastest when we do not fill up the entire node. The walltime in this case is 2 hours. The job is typically killed after that, so choose wisely. The setting of OMP_NUM_THREADS is intended to override any existing environment variable. For this example, we are not going to invoke OpenMP. The cd command changes directory to where the input file is located.
The mpirun command has a few options that you may or may not want to use. The option --report-bindings adds a detailed list to the .err file of the nodes and cores used to run the job. The options --bind-to socket and --map-by socket tell Open MPI to place each MPI process on its own socket. That is essentially saying that you want each process to have its own chip. This is useful when your cluster is not being used heavily, but of less value as the cluster fills up with jobs.
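In the script above, the value given to -np (4) equals nodes times ppn (2 x 2). One way to avoid keeping these in sync by hand is to count the slots that the scheduler actually granted. The sketch below assumes a Torque-style PBS that sets the PBS_NODEFILE environment variable to a file with one line per granted process slot; other PBS variants may format this file differently. The helper name count_slots is invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: count the process slots listed in a node file.
# Assumes one non-empty line per slot (Torque-style PBS_NODEFILE).
count_slots() {
    grep -c . "$1"
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE:-}" ]; then
    NP=$(count_slots "$PBS_NODEFILE")
    echo "Scheduler granted $NP slots; this is the value to pass to -np"
else
    echo "PBS_NODEFILE is not set; this check only works inside a PBS job"
fi
```

Inside a job script, the resulting NP could replace the hard-coded 4 on the mpirun line.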
To submit this job, type
qsub script.pbs
Monitor your job by typing
qstat -a
Kill your job by typing
qdel jobid
where the jobid is given by the qstat command. There are many more options for these commands. An Internet search will show that many computing centers have documented them in detail. The ones listed here are the most important.
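The three commands can be strung together in a small wrapper. This is only a sketch: it relies on qsub printing the identifier of the new job to standard output, and it assumes the identifier has the form number.servername (for example 12345.master), which may not hold on every system. The helper name job_number is invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: strip the server suffix from a job identifier,
# e.g. "12345.master" -> "12345". The identifier format is an assumption.
job_number() {
    echo "${1%%.*}"
}

if command -v qsub > /dev/null 2>&1; then
    jobid=$(qsub script.pbs)      # qsub prints the new job's identifier
    echo "Submitted job $(job_number "$jobid")"
    qstat -a                      # list jobs, including the one just submitted
    # qdel "$jobid"               # uncomment to kill the job
else
    echo "qsub not found; run this on the cluster's master node"
fi
```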