Intel Diagnostic Tools

Several packages in the Intel oneAPI HPC Toolkit analyze code performance and structure. Each is described briefly below. At NIST, we typically use the command-line (cl) version of each package to "collect" information during the FDS simulation, and then use the graphical user interface (gui) to analyze the results.

Intel Inspector

Intel Inspector is a tool that can help detect improperly coded OpenMP directives.

  1. Compile FDS in debug mode with the following options: `-shared-intel -check none -O0 -g`

  2. Add the string inspxe-cl -collect ti2 -result-dir my_results -- just before the name of the executable in the srun command in the SLURM batch script (a sample script is sketched after this list). These commands invoke the command-line (cl) version of Intel Inspector. The argument ti2 selects a particular level of detail, and my_results is the name of the directory created within the current directory to hold the analysis information; you can call it anything you want. Note that the name of the cluster node that runs the case is appended to the directory name.

  3. Run the case using the sbatch command to launch the run script. Make sure that the case is relatively small and that you run just a few time steps. Inspector takes a long time to check for race conditions.

  4. When the case is done, open up the Inspector graphical interface with the command inspxe-gui. You'll need to be at the console of the head node of the cluster to make this work.

  5. Assuming the GUI opens, look at the list of errors and where in the code they occur. Typically, errors arise where the same variable is "touched" by multiple OpenMP threads. If you do not have access to the GUI, analyze the results with the command-line form of Inspector:

    `inspxe-cl -report problems -r my_results`
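
As a concrete illustration of steps 1-3, here is a minimal sketch of a SLURM batch script. The job name, node counts, executable path, and input file are hypothetical and should be adjusted to your cluster and debug build:

```
#!/bin/bash
#SBATCH -J inspect_case
#SBATCH -N 1
#SBATCH -n 1

# Hypothetical path to the FDS debug executable built in step 1
FDS=$HOME/firemodels/fds/Build/impi_intel_linux_64_db/fds_impi_intel_linux_64_db

# inspxe-cl wraps the executable; ti2 selects the threading-error analysis level
srun -N 1 -n 1 inspxe-cl -collect ti2 -result-dir my_results -- $FDS my_case.fds
```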
    

Intel Trace Collector and Analyzer

The Intel Trace Collector and Analyzer are two separate programs with a single purpose: to enable you to visualize the work flow of each MPI process of an FDS simulation.

The Collector step is done by adding a flag to the qfds.sh script:

qfds.sh -p 6 -r jobname.fds

which produces a trace file called fds_trace.stf at the end of the FDS job. The Trace Analyzer is a visualization tool that reads the trace file and displays its contents graphically, assuming your command shell can open a graphics window:

traceanalyzer fds_trace.stf

The main consideration in tracing FDS is that the trace file can become enormous if you run a long job and trace each and every function and subroutine call. To prevent this, there is a configuration file called fds_trace.conf in the directory Build/Scripts that contains a list of the main subroutines called in FDS. Only these subroutines are traced, keeping the trace file to a reasonable size and enabling you to more easily visualize the work flow. Make sure that the job only runs a handful of time steps, as there's no need to make the trace file bigger than it already is.
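
If you run the Collector outside of qfds.sh, a minimal sketch looks like the following; it assumes an Intel MPI environment, in which mpiexec's -trace option loads the Trace Collector library and the VT_CONFIG environment variable points to the filter file (paths are hypothetical):

```
# Point the Trace Collector at the FDS subroutine filter file
export VT_CONFIG=$HOME/firemodels/fds/Build/Scripts/fds_trace.conf

# -trace loads the Intel Trace Collector library at run time (Intel MPI)
mpiexec -trace -np 6 $HOME/firemodels/fds/Build/impi_intel_linux_64/fds_impi_intel_linux_64 jobname.fds

# Visualize the resulting trace file
traceanalyzer fds_trace.stf
```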

The most important graphic in the Trace Analyzer is the timeline. Open it from the Charts menu (Event Timeline). You will first see the entire timeline, but you can click and drag over shorter time intervals to see details. You will also notice that the first time you use the Trace Analyzer, everything is colored either red (MPI) or blue (Application). Go to the chart in the lower left corner, right-click on the Groups, and choose to ungroup them. You should see the modules and subroutines you've chosen to trace. Keep ungrouping until you get down to the subroutine level. If you right-click again, you can choose colors for the various routines, making the display much easier to read. Your chosen color scheme is saved in a file called .itarc in your home directory.

Intel Advisor

FDS provides a dedicated 'advise' build for Intel Advisor, which can assist in determining effective locations for threading and vectorization optimizations.

Required Setup

  1. Source advixe-vars.sh from the installation folder for Advisor. For example:

     source /opt/intel19/advisor/advixe-vars.sh

  2. Compile the advise version of FDS using the script make_fds.sh in the advise build directory (see the sketch after this list). The relevant compiler options are listed in that script.
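
A minimal sketch of both setup steps, assuming the advise build directory used in the collection example below:

```
# Step 1: source the Advisor environment (installation path is site-specific)
source /opt/intel19/advisor/advixe-vars.sh

# Step 2: build the advise version of FDS
cd $HOME/firemodels/fds/Build/impi_intel_linux_64_advise
./make_fds.sh
```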

Recommended Steps

For the most useful analysis, run advixe-gui on the same platform used for collection, or at least have access to the source and executable on the machine where Advisor is installed.

To set up, open advixe-gui before collecting data; X11 forwarding is necessary if you are logging in to a remote cluster (see the example below). Inside the GUI, use the New Project option to create a project for FDS, selecting the 'advise' version's executable as the target.
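
For example, a remote session with X11 forwarding might look like this (the hostname is hypothetical):

```
# -X enables X11 forwarding so the GUI can display on your local machine
ssh -X username@cluster.example.gov
advixe-gui &
```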

Collection

The command you normally use to run FDS on your platform is the starting point. A common form is mpiexec fds [test case], so that is the model used here.

In the run command, place advixe-cl -collect <analysis-type> [optional actions] -- just before the fds executable, that is, between mpiexec and the executable. The most frequently used analysis-type is 'survey', which obtains a basic look at the program and can be followed up with 'suitability' after annotation. The other analysis types have not been especially useful or stable in FDS development. You can learn about annotating here.

Thus, a possible input could be:

mpiexec -np 1 advixe-cl -collect survey -- $HOME/firemodels/fds/Build/impi_intel_linux_64_advise/fds_impi_intel_linux_64_advise simple_test.fds
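
After annotating the code, a follow-up suitability collection could take the same form; depending on your Advisor version, you may also need -project-dir to point at the project created in the GUI:

```
# Suitability analysis assumes annotations have been added to the source
mpiexec -np 1 advixe-cl -collect suitability -- $HOME/firemodels/fds/Build/impi_intel_linux_64_advise/fds_impi_intel_linux_64_advise simple_test.fds
```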

Analysis

Graphical User Interface (Recommended)

The command-line reporting interface has not been explored, so advixe-gui is the recommended option. Open the project file with the GUI to analyze the results.

IMPORTANT NOTE

Generally, unless you are directly interested in testing the suitability of several locations at once, or in vectorization, VTune (formerly Amplifier) is a more useful tool for FDS development. It can be found in the 'vtune' build folder.

Intel VTune Profiler

VTune is included with the Intel oneAPI Base Toolkit. It is most useful for profiling the code; that is, generating a list of the most frequently used subroutines. The easiest way to use it is to create a SLURM script and add the text below to the srun command, between the srun options and the full path to the executable:

srun -N <nodes> -n <procs> --ntasks-per-node <procs per node> vtune -quiet -collect hotspots -trace-mpi -result-dir my_results <full path to fds executable> my_job.fds
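
Put together, a minimal SLURM script might look like the sketch below; the job name, node counts, and executable path are hypothetical:

```
#!/bin/bash
#SBATCH -J my_job
#SBATCH -N 2
#SBATCH -n 16

# Hypothetical path to the FDS executable
FDS=$HOME/firemodels/fds/Build/impi_intel_linux_64/fds_impi_intel_linux_64

# -trace-mpi creates one result directory per node (my_results.nodeXXX)
srun -N 2 -n 16 --ntasks-per-node 8 vtune -quiet -collect hotspots -trace-mpi -result-dir my_results $FDS my_job.fds
```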

When the job is done, issue the following command:

vtune -report hotspots -result-dir my_results.nodeXXX > my_report

Note that the name of the node or nodes on which the case ran is appended to the results directory name. Also note that the text file my_report is not nicely formatted; I haven't found a clean fix for that yet, but one possible workaround is sketched below.
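
One possible workaround, assuming your VTune version supports CSV report output, is to request a machine-readable format that can be opened in a spreadsheet:

```
# CSV output is easier to post-process than the default text layout
vtune -report hotspots -result-dir my_results.nodeXXX -format csv -csv-delimiter comma > my_report.csv
```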
