Intel Diagnostic Tools

Kevin McGrattan edited this page Feb 15, 2021 · 10 revisions

Intel Inspector

Intel Inspector is a tool that can help detect improperly coded OpenMP directives.

Required Setup

If the command inspxe-cl is not recognized, invoke the start-up script:

source /opt/intel/oneapi/inspxe-vars.sh
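The check described above can be sketched as a small shell function. This is a minimal sketch assuming a POSIX shell; the helper name ensure_tool is hypothetical, and the start-up script path depends on your installation.

```shell
#!/bin/sh
# ensure_tool CMD VARS_SCRIPT   (hypothetical helper, not part of Intel's tools)
# Source VARS_SCRIPT only when CMD is not already on the PATH.
ensure_tool() {
  cmd=$1
  vars_script=$2
  # Nothing to do if the command is already available.
  command -v "$cmd" >/dev/null 2>&1 && return 0
  if [ -r "$vars_script" ]; then
    # Source in the current shell so that PATH changes persist.
    . "$vars_script"
  else
    echo "cannot read $vars_script" >&2
    return 1
  fi
  # Report whether the command is available after sourcing.
  command -v "$cmd" >/dev/null 2>&1
}
```

For example, ensure_tool inspxe-cl /opt/intel/oneapi/inspxe-vars.sh sources the script only if inspxe-cl is missing.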

Collection Procedure

Intel Inspector can be run from the command line or via a graphical user interface (GUI). To invoke the GUI, log in to a virtual desktop and type inspxe-gui in a command shell. To run Inspector from the command line, type the following:

inspxe-cl -collect ti2 -- $HOME/firemodels/fds/Build/impi_intel_linux_64/fds_impi_intel_linux_64 simple_test.fds

The analysis-type, ti2, generates a reasonably balanced analysis for threading errors, identifying more errors than ti1 in less time than ti3.

Analysis Procedure using the Command Line option

To analyze the results using the command-line form of Inspector, type:

inspxe-cl -report <report_type> -r <result_dir>

This displays results of the given report_type on standard output. problems is a suitable report_type.
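Collection and reporting can be tied together by naming the result directory explicitly. The following is a sketch only: the helper names run and inspect_case and the DRY_RUN switch are assumptions for illustration, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# inspect_case EXE INPUT RESULT_DIR   (hypothetical helper)
# Collect with ti2 into RESULT_DIR, then print a "problems" report from it.
inspect_case() {
  exe=$1
  input=$2
  result_dir=$3
  run inspxe-cl -collect ti2 -r "$result_dir" -- "$exe" "$input"
  run inspxe-cl -report problems -r "$result_dir"
}
```

Calling inspect_case with the FDS executable, simple_test.fds, and a directory name such as r000ti2 prints the two command lines; setting DRY_RUN=0 runs them.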

Analysis Procedure using the Graphical User Interface

Alternatively, you can open the GUI via inspxe-gui. It can be run directly on an inspxe file, or, if a project was created, the result directories can be generated or placed in the project's folder and viewed by clicking the names of the results on the left.

Intel Trace Collector and Analyzer

The Intel Trace Collector and Analyzer are two separate programs with a single purpose: to enable you to visualize the workflow of each MPI process of an FDS simulation. The Trace Collector is essentially built into the FDS executable via a compiler option (-tcollect), and it outputs a trace (.stf) file at the end of the FDS job. The Trace Analyzer is a visualization tool that reads the trace file and displays its contents graphically.

For details on the Intel Trace Collector, read the manual. For details on the Intel Trace Analyzer, read the manual.

To enable Trace Collector and Analyzer, add the following line to your .bashrc file:

source /opt/intel19/parallel_studio_xe_2019/psxevars.sh

The main consideration in tracing FDS is that the trace file can become enormous if you run a long job and trace each and every function and subroutine call. To prevent this, there is a configuration file called fds_trace.conf in this directory that contains a list of the main subroutines called in FDS. Only these subroutines are traced, keeping the trace file to a reasonable size and enabling you to more easily visualize the workflow.

To output trace information, use the -r flag when invoking qfds.sh.

To use the configuration file, add the -c <filepath>/<configfilename>.conf flag to qfds.sh. If using a custom script, add

export VT_CONFIG=<Full path to FDS repo>/Build/impi_intel_linux_64_trace/fds_trace.conf

to your PBS script.
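In a custom script, it can help to confirm that the configuration file exists before exporting VT_CONFIG, so that a mistyped path is caught before the job starts. A minimal sketch, assuming a POSIX shell; the helper name set_trace_config is hypothetical.

```shell
#!/bin/sh
# set_trace_config CONF   (hypothetical helper)
# Export VT_CONFIG only if the configuration file is readable.
set_trace_config() {
  conf=$1
  if [ ! -r "$conf" ]; then
    echo "trace configuration not found: $conf" >&2
    return 1
  fi
  VT_CONFIG=$conf
  export VT_CONFIG
}
```

In a PBS script this would be called with the full path to fds_trace.conf before the line that launches FDS.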

Make sure that the job only runs a handful of time steps, as there's no need to make the trace file bigger than it already is. The default trace file name is fds_trace.stf, and once the job is finished, you can start the Trace Analyzer:

traceanalyzer fds_trace.stf &

If you get an error message, it is probably because you cannot open an X11 window on your computer. To fix this, download the free program Xming, start it by double-clicking, then in your PuTTY session select Connection --> SSH --> X11 --> Enable X11 forwarding. When you open a Linux command prompt, type xeyes & and if a pair of mouse-following googly eyes pops up, you're in business.
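A quick scripted pre-check is to see whether the DISPLAY variable is set in the remote shell. This is a sketch with a hypothetical helper name, x11_ready. A non-empty DISPLAY is necessary but does not prove that forwarding works, so xeyes remains the real test.

```shell
#!/bin/sh
# x11_ready [DISPLAY_VALUE]   (hypothetical helper)
# Succeeds when a display is set; defaults to the current $DISPLAY.
x11_ready() {
  [ -n "${1-${DISPLAY-}}" ]
}
```

For example, x11_ready || echo "DISPLAY is empty; enable X11 forwarding" prints a hint when no display is configured.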

The most important graphic in the Trace Analyzer is the timeline. Get this from the Charts menu, Event Timeline. You will first see the entire timeline, but you can click and drag over shorter time intervals to see details. You will also notice that the first time you use the Trace Analyzer, everything is either colored red (MPI) or blue (Application). Go to the chart in the lower left corner and right click on the Groups, and choose to ungroup them. You should see the modules and subroutines you've chosen to trace. Keep ungrouping until you get down to the subroutine level. If you right-click again, you can choose to color the various routines, making it much easier to visualize. Your chosen color scheme will be saved in a file called .itarc in your home directory.

Creating a new configuration file

If you do not want to use the configuration file fds_trace.conf that is included above, you can create your own by following these steps.

  1. Compile a special version of FDS in Build/impi_intel_linux_64_trace. This is essentially a debug compilation with the additional compiler option -tcollect.

  2. Run a very short version of the FDS job that you want to trace. Just a few time steps is sufficient. You should not use a configuration file for this run. You want to collect everything.

  3. The trace (.stf) file that is created in the same directory as the FDS job output has a history of every function and subroutine call.

  4. To streamline the trace analysis, create a configuration file that reduces the number of subroutines and functions to trace. To do this, run the Configuration Assistant at the command line:

itcconfig <trace_file.stf>

  5. A graphical window will pop up, called Trace Collector Configurator. Click on Filters, and then select the Functions tab in the center panel. Right click in the empty box beneath Function Name Pattern, select the functions and subroutines you want to trace. You might consider turning Application off on the first line, and then add back in those routines that you want to trace. The term Application means everything besides MPI calls. It is your application, i.e. FDS.

  6. Save the configuration (.conf) file and exit the Configurator.

  7. Add export VT_CONFIG=<configuration_file.conf> to your run script, created with qfds.sh, or just run qfds.sh with the -c <configuration_file.conf> option.

  8. Run your FDS job again, looking for the new trace (.stf) file.

  9. Run the Intel Trace Analyzer:

traceanalyzer <trace_file.stf>

Intel Advisor

This folder contains the optimal build for Intel Advisor, which can assist in determining effective optimization locations for threading and vectorization.

Required Setup

  1. Source advixe-vars.sh from the installation folder for Advisor. For example:
source /opt/intel19/advisor/advixe-vars.sh
  2. Compile the advise version of FDS using the script make_fds.sh in this directory. The relevant compiler options are listed here.

Recommended Steps

For the most useful analysis, run advixe-gui on the platform used for collection, or at least make the source and executable used for collection available on the machine where Advisor is installed.

To set up, run advixe-gui before collecting data. X11 forwarding is necessary if logging in to a remote cluster. Inside the GUI, use the new project option to create a project for FDS, selecting the 'advise' version's executable as the target.

Collection

The base command used on one's platform to run FDS is a proper starting point. mpiexec fds [test case] is a common input, and thus we'll use it for the model here.

Place advixe-cl -collect <analysis-type> [Optional actions] -- before the fds executable in the run command, that is, between mpiexec and the executable. The most common analysis-type is 'survey', which gives a basic look at the program and can be followed by 'suitability' after annotation. The other analysis types have not proved especially useful or stable in FDS development. You can learn about annotating here.

Thus, a possible input could be:

mpiexec -np 1 advixe-cl -collect survey -- $HOME/firemodels/fds/Build/impi_intel_linux_64_advise/fds_impi_intel_linux_64_advise simple_test.fds

Analysis

Graphical User Interface (Recommended)

The command-line interface has not been explored for Advisor, so only advixe-gui can be recommended here. Open the project file with the GUI to analyze results.

IMPORTANT NOTE

Generally, unless directly interested in testing the suitability of several locations at once, or in vectorization, Amplifier is a more useful tool for FDS development. It can be found in the 'vtune' build folder.

Intel VTune Amplifier

This folder contains the optimal build for Intel VTune Amplifier, which can locate hotspots and measure the performance of both parallel and serial code. This helps target optimizations.

Required Setup

  1. Source amplxe-vars.sh from the installation folder for Amplifier. For example:
source /opt/intel19/vtune_amplifier_2019/amplxe-vars.sh
  2. Compile the vtune version of FDS using the script make_fds.sh in this directory. The relevant compiler options are listed here.

Recommended Steps

For the most useful analysis, run amplxe-gui on the platform used for collection, or at least make the source and executable used for collection available on the machine where Amplifier is installed.

To set up, run amplxe-gui before collecting data. X11 forwarding is necessary if logging in to a remote cluster. Inside the GUI, use the new project option to create a project for FDS, selecting the 'vtune' version's executable as the target. When prompted in analysis, select the 'vtune' version's folder for binary files, and firemodels/fds/Source for source files. To use the project, place collection results, obtained later, into the generated project folder.

Collection

The base command used on one's platform to run FDS is a proper starting point. mpiexec fds [test case] is a common input, and thus we'll use it for the model here.

Place amplxe-cl -collect <analysis-type> [Optional actions] -- before the fds executable in the run command, that is, between mpiexec and the executable. The most common analysis-type is hpc-performance, which focuses on serial and parallel performance in a high-performance computing environment, i.e. clusters. You can look at other analysis types here.

Thus, a possible input could be:

mpiexec -np 1 amplxe-cl -collect hpc-performance -- $HOME/firemodels/fds/Build/impi_intel_linux_64_vtune/fds_impi_intel_linux_64_vtune simple_test.fds

Alternatively, using qfds.sh -a [result_directory] will run hpc-performance automatically.

Analysis

Command Line

To gather data from the command line, use:

amplxe-cl -report <report_type> -r <result_dir>

This displays results of the given report_type on standard output. summary tends to be a good place to start, and you can find them all here.
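The collection and report steps can be paired by naming the result directory at collection time. The sketch below only prints the command lines; the helper names vtune_collect_cmd and vtune_report_cmd are hypothetical. When run under MPI, Amplifier may append the host name to the result directory, so check the actual directory name before requesting a report.

```shell
#!/bin/sh
# Dry-run sketch: these helpers only print command lines, they run nothing.

# vtune_collect_cmd NP EXE INPUT RESULT_DIR   (hypothetical helper)
vtune_collect_cmd() {
  echo "mpiexec -np $1 amplxe-cl -collect hpc-performance -r $4 -- $2 $3"
}

# vtune_report_cmd RESULT_DIR   (hypothetical helper)
# Pass the directory name that was actually created by the collection.
vtune_report_cmd() {
  echo "amplxe-cl -report summary -r $1"
}
```

For example, vtune_collect_cmd 1 ./fds_impi_intel_linux_64_vtune simple_test.fds my_result prints the collection command, and vtune_report_cmd with the resulting directory prints the matching report command.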

Graphical User Interface (Recommended)

Due to the large amount of data collected in any Amplifier run, using amplxe-gui is highly recommended. With results placed or generated in the project folder, Amplifier can display them in a convenient GUI, so you do not have to parse the output on the command line.

You can access the GUI on a cluster that has it installed by running:

amplxe-gui &

The trailing & runs it in the background, so you can continue using your shell.

In the GUI, there are several display options under 'Bottom-up' in a results screen. When optimizing existing OpenMP directives, focus on using options that highlight OpenMP regions. Otherwise, focus on function or source file views. Hovering over data labels normally gives a cursory explanation of what is represented.
