This tool maps bytecode/dex elements (such as types, methods, fields, and invocations) to source elements. The input is the executable code (JAR/APK format) plus the original sources. The output is a JSON mapping of source code elements that correspond to low-level entities. The low-level entity ids follow the format of the Doop static analysis framework.
This is a work in progress. The following source languages are currently (partially) supported: Java, Groovy, Kotlin.
Install the Kotlin ANTLR grammar to the local Maven repository:
./install-kotlin-parser.sh
Then, install the tool:
./gradlew installDist
To generate the JSON mappings for an application with code in app.jar and sources in app-sources.jar, run the following command:
build/install/source-ir-fitter/bin/source-ir-fitter --ir path/to/app.jar --source path/to/app-sources.jar --out app-out --json
The output .json files can be found in directory app-out.
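To check what a run produced, the output directory can be listed programmatically. The following is a minimal sketch, not part of the tool: the class name ListMappings and its helper are hypothetical, and it assumes only that the mappings are written as .json files directly inside the --out directory (app-out above). It makes no assumption about the JSON schema.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not shipped with the tool.
public class ListMappings {
    // Returns the .json files directly inside outDir, sorted by path.
    static List<Path> jsonFiles(Path outDir) throws IOException {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outDir, "*.json")) {
            for (Path p : stream) {
                found.add(p);
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to "app-out", the --out value used in the command above.
        Path outDir = Paths.get(args.length > 0 ? args[0] : "app-out");
        for (Path p : jsonFiles(outDir)) {
            System.out.println(p + " (" + Files.size(p) + " bytes)");
        }
    }
}
```

Run it with the output directory as its only argument, or with no arguments to inspect app-out in the current directory.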
This tool can map results of Doop static analyses (in SARIF format, originally on bytecode) to sources. The translated results can be viewed in a tool with SARIF support, such as Visual Studio Code with the SARIF viewer extension or as part of a custom GitHub Action.
Assume a Doop analysis with SARIF output enabled:
cd $DOOP_HOME
./doop -i path/to/app.jar -a context-insensitive --sarif --id app --stats none --gen-opt-directives --no-standard-exports
Then, run the code processor:
build/install/source-ir-fitter/bin/source-ir-fitter --translate-sarif --ir path/to/app.jar --source path/to/app/src --out app-out --database ${DOOP_HOME}/out/app/database
Finally, run Visual Studio Code on the results:
code path/to/app/src app-out/doop.sarif
Problem: Some elements in Groovy sources (such as method calls) cannot be mapped to the generated bytecode.
Solution: This is an inherent limitation of Groovy's dynamic features. Using @CompileStatic in the Groovy sources helps.
Problem: In the generated metadata, there is a call to StringBuilder.append() that does not appear in the sources, yet it is mapped to a source line.
Solution: Bytecode elements that have source line information are still mapped to the sources. For example, the compiler may have generated StringBuilder.append() calls to implement string concatenation and the metadata may map these calls to the corresponding source lines.
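To make the StringBuilder.append() case concrete, here is a small illustrative sketch. The class ConcatDemo and its methods are hypothetical. The expanded form approximates what javac emits when targeting Java 8 or earlier; newer javac targets typically compile concatenation to an invokedynamic call instead, so whether these calls appear depends on how the bytecode was built.

```java
// Hypothetical example class, used only to illustrate the mapping issue.
public class ConcatDemo {
    // What the source contains: one expression, no visible method calls.
    static String greet(String name, int count) {
        return "Hello, " + name + " (" + count + ")";
    }

    // Approximately what the compiler generates for greet() on older targets:
    // each append() call carries the line number of the original expression.
    static String greetExpanded(String name, int count) {
        return new StringBuilder()
                .append("Hello, ")
                .append(name)
                .append(" (")
                .append(count)
                .append(")")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("Doop", 3));
        System.out.println(greetExpanded("Doop", 3));
    }
}
```

Both methods return the same string. In the compiled form of greet(), the generated append() invocations share the source line of the concatenation expression, which is why the metadata can map them to a line where no such call is written.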