Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved compile speed by running multi-threaded library discovery. #2625

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

cmaglie
Copy link
Member

@cmaglie cmaglie commented Jun 5, 2024

Please check if the PR fulfills these requirements

See how to contribute

  • The PR has no duplicates (please search among the Pull Requests
    before creating one)
  • The PR follows
    our contributing guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • UPGRADING.md has been updated with a migration guide (for breaking changes)
  • configuration.schema.json updated if new parameters are added.

What kind of change does this PR introduce?

This is a tentative to improve compile speed by multi-threading the library discovery phase.

How library discovery works

The library discovery is a sequential process, and could be summarized as follows:

  • The sketch source files (compilation units) are added to a queue
  • An empty includes path list is initialized
  • For each file in the queue:
    • A C++ macro preprocessing is run on it (a.k.a. gcc -E ...)
    • If the operation returns a Missing include file "xxx.h" error then:
      • A library providing xxx.h is added to the includes path list
      • The source files (compilation units) of the library are added to the queue
      • The current source file is preprocessed again to see if there are other missing includes

The above loop continues until one of the following happens:

  • All the files in the queue are processed successfully
  • A missing include can not be provided by any of the installed libraries
  • There is a syntax error different from missing include

Could this be multi-threaded?

The main issue here is that each run of gcc -E ... requires the includes path list determined by the previous iterations. At first sight, it seems a strictly sequential process.

BTW, there are some strategies that we may use:

  1. When a library is added to the queue, we may speculatively start a multithreaded compilation of the next files in the queue assuming no missing include errors will be produced. Most of the time this assumption turns out to be true and the time saved could be dramatic.

  2. We may leverage the "library detection cache" to spawn a speculative multithreaded compilation of all the files (predicting all the libraries added in the process), assuming that the missing include errors would be the same as the previous successful compilation as saved in the "library detection cache". In this case, we reproduce the same compiles as the previous build. The basic assumption is that the libraries detected will not change between compilations.

If an unforeseen missing include error is detected, the speculative multithreaded compilation must be canceled and restarted (basically wasting the work done ahead of time). In a multi-core system, since the work "ahead" is done in parallel, the worst-case scenario should take nearly the same time as the "classic" non-multithreaded compile.

Due to the complexity of these changes, I'll start this as a draft. I want to add integration tests to trigger edge cases before merging.

What is the current behavior?

What is the new behavior?

Compiles should in general be faster than before.

Does this PR introduce a breaking change, and is titled accordingly?

No

Other information

@cmaglie cmaglie self-assigned this Jun 5, 2024
@cmaglie cmaglie added type: enhancement Proposed improvement topic: build-process Related to the sketch build process labels Jun 5, 2024
@cmaglie cmaglie force-pushed the libs_detector_improvements branch 2 times, most recently from f408819 to 05c4745 Compare June 5, 2024 10:21
Copy link

codecov bot commented Jun 5, 2024

Codecov Report

Attention: Patch coverage is 77.82910% with 96 lines in your changes missing coverage. Please review.

Project coverage is 67.87%. Comparing base (ce4341f) to head (87765a4).

Files Patch % Lines
...lder/internal/preprocessor/arduino_preprocessor.go 0.00% 22 Missing ⚠️
...nternal/arduino/builder/internal/detector/cache.go 72.46% 14 Missing and 5 partials ⚠️
...rnal/arduino/builder/internal/detector/detector.go 85.04% 9 Missing and 7 partials ⚠️
internal/arduino/builder/sizer.go 12.50% 6 Missing and 1 partial ⚠️
internal/arduino/builder/builder.go 82.35% 4 Missing and 2 partials ⚠️
...nal/arduino/builder/internal/preprocessor/ctags.go 62.50% 6 Missing ⚠️
internal/arduino/builder/compilation.go 86.84% 3 Missing and 2 partials ⚠️
...l/arduino/builder/internal/detector/source_file.go 88.63% 3 Missing and 2 partials ⚠️
commands/service_compile.go 72.72% 2 Missing and 1 partial ⚠️
internal/arduino/builder/internal/runner/task.go 80.00% 2 Missing and 1 partial ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2625      +/-   ##
==========================================
+ Coverage   67.67%   67.87%   +0.20%     
==========================================
  Files         234      237       +3     
  Lines       22176    22269      +93     
==========================================
+ Hits        15007    15116     +109     
+ Misses       5990     5978      -12     
+ Partials     1179     1175       -4     
Flag Coverage Δ
unit 67.87% <77.82%> (+0.20%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cmaglie cmaglie changed the title Improved compile speed by runnning multi-threaded library discovery. Improved compile speed by running multi-threaded library discovery. Jun 5, 2024
@cmaglie cmaglie force-pushed the libs_detector_improvements branch from 05c4745 to 8afa7d8 Compare June 5, 2024 12:29
@manchoz
Copy link
Contributor

manchoz commented Jun 5, 2024

LGTM

@cmaglie cmaglie force-pushed the libs_detector_improvements branch from cc15358 to 87765a4 Compare August 8, 2024 15:40
Copy link
Contributor

@alessio-perugini alessio-perugini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my POV, is okay. I tested it on some sketches and I can see the improvement.
However, let's wait for some other users to gather feedback. (Just in case some edge cases arise).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: build-process Related to the sketch build process type: enhancement Proposed improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants