Speedup Ninja backend with many extract_objects or targets #13879
base: master
Commits on Nov 20, 2024
compilers: cache the results of is_source()
is_source() is called almost 900000 times in a QEMU setup. Together with the previously added caching, this basically removes _determine_ext_objs() from the profile when building QEMU.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit f1fbed6
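As a rough illustration of the caching idea in this commit, the sketch below memoizes a suffix check with functools.lru_cache. The function name is_source_by_name and the suffix table are hypothetical and are not taken from Meson's code; the point is only that repeated calls with the same file name become dictionary lookups.

```python
import os
from functools import lru_cache

# Hypothetical suffix table for illustration; Meson's real table is larger.
SOURCE_SUFFIXES = frozenset({'c', 'cc', 'cpp', 'rs', 'f90'})


@lru_cache(maxsize=None)
def is_source_by_name(fname: str) -> bool:
    """Return True if fname has a known source suffix (result is cached)."""
    suffix = os.path.splitext(fname)[1][1:].lower()
    return suffix in SOURCE_SUFFIXES
```

The cache key is the file name string, so the approach only pays off when the same names are queried many times, as the commit message reports for QEMU.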
utils: cache build directory files
get_target_generated_sources often calls File.from_built_relative on the same file, if it is used by many sources. This is a somewhat expensive call both CPU- and memory-wise, so cache the creation of build-directory files as well.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit c1f4f6b
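A minimal sketch of what caching a factory method can look like, assuming an immutable file object. The File class below is a reduced, hypothetical stand-in (its fields and the helper from_built_file are assumptions for illustration); it shows that a cached factory returns the same object for the same path, which saves both construction time and memory.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class File:
    """Hypothetical, reduced stand-in for a build-system file object."""
    is_built: bool
    subdir: str
    fname: str

    @staticmethod
    @lru_cache(maxsize=None)
    def from_built_file(subdir: str, fname: str) -> 'File':
        return File(True, subdir, fname)

    @staticmethod
    @lru_cache(maxsize=None)
    def from_built_relative(relative: str) -> 'File':
        # Split once; later calls with the same string hit the cache.
        dirpart, fnamepart = os.path.split(relative)
        return File.from_built_file(dirpart, fnamepart)
```

Sharing instances like this is only safe because the object is immutable; a mutable object handed out from a cache could be modified by one caller and observed by another.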
ninjabackend: use File.from_built_relative()
Do not reinvent it in NinjaBackend.determine_ext_objs(), so as to use the recently added caching of the results of File.from_built_relative().

Signed-off-by: Paolo Bonzini <[email protected]>
Commit 589510f
ninjabackend: prefer "in" to regex search
Regexes can be surprisingly slow. This small change brings ninja_quote() from 12 to 3 seconds when building QEMU.

Before:

    ncalls   tottime  percall  cumtime  percall
    3734443  4.872    0.000    11.944   0.000

After:

    ncalls   tottime  percall  cumtime  percall
    3595590  3.193    0.000    3.196    0.000

Signed-off-by: Paolo Bonzini <[email protected]>
Commit f3dfac4
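The idea of preferring "in" to a regex search can be sketched as follows. This is not Meson's ninja_quote(); the function names and the set of escaped characters (dollar sign, space, colon) are simplifying assumptions. The fast version uses plain substring tests to return early in the common case where nothing needs quoting, and only falls back to the regex substitution otherwise.

```python
import re

# Characters escaped in this sketch: '$', ' ' and ':' (an assumption).
_QUOTE_RE = re.compile(r'[$ :]')


def quote_with_regex_search(text: str) -> str:
    """Baseline: decide with a regex search whether quoting is needed."""
    if not _QUOTE_RE.search(text):
        return text
    return _QUOTE_RE.sub(r'$\g<0>', text)


def quote_with_in(text: str) -> str:
    """Same result, but the common no-op case uses substring tests only."""
    if '$' not in text and ' ' not in text and ':' not in text:
        return text
    return _QUOTE_RE.sub(r'$\g<0>', text)
```

Both functions return the same strings; any speed difference depends on the input mix and should be measured with a profiler rather than assumed.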
"Inline" CompilerArgs.__iter__() into CompilerArgs.__init__(), so that list(Iterable) is replaced by the much faster list(List).

Before:

    ncalls  tottime  cumtime
    19268   0.163    3.586    arglist.py:97(__init__)

After:

    ncalls  tottime  cumtime
    18674   0.211    3.442    arglist.py:97(__init__)

Signed-off-by: Paolo Bonzini <[email protected]>
Commit 281b960
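A small sketch of the inlining described in this commit, using a hypothetical MiniArgs class rather than the real CompilerArgs. When the constructor is handed another MiniArgs, it flushes that object's pending items and copies its internal list directly, instead of going through the generic iteration protocol.

```python
from collections import deque


class MiniArgs:
    """Hypothetical, reduced stand-in for an argument list with a 'pre' queue."""

    def __init__(self, iterable=None):
        self.pre = deque()
        if isinstance(iterable, MiniArgs):
            # Inlined __iter__: flush, then copy the plain list directly.
            iterable.flush_pre()
            self._container = list(iterable._container)
        elif iterable is None:
            self._container = []
        else:
            self._container = list(iterable)

    def prepend(self, arg):
        self.pre.appendleft(arg)

    def flush_pre(self):
        if self.pre:
            self._container = list(self.pre) + self._container
            self.pre.clear()

    def __iter__(self):
        self.flush_pre()
        return iter(self._container)
```

The copy is still a new list, so the two objects do not share state after construction.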
arglist: optimize flush_pre_post(), and __iadd__() with it
Unless an argument is marked as Dedup.OVERRIDDEN, pre_flush_set and post_flush_set will always be empty and the loops in flush_pre_post() will not be doing anything interesting:

    for a in self.pre:
        dedup = self._can_dedup(a)
        if a not in pre_flush_set:
            # This just makes new a copy of self.pre
            new.append(a)
            if dedup is Dedup.OVERRIDDEN:
                # this never happens
                pre_flush_set.add(a)

    for a in reversed(self.post):
        dedup = self._can_dedup(a)
        if a not in post_flush_set:
            # Here self.post is reversed twice
            post_flush.appendleft(a)
            if dedup is Dedup.OVERRIDDEN:
                # this never happens
                post_flush_set.add(a)
    new.extend(post_flush)

In this case it's possible to avoid expensive calls and loops, instead relying as much on Python builtins as possible. Track whether any options have that flag and if not just concatenate pre, _container and post.

Before:

    ncalls  tottime  cumtime
    45127   0.251    4.530    arglist.py:142(__iter__)
    81866   3.623    5.013    arglist.py:108(flush_pre_post)
    76618   3.793    5.338    arglist.py:273(__iadd__)

After:

    ncalls  tottime  cumtime
    35647   0.156    0.627    arglist.py:160(__iter__)
    78998   2.627    3.603    arglist.py:116(flush_pre_post)
    73774   3.605    5.049    arglist.py:292(__iadd__)

The time in __iadd__ is reduced because it calls __iter__, which flushes pre and post.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit d054f7f
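The fast path described in this commit can be sketched with free functions. Everything here is a simplified assumption: the two-member Dedup enum, the can_dedup() rule (arguments starting with -W count as overridden) and the function names are made up for illustration. The slow path follows the loops quoted in the commit message; the fast path is a plain concatenation used when no argument needs the override check.

```python
from collections import deque
from enum import Enum


class Dedup(Enum):
    NO_DEDUP = 0
    OVERRIDDEN = 1


def can_dedup(arg):
    # Hypothetical rule for illustration only.
    return Dedup.OVERRIDDEN if arg.startswith('-W') else Dedup.NO_DEDUP


def flush_slow(pre, container, post):
    new, pre_flush_set = [], set()
    post_flush, post_flush_set = deque(), set()
    for a in pre:
        if a not in pre_flush_set:
            new.append(a)
            if can_dedup(a) is Dedup.OVERRIDDEN:
                pre_flush_set.add(a)
    for a in reversed(post):
        if a not in post_flush_set:
            post_flush.appendleft(a)
            if can_dedup(a) is Dedup.OVERRIDDEN:
                post_flush_set.add(a)
    # Overridden arguments in pre/post replace copies in the container.
    for a in container:
        if a not in pre_flush_set and a not in post_flush_set:
            new.append(a)
    new.extend(post_flush)
    return new


def flush(pre, container, post, needs_override_check):
    if not needs_override_check:
        # Fast path: both sets would stay empty, so just concatenate.
        return [*pre, *container, *post]
    return flush_slow(pre, container, post)
```

The flag has to be maintained wherever arguments are added; if it is ever left False while an overridden argument is pending, the fast path would silently skip deduplication.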
arglist: post is only appended to, make it a list
self.post is only ever appended to on the right-hand side. However, it is then reversed twice in flush_pre_post(), by using "for a in reversed(self.post)" and appendleft() within the loop.

It would be tempting to use appendleft() in __iadd__ to avoid the call to reversed(), but that is not a good idea because the loop of flush_pre_post() is part of a slow path. It's rather more important to use a fast extend-with-list-argument in the fast path where needs_override_check is False.

For clarity, and to remove the temptation, make "post" a list instead of a deque.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit db0cadc
backends: avoid extend() in _flatten_object_list
Accumulate into lists that are passed by the caller, thus avoiding allocations and calls to extend() on recursive extract_objects().

Signed-off-by: Paolo Bonzini <[email protected]>
Commit 2130bbb
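The accumulator pattern mentioned in this commit, sketched on a hypothetical tree of object groups (ObjGroup and both function names are assumptions, not Meson's data structures). The first version allocates a list at every recursion level and merges it into the parent with extend(); the second passes a single output list down the recursion and appends to it in place.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjGroup:
    """Hypothetical stand-in for a set of objects that may nest other sets."""
    objects: List[str] = field(default_factory=list)
    children: List['ObjGroup'] = field(default_factory=list)


def flatten_returning(group: ObjGroup) -> List[str]:
    # Allocates one list per level and copies it into the parent.
    result = list(group.objects)
    for child in group.children:
        result.extend(flatten_returning(child))
    return result


def flatten_into(group: ObjGroup, out: List[str]) -> None:
    # Appends into the caller's list; no intermediate lists are created.
    for obj in group.objects:
        out.append(obj)
    for child in group.children:
        flatten_into(child, out)
```

A caller creates the output list once, for example `out = []` followed by `flatten_into(root, out)`, and both versions produce the same ordering.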
backends: remove unused argument
The proj_dir_to_build_root argument of determine_ext_objs() is always empty, remove it.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit 3593e90
backends: avoid os.path.join in common case of _flatten_object_list
proj_dir_to_build_root is empty by default, and in fact always empty except in some cases of the VS2010 backend. Add it after the fact in flatten_object_list(), which reduces the number of os.path.join() calls.

Signed-off-by: Paolo Bonzini <[email protected]>
Commit 58a072d
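One way to picture "adding the prefix after the fact" is shown below. The helper name add_build_root_prefix is hypothetical; it takes an already flattened list of paths and only calls os.path.join() when the prefix is non-empty, so the common case does no joining at all.

```python
import os
from typing import List


def add_build_root_prefix(paths: List[str], proj_dir_to_build_root: str = '') -> List[str]:
    """Prefix every path once, after flattening; skip the work if the prefix is empty."""
    if not proj_dir_to_build_root:
        # Common case: nothing to join, return the list unchanged.
        return paths
    return [os.path.join(proj_dir_to_build_root, p) for p in paths]
```

Note that the common case returns the input list itself rather than a copy, which is fine as long as the caller does not expect a fresh list.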