
Merging TuringBenchmarking into DynamicPPL #715

Open

yebai opened this issue Nov 11, 2024 · 4 comments

@yebai
Member

yebai commented Nov 11, 2024

DynamicPPL's tools for profiling and debugging models are growing.

TuringBenchmarking is very lightweight, and its scope overlaps with the growing utilities inside DynamicPPL, so it makes sense to consider merging the two. This benchmarking code would also receive better maintenance under DynamicPPL.

Extra dependencies such as ReverseDiff, Zygote, and PrettyTables can be removed or managed via weak dependencies (package extensions).
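
For illustration, a minimal sketch of what the weak-dependency route could look like in DynamicPPL's Project.toml (package names taken from the comment above; the UUIDs and extension module names below are placeholders, not the registered values):

```toml
# Hypothetical Project.toml excerpt: heavy optional packages become weak deps,
# and the corresponding extension code only loads when the user imports them.
[weakdeps]
ReverseDiff = "00000000-0000-0000-0000-000000000001"   # placeholder UUID
Zygote = "00000000-0000-0000-0000-000000000002"        # placeholder UUID
PrettyTables = "00000000-0000-0000-0000-000000000003"  # placeholder UUID

[extensions]
DynamicPPLReverseDiffExt = "ReverseDiff"
DynamicPPLZygoteExt = "Zygote"
DynamicPPLPrettyTablesExt = "PrettyTables"
```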

@penelopeysm
Member

penelopeysm commented Nov 11, 2024

I had an unsubstantiated opinion on Slack that this should be a separate repo. The main reason is to avoid the DPPL repo itself becoming a home for all sorts of things: while we might not have much 'utility' code right now, it can grow in the future, and it's harder to remove code than to add it.

I can see the point about maintainability being harder – for example having to keep the repo up to date with DPPL version releases. One way to keep us on track with this is the docs repo – if we have a doc page explaining these debugging / profiling tools (which we really should have!), and DPPL is bumped in the docs environment, then it will force us to keep the profiling repo up to date as well.

(the docs repo is actually doing a good job of forcing us to iron out stuff like Bijectors right now!)

@torfjelde
Member

Also agree with @penelopeysm here 👍

@yebai
Member Author

yebai commented Nov 12, 2024

We don't generally consider the amount of code when deciding whether to create a new repo or package. As long as the code is organised and readable, it is okay.

In reality, IIRC, the TuringBenchmarking repo was created from some small utilities that @torfjelde used in his personal workflows. Then, it was rushed into a package for a winter school so that people could use it following Julia's Pkg workflow.

The code in TuringBenchmarking should have been hosted in DynamicPPL from the beginning, in retrospect. Thus, this issue.

EDIT: My primary criteria when deciding on new packages are isolating complexity and encouraging reusability and stability. For example, developing and debugging MCMC samplers is often very challenging, so we decided to isolate them from the extra complexity of DynamicPPL.

@torfjelde
Member

> Then, it was rushed into a package for a winter school so that people could use it following Julia's Pkg workflow.

This is not strictly true. It was somewhat rushed into having its first release, but it was a module from the get-go, made so because I thought the initial approach I had been taking, i.e. putting benchmarking in benchmarks/ in DynamicPPL, wasn't a good idea. This is of course separate from adding a separate DynamicPPL.BenchmarkingTools submodule or whatever, but I just wanted to clarify the original motivation behind TuringBenchmarking.jl 👍

> The code in TuringBenchmarking should have been hosted in DynamicPPL from the beginning, in retrospect. Thus, this issue.

I guess the question is exactly how this is to be done? Do you mean to make a new submodule of DPPL that contains benchmarking tools, e.g. DynamicPPL.BenchmarkTools?

My original motivation for making the benchmarking a separate package (and also why I'm still of the opinion that this is the way to go) is:

  1. It allows us to add dependencies much more freely. For example, having several of the default AD backends loaded already means the user can just pass adbackend=[...] instead of writing all the using statements themselves (see the sketch after this list).
  2. We can add more extensive benchmarking than just logdensity computations, e.g. benchmarking a model with a SamplingContext containing a sampler from Turing.jl; @mhauru and I also briefly discussed the idea of maybe adding benchmarking of AbstractMCMC.step for samplers too.
  3. TuringBenchmarking.jl currently also holds some code for running benchmarks on Stan models, which is used for comparison against other PPLs. We can't add this to DPPL.
  4. Different release schedules.
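
As a rough illustration of point 1, here is a sketch of the kind of usage TuringBenchmarking.jl enables. The function and keyword names (benchmark_model, adbackends) and the symbol-style backend specifiers are assumptions based on the package's README and may differ between versions; the model is just a toy example.

```julia
# Sketch only: `benchmark_model` and `adbackends` are assumed names and may
# differ across TuringBenchmarking.jl versions; newer versions may expect
# ADTypes objects (e.g. AutoForwardDiff()) instead of symbols.
using Turing, TuringBenchmarking  # no `using ForwardDiff`, `using ReverseDiff`, etc. needed

@model function demo(x)
    μ ~ Normal(0, 1)
    x ~ Normal(μ, 1)
end

model = demo(1.5)

# The user only selects the backends; TuringBenchmarking already carries them
# as hard dependencies, so nothing extra has to be loaded.
results = benchmark_model(model; adbackends = [:forwarddiff, :reversediff])
```

The design point being made is that the package can afford heavyweight hard dependencies (the AD backends) precisely because it is separate from DynamicPPL.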
