Support calling formatters on each file separately #333

Closed
michaelpj opened this issue Jul 3, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@michaelpj

Is your feature request related to a problem? Please describe.

At work we use treefmt to call ormolu on ~2.5k Haskell files. This takes about 32 seconds.

A natural improvement would be to format those files in parallel. At the moment, that would require changing the formatter.

The alternative would be for treefmt to handle the parallelism by running the formatter on each file individually. Then the formatter doesn't need to do anything.

Describe the solution you'd like

Some way of specifying how treefmt should call the formatter. Here's one option:

  • Add a batchSize option to a formatter, with no batch size meaning "infinite"
  • Split the files to format into batches of at most batchSize and call the formatter once per batch, in parallel.

Then:

  • No batch size behaves as today
  • Batch size 1 runs the formatter on just one file each time
  • Intermediate batch sizes can be tuned on a case-by-case basis
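
As a rough illustration of the batching proposed above, here is a minimal Go sketch of the splitting step only. The function name `chunk` and the convention that a non-positive `batchSize` means "no limit" are assumptions made for this example, not an existing treefmt option or API. Dispatching each batch to the formatter in parallel is left out.

```go
package main

// chunk splits paths into consecutive batches of at most batchSize elements.
// Hypothetical convention for this sketch: batchSize <= 0 means "no limit",
// so all paths end up in a single batch (today's behaviour in the proposal).
func chunk(paths []string, batchSize int) [][]string {
	if len(paths) == 0 {
		return nil
	}
	if batchSize <= 0 || batchSize >= len(paths) {
		return [][]string{paths}
	}
	// Ceiling division gives the number of batches up front.
	batches := make([][]string, 0, (len(paths)+batchSize-1)/batchSize)
	for start := 0; start < len(paths); start += batchSize {
		end := start + batchSize
		if end > len(paths) {
			end = len(paths)
		}
		// Each batch is a sub-slice of the input; no paths are copied.
		batches = append(batches, paths[start:end])
	}
	return batches
}
```

With five paths and a batch size of 2, this yields three batches of sizes 2, 2 and 1. A batch size of 1 gives one formatter invocation per file.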

Describe alternatives you've considered

Do nothing and expect formatters to handle this themselves.

@michaelpj
Author

Note: I tried running the formatter once per file in parallel using fd, and it wasn't much faster. So maybe there is another mystery here.

@brianmcgee
Member

@michaelpj which version of treefmt are you using?

In v2 we implemented a new approach:

  • for each path determine the sequence of formatters to apply, providing us with a unique batch key
  • batch formatting tasks by the batch key until a batch reaches the batch size (currently hardcoded to 1024), at which point we fire off a goroutine that applies each formatter in sequence to the batch of paths.

The errgroup for applying the formatters is bounded by runtime.NumCPU(). With all this in mind, you should already be seeing some concurrency.
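
The approach described above can be sketched roughly as follows. This is not treefmt's actual code: the names `task`, `batchKey`, `formatAll` and the `apply` callback are hypothetical, and a `sync.WaitGroup` plus a buffered channel stands in for the errgroup so that the example needs only the standard library. The batch size is a parameter here, whereas the comment above says it is currently hardcoded to 1024.

```go
package main

import (
	"runtime"
	"strings"
	"sync"
)

// task pairs a path with the ordered formatter names that match it.
type task struct {
	path       string
	formatters []string
}

// batchKey derives a key from the formatter sequence, so paths that need
// the same formatters in the same order land in the same batch.
func batchKey(formatters []string) string {
	return strings.Join(formatters, ":")
}

// formatAll groups tasks by batch key, hands a batch to apply in its own
// goroutine once it reaches batchSize, and flushes partial batches at the
// end. At most runtime.NumCPU() batches run at once. It returns the first
// error reported by apply, if any.
func formatAll(tasks []task, batchSize int, apply func(formatters, paths []string) error) error {
	if batchSize < 1 {
		batchSize = 1 // assumption for this sketch: treat invalid sizes as 1
	}
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	sem := make(chan struct{}, runtime.NumCPU()) // bounds concurrency

	flush := func(formatters, paths []string) {
		wg.Add(1)
		sem <- struct{}{} // blocks while NumCPU batches are in flight
		go func() {
			defer wg.Done()
			defer func() { <-sem }()
			if err := apply(formatters, paths); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}()
	}

	type pending struct{ formatters, paths []string }
	batches := map[string]*pending{}
	for _, t := range tasks {
		key := batchKey(t.formatters)
		p, ok := batches[key]
		if !ok {
			p = &pending{formatters: t.formatters}
			batches[key] = p
		}
		p.paths = append(p.paths, t.path)
		if len(p.paths) >= batchSize {
			flush(p.formatters, p.paths)
			delete(batches, key) // the next path with this key starts a new batch
		}
	}
	for _, p := range batches {
		flush(p.formatters, p.paths) // leftover partial batches
	}
	wg.Wait()
	return firstErr
}
```

In this sketch, a smaller `batchSize` produces more batches and therefore more opportunities to run the formatter concurrently, at the cost of more formatter invocations.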

@brianmcgee
Member

If we made the batch size configurable, you could reduce it so that fewer paths are sent to ormolu per invocation, further improving concurrency.

@michaelpj
Author

Ah, interesting. I am indeed on 0.6.1! I'll see if I can get the newer one.

@brianmcgee
Member

I've created #334 to follow up on the batch size
