Split hackage-packages.nix into multiple files #518

Open
sternenseemann opened this issue Sep 21, 2021 · 7 comments

Comments

@sternenseemann
Member

hackage-packages.nix in NixOS is ever-growing, and I think it is time to consider making hackage2nix generate one file per package plus one file that callPackages each of them. The reasons are as follows:

  • Nix's laziness would mean that upon evaluation of a single attribute, we have to parse less Nix code. Parsing hackage-packages.nix is quite expensive and it has a constant impact of over half a second on evaluating anything Haskell related.

  • It could also be beneficial for repository size. As I understand it, git is better at deduplicating lots of small files than one big file, so getting rid of the hackage-packages.nix behemoth could be helpful here, too. This is, however, based on my very limited understanding and would need to be confirmed by experimentation or by someone more knowledgeable than me.
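
One way to start probing the repository-size point is a small model of how git stores file contents. The sketch below is only that, a model: it computes blob IDs the way git does (SHA-1 over a `blob <size>` header plus the content) and counts how many bytes of new blob content a one-package update creates in each layout. The helper names and the synthetic `pkgN` entries are made up for illustration. It deliberately ignores tree objects and packfile delta compression, both of which can change the picture considerably, so it cannot replace an experiment on a real checkout.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Object ID git assigns to a blob: SHA-1 over a header plus the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def new_blob_bytes(before: dict[str, bytes], after: dict[str, bytes]) -> int:
    """Bytes of blob content in `after` that `before` did not already store."""
    known = {git_blob_id(content) for content in before.values()}
    fresh = {git_blob_id(content): len(content) for content in after.values()}
    return sum(size for oid, size in fresh.items() if oid not in known)


def layouts(packages: dict[str, str]) -> tuple[dict[str, bytes], dict[str, bytes]]:
    """Return (single-file layout, one-file-per-package layout) for the same data."""
    joined = "".join(packages[name] for name in sorted(packages))
    single = {"hackage-packages.nix": joined.encode()}
    split = {f"{name}.nix": text.encode() for name, text in packages.items()}
    return single, split


if __name__ == "__main__":
    # Synthetic stand-ins for package expressions, not real Hackage data.
    old = {f"pkg{i}": f'pkg{i} = "1.0";\n' for i in range(1000)}
    new = dict(old, pkg7='pkg7 = "1.1";\n')
    single_old, split_old = layouts(old)
    single_new, split_new = layouts(new)
    print("single file:", new_blob_bytes(single_old, single_new), "new bytes")
    print("split files:", new_blob_bytes(split_old, split_new), "new bytes")
```

In this model the single-file layout stores the whole file again for every update, while the split layout stores only the changed package. Whether that advantage survives `git gc` and the extra directory entries is exactly what would still need measuring.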

@maralorn
Member

I am confused. That idea sounds too good to be true. Also, I don't think any other auto-generated part of nixpkgs does this? There have to be reasons why this has not been done.

@sternenseemann
Member Author

I think generating a single file is simply easier and more obvious. As you can see from the ten files with the most lines in nixpkgs (line count, then path), we really are in a league of our own, so my guess is that nobody else has really had to consider this problem.

13389 ./pkgs/servers/nosql/influxdb2/influx-ui-yarndeps.nix
13805 ./pkgs/servers/jellyfin/node-deps.nix
14029 ./pkgs/applications/version-management/gitlab/yarnPkgs.nix
15450 ./pkgs/development/compilers/elm/packages/node-packages.nix
17524 ./pkgs/development/r-modules/cran-packages.nix
24655 ./pkgs/top-level/perl-packages.nix
32847 ./pkgs/top-level/all-packages.nix
36831 ./pkgs/tools/typesetting/tex/texlive/pkgs.nix
125326 ./pkgs/development/node-packages/node-packages.nix
298210 ./pkgs/development/haskell-modules/hackage-packages.nix
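
A ranking like the one above can be reproduced with a few lines of Python. This is a rough sketch, not the command that produced the list: the function names are invented here, and it counts newline characters the way `wc -l` does.

```python
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count newline characters in a file, reading it in 1 MiB chunks."""
    with path.open("rb") as handle:
        return sum(chunk.count(b"\n") for chunk in iter(lambda: handle.read(1 << 20), b""))


def largest_nix_files(root: Path, top: int = 10) -> list[tuple[int, str]]:
    """Return the `top` longest .nix files under `root`, shortest first."""
    counts = [(count_lines(p), p.as_posix()) for p in root.rglob("*.nix") if p.is_file()]
    return sorted(counts)[-top:]
```

Calling `largest_nix_files(Path("."))` from the root of a nixpkgs checkout and printing each pair gives a list in the same shape as the one above. The numbers will differ from revision to revision.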

@cdepillabout
Member

One question I have: I guess that if we do this, we'd have one file that looks like this:

pkgs/development/haskell-modules/hackage-packages.nix:

{ callPackage }:
{
  ...
  aeson = callPackage ./haskellPackages/aeson.nix {};
  ...
  conduit = callPackage ./haskellPackages/conduit.nix {};
  ...
  lens = callPackage ./haskellPackages/lens.nix {};
  ...
}

Then we'd have all our individual packages in .nix files in a directory like pkgs/development/haskell-modules/haskellPackages/.
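
To make the shape of that generated index concrete, here is a minimal sketch of a generator for it. It is written in Python purely for illustration (hackage2nix itself is not), and `attr_name` and `render_index` are hypothetical names. It emits the `{ callPackage }:` form shown above. It also quotes attribute names that are not plain Nix identifiers, which matters for package names that start with a digit or collide with a Nix keyword.

```python
import re

# Names that cannot be used as bare attribute names in a Nix attribute set.
NIX_KEYWORDS = {"assert", "else", "if", "in", "inherit", "let", "or", "rec", "then", "with"}
NIX_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_'-]*")


def attr_name(package: str) -> str:
    """Return the package name as a Nix attribute name, quoted when necessary."""
    if NIX_IDENTIFIER.fullmatch(package) and package not in NIX_KEYWORDS:
        return package
    return f'"{package}"'


def render_index(packages: list[str], directory: str = "./haskellPackages") -> str:
    """Render an index file with one callPackage line per package, sorted by name."""
    lines = ["{ callPackage }:", "{"]
    for name in sorted(packages):
        lines.append(f"  {attr_name(name)} = callPackage {directory}/{name}.nix {{}};")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

For example, `render_index(["lens", "aeson"])` yields the `aeson` and `lens` lines from the listing above, without the elided entries.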

So for example pkgs/development/haskell-modules/haskellPackages/aeson.nix would look like:

{ mkDerivation, attoparsec, base, base-compat
, base-compat-batteries, base-orphans, base16-bytestring
, bytestring, containers, data-fix, deepseq, Diff, directory, dlist
, filepath, generic-deriving, ghc-prim, hashable, hashable-time
, integer-logarithms, lib, primitive, QuickCheck, quickcheck-instances
, scientific, strict, tagged, tasty, tasty-golden, tasty-hunit
, tasty-quickcheck, template-haskell, text, th-abstraction, these
, time, time-compat, unordered-containers, uuid-types, vector
}:
mkDerivation {
  pname = "aeson";
  version = "1.5.6.0";
  sha256 = "1s5z4bgb5150h6a4cjf5vh8dmyrn6ilh29gh05999v6jwd5w6q83";
  revision = "2";
  editedCabalFile = "1zxkarvmbgc2cpcc9sx1rlqm7nfh473052898ypiwk8azawp1hbj";
  libraryHaskellDepends = [
    attoparsec base base-compat-batteries bytestring containers
    data-fix deepseq dlist ghc-prim hashable primitive scientific
    strict tagged template-haskell text th-abstraction these time
    time-compat unordered-containers uuid-types vector
  ];
  testHaskellDepends = [
    attoparsec base base-compat base-orphans base16-bytestring
    bytestring containers data-fix Diff directory dlist filepath
    generic-deriving ghc-prim hashable hashable-time integer-logarithms
    QuickCheck quickcheck-instances scientific strict tagged tasty
    tasty-golden tasty-hunit tasty-quickcheck template-haskell text
    these time time-compat unordered-containers uuid-types vector
  ];
  description = "Fast JSON parsing and encoding";
  license = lib.licenses.bsd3;
}

If this is the approach we took, then pkgs/development/haskell-modules/haskellPackages/ would have about 16,000 files in it (since there are currently about 16,000 packages on Hackage?).

Would having 16000 files in a directory cause any problems?
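
If a flat directory did turn out to be a problem, one conventional mitigation would be to shard the files into subdirectories by name prefix. Nothing in this thread proposes that, so treat the following purely as a hypothetical sketch: the two-character prefix, the lowercasing, and both function names are assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath


def sharded_path(package: str, width: int = 2) -> PurePosixPath:
    """Place a package file under a subdirectory named after its lowercased prefix."""
    shard = package[:width].lower()
    return PurePosixPath("haskellPackages") / shard / f"{package}.nix"


def shard_sizes(packages: list[str], width: int = 2) -> Counter:
    """Count how many package files would land in each shard directory."""
    return Counter(sharded_path(p, width).parent.name for p in packages)
```

Running `shard_sizes` over the real package list would show how evenly the roughly 16,000 files spread out, which helps decide whether sharding is worth the extra path complexity.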

@sternenseemann
Member Author

> Would having 16000 files in a directory cause any problems?

Yeah, that is the big question.

@expipiplus1
Contributor

expipiplus1 commented Oct 6, 2021 via email

@peti
Member

peti commented Nov 1, 2021

> Nix's laziness would mean that upon evaluation of a single attribute, we have to parse less Nix code. Parsing hackage-packages.nix is quite expensive and it has a constant impact of over half a second on evaluating anything Haskell related.

Have you actually tested whether this is true? I believe that Nix parses included files even if they are not actually needed for the evaluation. I may be wrong (or the behavior might have changed), but I guess it's a good idea to test it.
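
A small experiment along these lines could settle the question. The sketch below writes a `default.nix` whose `broken` attribute imports a file that is deliberately not valid Nix, then asks `nix-instantiate` to evaluate each attribute separately. If `ok` evaluates while `broken` fails, the imported file was not parsed just to evaluate its sibling. The file names and helper functions are invented for this sketch, and it only checks whether parsing is deferred, not how long evaluation takes.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

DEFAULT_NIX = """\
{
  ok = 1 + 1;
  # Forcing this attribute requires parsing broken.nix.
  broken = import ./broken.nix;
}
"""

# Deliberately not valid Nix: the attribute set is never closed.
BROKEN_NIX = "{ this is not valid Nix\n"


def write_experiment(directory: Path) -> Path:
    """Write default.nix and broken.nix into `directory`; return the entry file."""
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "broken.nix").write_text(BROKEN_NIX)
    entry = directory / "default.nix"
    entry.write_text(DEFAULT_NIX)
    return entry


def evaluate(entry: Path, attribute: str) -> subprocess.CompletedProcess:
    """Evaluate a single attribute of the entry file with nix-instantiate."""
    return subprocess.run(
        ["nix-instantiate", "--eval", "-A", attribute, str(entry)],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        entry = write_experiment(Path(tmp))
        if shutil.which("nix-instantiate") is None:
            print("nix-instantiate not found; files were written but not evaluated")
        else:
            for attribute in ("ok", "broken"):
                result = evaluate(entry, attribute)
                print(attribute, "exit code:", result.returncode)
```

The same setup extends to timing: generate many valid per-package files plus an index, then compare evaluating one attribute against the same attribute in a single large file.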

@sternenseemann
Member Author

> Have you actually tested whether this is true? I believe that Nix parses included files even if they are not actually needed for the evaluation. I may be wrong (or the behavior might have changed), but I guess it's a good idea to test it.

Oh, that is a good hint, I'll have to check that. Testing in general would be needed for this. For example, I'm not sure whether git performance might degrade if it has to update many individual files (in the tens of thousands) instead of a single big file…

5 participants