
Should makeLenses generate code using record syntax by default? #986

Open · yairchu opened this issue Oct 14, 2021 · 6 comments

yairchu (Contributor) commented Oct 14, 2021

makeLenses produces code whose length is quadratic in the number of fields, and this appears to result in long compilation times.

On my M1 MacBook Pro with GHC 8.10.7 (at commit f76e271), a module with large records that compiles in 2.35 seconds takes 20.15 seconds once makeLenses is added for those records. Using record syntax makes this somewhat faster, at 15.98 seconds (see the accompanying PR): a 21% reduction in compilation time, still pretty slow but at least an improvement. I guess the reason it is still slow is that the code translates to essentially the same Core later?

(I looked into this because of a large record in real code which is slowing down my build times.)
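
Roughly, the two shapes being compared look like this when written by hand (a sketch, not the actual Template Haskell output; the three-field record and the names are illustrative):

{-# LANGUAGE RankNTypes #-}
module LensShapes where

data Big = Big { _a0 :: Int, _a1 :: Int, _a2 :: Int }

type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Shape of what makeLenses currently splices in: the constructor is both
-- matched and rebuilt field by field, so every lens mentions every field
-- and the whole splice grows quadratically with the field count.
a0Positional :: Lens' Big Int
a0Positional f (Big x0 x1 x2) = fmap (\y -> Big y x1 x2) (f x0)

-- Shape of the record-syntax variant from the accompanying PR: the lens
-- only names the field it focuses, so its size stays constant per field.
a0Record :: Lens' Big Int
a0Record f s = fmap (\y -> s { _a0 = y }) (f (_a0 s))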

phadej (Collaborator) commented Oct 14, 2021

Copying my comment from the PR.

Is it faster to compile, or does it produce smaller Core? IIRC record updates are still compiled into case expressions in Core?

AFAIK, big records just lead to quadratic code size unless you get very creative, as in https://well-typed.com/blog/2021/08/large-records/

yairchu (Contributor, Author) commented Oct 15, 2021

The resulting Core is slightly different. I changed the example to just two fields for brevity:

data Big = Big { _a0 :: Int , _a1 :: Int }

Then in both cases the Core for a0 starts with:

a0 [InlPrag=INLINE (sat-args=2)] :: Lens' Big Int
[GblId, Arity=3, Caf=NoCafRefs, Unf=OtherCon []]
a0
  = \ (@ (x0 :: * -> *))
      ($dFunctor_a0 :: Functor x0)
      (eta_B2 :: Int -> x0 Int)
      (eta1_B1 :: Big) ->

But then the two diverge; makeLenses continues with:

      case eta1_B1 of { Big x1 x2 ->
      fmap
        @ x0
        $dFunctor_a0
        @ Int
        @ Big
        (\ (y :: Int) -> BigRecord.Big y x2)
        (eta_B2 x1)
      }

while the new variant produces the following Core:

      fmap
        @ x0
        $dFunctor_a0
        @ Int
        @ Big
        (\ (y :: Int) ->
           case eta1_B1 of { Big x1 x2 ->
           BigRecord.Big y x2
           })
        (eta_B2 (case eta1_B1 of { Big x1 x2 -> x1 }))

The difference is that the regular one matches on eta1_B1 once, outside, while the new one matches on it twice, inside, so it's actually longer Core! So I don't know what exactly makes it faster to compile, then.

adamgundry commented

At a wild guess, using record syntax might save time in the type-checker because you're type-checking an expression that is constant-size rather than linear in the number of record fields? You could test this hypothesis by compiling with -ddump-timings to see where the time is being spent.

Given that the Core is larger with the new variant, I wonder if optimizing use sites might end up taking longer. And is there a semantic or runtime performance difference arising from the change?
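
One way to collect those numbers per module is to put the dump flags in an OPTIONS_GHC pragma at the top of the module with the large records (a minimal sketch using the BigRecord module from the example; -ddump-to-file is optional and just redirects the output to a dump file next to the build products):

-- Minimal sketch: enabling the per-pass timing dump for just this module.
-- -ddump-timings prints allocation and time per compiler pass;
-- -ddump-to-file writes it to a .dump-timings file instead of stderr.
{-# OPTIONS_GHC -ddump-timings -ddump-to-file #-}
module BigRecord where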

phadej (Collaborator) commented Oct 16, 2021

> Given that the Core is larger with the new variant, I wonder if optimizing use sites might end up taking longer. And is there a semantic or runtime performance difference arising from the change?

For over modifier l big: over is essentially coerce, and the Functor in question is Identity.

  • eta1_B1 = big :: Big
  • eta_B2 = modifier :: Int -> Int
-- makeLenses
   case eta1_B1 of { Big x1 x2 ->
      fmap
        @ x0
        $dFunctor_a0
        @ Int
        @ Big
        (\ (y0 :: Int) -> Big y0 x2)
        (eta_B2 x1)
      }

over modifier l big
  = case big of { Big x1 x2 ->
        (\ (y0 :: Int) -> Big y0 x2)
        (modifier x1)
      }

  =<beta-redux>
    case big of { Big x1 x2 ->
        let y0 = modifier x1
        in Big y0 x2
      }

  =<inline>
    case big of { Big x1 x2 ->
        Big (modifier x1) x2
      }
-- using records
      fmap
        @ x0
        $dFunctor_a0
        @ Int
        @ Big
        (\ (y :: Int) ->
           case eta1_B1 of { Big x1 x2 ->
           Big y x2
           })
        (eta_B2 (case eta1_B1 of { Big x1 x2 -> x1 }))

over modifier l big
  =  (\ (y :: Int) ->
           case big of { Big x1 x2 ->
           Big y x2
           })
       (modifier (case big of { Big x1 x2 -> x1 }))

  =<beta-redux>
     let y = modifier (case big of { Big x1 x2 -> x1 })
     in case big of { Big x1 x2 ->
           Big y x2
        }

  =<inline>
     case big of { Big x1 x2 ->
         Big (modifier (case big of { Big x1 x2 -> x1 })) x2
     }

  =<notice that `big` is cased on twice>
     case big of { Big x1 x2 ->
         Big (modifier x1) x2
     }

I think GHC is smart enough to figure out the double-case optimization. But I would be surprised if the use site is faster to optimize.

(The view case is simpler, as there eta_B2 is replaced with coerce and fmap @(Const r) is also just coerce.)
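
To put the derivation above back in source terms, this is the kind of use site it describes (a self-contained sketch mirroring the two-field Big from earlier in the thread):

{-# LANGUAGE TemplateHaskell #-}
module UseSite where

import Control.Lens

data Big = Big { _a0 :: Int, _a1 :: Int }
makeLenses ''Big

-- The use site whose simplification is traced above. With either TH variant,
-- GHC is expected to boil this down to a single case on the record, roughly
--   bump big = case big of Big x1 x2 -> Big (x1 + 1) x2
-- and the question is only how much simplifier work it takes to get there.
bump :: Big -> Big
bump = over a0 (+ 1)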

yairchu (Contributor, Author) commented Oct 17, 2021

> At a wild guess, using record syntax might save time in the type-checker because you're type-checking an expression that is constant-size rather than linear in the number of record fields? You could test this hypothesis by compiling with -ddump-timings to see where the time is being spent.

Results of -ddump-timings show an improvement in Renamer/typechecker here from 2019 to 456 (by the way, what unit is this? milliseconds?).

Last steps:

Stage                 Normal                            Record syntax
Renamer/typechecker   alloc=1674493960  time=2018.927   alloc=344285416   time=455.906
Desugar               alloc=382819504   time=550.085    alloc=535813632   time=805.058
Simplifier            alloc=3741509512  time=2641.438   alloc=4725170280  time=3449.629
CoreTidy              alloc=445557936   time=403.412    alloc=841371136   time=398.969
CorePrep              alloc=17176       time=0.052      alloc=17176       time=0.037
CodeGen               alloc=10154994752 time=4546.312   alloc=7060459536  time=3597.344

ekmett added a commit that referenced this issue Jul 6, 2023
Generate lenses using record syntax (#986)
RyanGlScott (Collaborator) commented

#987 adds an option to generate lenses using record syntax, but this option is disabled by default. I'll leave this issue open to discuss whether we should change the default.
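
For anyone who wants to try the opt-in path while the default is being discussed, a sketch of the usual makeLensesWith pattern follows; the generateRecordSyntax name below is a placeholder for whichever LensRules accessor #987 actually exposes (check that PR or the lens changelog for the real name):

{-# LANGUAGE TemplateHaskell #-}
module OptIn where

import Control.Lens

data Big = Big { _a0 :: Int, _a1 :: Int }

-- "generateRecordSyntax" is a placeholder, not a confirmed lens API name;
-- substitute the LensRules field added by #987.
makeLensesWith (lensRules & generateRecordSyntax .~ True) ''Big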

RyanGlScott changed the title from "Should makeLenses generate code using record syntax?" to "Should makeLenses generate code using record syntax by default?" on Jul 7, 2023