-
Notifications
You must be signed in to change notification settings - Fork 13
Qi Meeting Apr 26 2024
Qi Meeting Apr 26 2024
Adjacent meetings: Previous | Up | Next
We discussed the implementation of the take
transformer in Qi's stream fusion optimization. We also reviewed our current test coverage.
As part of Qi 4, we released an optimizing compiler including the stream fusion optimization for sequences of functional list operations. The initial release included fusing map
, filter
, foldl
, foldr
, range
and car
. Recently, Dominik has been extending fusion to all applicable interfaces in racket/list
. take
presented some challenges.
In our implementation of stream fusion, we divided interfaces into three classes:
- Producers
- Transformers
- Consumers
For instance, range
is a producer because it is the source of the elements that are functionally processed by subsequent stages. map
and filter
are transformers because they translate one stream to another. foldl
and foldr
are consumers, because they compute some value from an input stream.
Now, consider take
, which operates the following way:
(take (range 100) 5) ;=> '(0 1 2 3 4)
Which category does take
belong in?
Typically, transformers like map
and filter
do not have an internal notion of state since they simply process one element at a time without needing any additional context. Yet, take
would be an exception as it would need to keep track of state since it needs to know how many elements remain to be "taken." The operation is informed not just by the current element in the stream but also on what has happened before. So far, it is only consumers that keep track of state, by means of a loop.
So, perhaps take
should be a consumer.
As a consumer, take
would process an input stream and produce a list as a value.
It would work, but consider this case:
(car (map sqr (filter odd? (take (range 100000000000) 100000))))
Naively, this processes a huge list and ultimately returns a single value at the head of the resulting list after a series of transformations. By employing stream fusion, we would still need to fully construct the (large) intermediate output of the take
consumer before fusing the subsequent filter
, map
and car
stages.
We should be able to do better. take
consumes a list and produces a list. Perhaps it should be a transformer after all, transforming one stream into another.
To understand how take
should be implemented as a stream component that manages state, we considered its basic (ordinary) implementation. It would essentially be a simple loop with two accumulators -- one to keep track of the remaining input and one to keep track of the result. Something like:
(define (take n vs)
(if (= 0 n)
null
(cons (car vs)
(take (sub1 n) (cdr vs)))))
We considered three possible approaches.
The intuition behind this approach is: can we transform an entire functional sequence into a single loop?
For functions that contain a loop, the Chez Scheme compiler will simply leave arguments in registers, instead of allocating stack space.
Indeed, Racket's for
forms take advantage of this by ultimately compiling to for/foldr/derived
which in turn compiles to a big driver loop (i.e. a named let
) with many fragments that come from various specifications in various surface for
forms. For instance, outer expressions may contribute bindings, loop expressions may contribute accumulators, and sequences like in-list
and in-range
compile to :do-in
which likewise contributes a fragment to the for/foldr/derived
.
We could emulate this, so that each stream component, such as map
or take
, contribute fragments to an underlying driver loop, either directly or perhaps via :do-in
and for/foldr/derived
.
Another approach is to continue with the continuation-passing approach but, in order to account for the new sequence in take
, mutate the state by using set!
. The use of mutation seemed undesirable and we wondered whether it might block optimizations in the Chez Scheme compiler, but felt that it probably would be OK. This is the approach Dominik has already implemented. We also did some googling and found that Clojure uses this approach in its implementation of "transducers."
Yet another approach would be to define a kind of type by annotating the state with some data (such as a position in the list), and then unwrapping it each time in subsequent stages (such as in a producer) before re-wrapping it if necessary.
This seemed like it might incur a performance cost, as it's likely that the Chez optimizer won't optimize away the pairs, and there might therefore be a performance hit from garbage collection. It's unclear if we could avoid the cost of cons cell construction here.
But we did learn that this kind of "tupling and untupling" is how Haskell does it in its implementation of fusion.
We agreed that it would be necessary to try out these different approaches and assess their performance before deciding on an implementation.
There may also be papers out there on implementing fusion using the continuation-passing style that could be useful references.
We reviewed lines that are not covered by tests today. We expect some of them to become covered now that we are expanding the set of producers, transformers and consumers. But for one of them, the issue is the "arity fast tracking" in the Racket runtime, so that the function isn't even called during execution if the arity doesn't match. Thus, it is impossible for test execution to "cover" it.
Since it is the body of the function that isn't covered, we felt that changing:
(λ (v) v)
to
identity
… might address this case.
(Some of these are carried over from last time)
- Decide on a release model, document it in the user docs, and announce the performance regression (and the remedy) to users.
- Improve unit testing infrastructure for deforestation.
- Discuss and work out Qi's theory of effects and merge the corresponding PR.
- Schedule a discussion for the language composition proposal and implement a proof of concept.
- Decide on appropriate reference implementations to use for comparison in the new benchmarks report and add them.
- Deforest other
racket/list
APIs viaqi/list
- Decide on whether there will be any deforestation in the Qi core, upon
(require qi)
(without(require qi/list)
) - Review and merge the fixes for premature termination of compiler passes.
- Continue investigating options to preserve or synthesize the appropriate source syntax through expansion for blame purposes.
Dominik, Michael, Sid
Home | Developer's Guide | Calendar | Events | Projects | Meeting Notes