This repository has been archived by the owner on Jan 20, 2022. It is now read-only.
There are two ways to write in Summingbird: sumByKey and write. However, their semantics differ: the former builds a materialized, aggregated, readable KV store, while the latter builds a stream.
Right now, we call both of these Producers. It might be better to have two different concepts. Something like:
sumByKey(store: P#Store[K, V]): Watchable[P, K, V]

// This trait can be watched, in that you can read outputs from it,
// but you cannot guarantee that you will see all writes.
trait Watchable[P <: Platform[P], K, V] {
  def watch: Producer[P, (K, (Option[V], V))]
}
If we add scanByKey, the difference is that it is truly a stream: each input item causes an output. If you want an exact sumByKey, you could get that with scanByKey and Monoid.plus, but the point is that by asking for sumByKey you are asking for a weaker contract (which would have its own type).
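To make the two contracts concrete, here is a minimal sketch in plain Scala. It is only a model of the idea, not Summingbird code: the `Monoid` trait is a hypothetical stand-in for Algebird's, and the `scanByKey` and `sumByKey` functions below operate on an in-memory `Seq` rather than on a `Producer`.

```scala
// Hypothetical stand-in for Algebird's Monoid, so the sketch has no dependencies.
trait Monoid[V] {
  def zero: V
  def plus(a: V, b: V): V
}

object ScanVsSum {
  implicit val intMonoid: Monoid[Int] = new Monoid[Int] {
    val zero = 0
    def plus(a: Int, b: Int): Int = a + b
  }

  // Stream contract: every input item yields exactly one output item,
  // carrying the running aggregate for its key.
  def scanByKey[K, V](events: Seq[(K, V)])(implicit m: Monoid[V]): Seq[(K, V)] = {
    val (_, out) = events.foldLeft((Map.empty[K, V], Vector.empty[(K, V)])) {
      case ((state, acc), (k, v)) =>
        val next = m.plus(state.getOrElse(k, m.zero), v)
        (state.updated(k, next), acc :+ (k -> next))
    }
    out
  }

  // Store contract: only the final aggregate per key is guaranteed to be readable;
  // intermediate values are not part of the result.
  def sumByKey[K, V](events: Seq[(K, V)])(implicit m: Monoid[V]): Map[K, V] =
    events.foldLeft(Map.empty[K, V]) { case (state, (k, v)) =>
      state.updated(k, m.plus(state.getOrElse(k, m.zero), v))
    }

  val events = Seq("a" -> 1, "b" -> 2, "a" -> 3)
  val scanned = scanByKey(events) // one output per input
  val summed = sumByKey(events)   // one value per key
}
```

In this model the scan output has three elements for three inputs, while the sum has one entry per key. Taking the last scanned value per key recovers the store, which is the sense in which sumByKey asks for less.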
I am not sold on this, but I do think something is missing when we fail to acknowledge the difference between a process that builds a KV store and one that is an event stream.
Interested in comments.
Do we have any other instances of Watchable? I agree the contract of sumByKey is different from everything else, and it would be great to call this out somehow. Are there any operations we wouldn't want to allow, or some way this would change users' behavior or expectations? Or are we just aiming for correctness in our typing? (Both seem like valid reasons.)
There is also the other issue you mentioned before, which we should consider how to deal with, or at least how to communicate to users: the Option[V] has some pretty strange properties in online processing right now when using batching. At the top of every batch it reverts to None.
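The batching behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Summingbird's implementation: `watchBatch` is a hypothetical helper that starts each batch with an empty in-memory view, which is enough to reproduce the prior value resetting to None at a batch boundary.

```scala
object BatchedWatch {
  // Toy model (assumption): each batch starts from an empty view, so the
  // "prior value" of a key is None the first time the key appears in a batch,
  // even if earlier batches already wrote to that key.
  def watchBatch(batch: Seq[(String, Int)]): Seq[(String, (Option[Int], Int))] = {
    val (_, out) = batch.foldLeft(
      (Map.empty[String, Int], Vector.empty[(String, (Option[Int], Int))])
    ) { case ((state, acc), (k, delta)) =>
      val prior = state.get(k)                // value seen so far in this batch
      val next = prior.fold(delta)(_ + delta) // merge the delta into the prior
      (state.updated(k, next), acc :+ (k -> (prior -> delta)))
    }
    out
  }

  // Two batches that both write to key "a".
  val batches = Seq(Seq("a" -> 1, "a" -> 2), Seq("a" -> 5))
  val watched = batches.flatMap(watchBatch)
}
```

In this model the third event carries a prior of None even though key "a" already holds 3 from the first batch, which is the surprising property a consumer of (K, (Option[V], V)) would need to know about.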
snoble pushed a commit to snoble/summingbird that referenced this issue on Sep 8, 2017