This repository has been archived by the owner on Jan 20, 2022. It is now read-only.

Should we add another type to represent an aggregated store? #473

Open
johnynek opened this issue Mar 8, 2014 · 1 comment

Comments

@johnynek
Collaborator

johnynek commented Mar 8, 2014

There are two ways to write in Summingbird: sumByKey and write. However, their semantics differ. The former builds a materialized, aggregated, readable KV store; the latter builds a stream.

Right now, we call both of these Producers. It might be better to have two different concepts. Something like:

  sumByKey(store: P#Store[K, V]): Watchable[P, K, V]

// This trait can be watched, in that you can read outputs from it,
// but you cannot guarantee that you will see all writes.
trait Watchable[P <: Platform[P], K, V] {
  def watch: Producer[P, (K, (Option[V], V))]
}

If we add scanByKey, the difference is that it is truly a stream: each input item produces an output. If you want an exact sumByKey, you could get it with scanByKey and Monoid.plus, but the point is that by asking for sumByKey you are asking for a weaker contract (which would have its own type).
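To make the two contracts concrete, here is a minimal sketch in plain Scala collections, with no Summingbird dependency. Everything in it is hypothetical and for illustration only: the object `ScanVsSum`, the helper `step`, the function `watchBatched`, and the use of a plain `plus` function in place of `Monoid.plus`. It also assumes that the pair `(Option[V], V)` means (previous value, increment). `scanByKey` emits one update per input event, while `watchBatched` pre-aggregates within each batch, so a watcher may see fewer updates than there were writes even though the final store is the same.

```scala
object ScanVsSum {
  // One observed update: key, (value before the write, increment applied).
  // Assumption: this is how the (Option[V], V) pair is interpreted.
  type Update[K, V] = (K, (Option[V], V))
  type State[K, V] = (Map[K, V], Vector[Update[K, V]])

  // Apply one increment to the store and record the update that was observed.
  private def step[K, V](state: State[K, V], kv: (K, V), plus: (V, V) => V): State[K, V] = {
    val (store, out) = state
    val (k, delta) = kv
    val prev = store.get(k)
    val next = prev.fold(delta)(plus(_, delta))
    (store.updated(k, next), out :+ ((k, (prev, delta))))
  }

  // Strong contract: every input event produces exactly one output.
  def scanByKey[K, V](events: Seq[(K, V)])(plus: (V, V) => V): State[K, V] = {
    val empty: State[K, V] = (Map.empty[K, V], Vector.empty[Update[K, V]])
    events.foldLeft(empty)((s, kv) => step(s, kv, plus))
  }

  // Weaker contract: events are summed per key inside each batch first,
  // so there is one output per key per batch, not one per event.
  def watchBatched[K, V](events: Seq[(K, V)], batchSize: Int)(plus: (V, V) => V): State[K, V] = {
    val empty: State[K, V] = (Map.empty[K, V], Vector.empty[Update[K, V]])
    events.grouped(batchSize).foldLeft(empty) { (state, batch) =>
      val deltas = batch.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2).reduce(plus)) }
      deltas.foldLeft(state)((s, kv) => step(s, kv, plus))
    }
  }

  def main(args: Array[String]): Unit = {
    val events = Seq("a" -> 1, "a" -> 2, "b" -> 5, "a" -> 3)
    val (exactStore, exactOut) = scanByKey(events)(_ + _)
    val (batchedStore, batchedOut) = watchBatched(events, 2)(_ + _)
    println(s"scanByKey:    ${exactOut.size} updates, store = $exactStore")
    println(s"watchBatched: ${batchedOut.size} updates, store = $batchedStore")
  }
}
```

With these four events, scanByKey emits four updates and watchBatched with batches of two emits three, yet both end with the same store. That is the sense in which the aggregated store offers a weaker contract to anyone watching it.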

I am not sold on this, but I do think we are missing something by failing to acknowledge the difference between a process that builds a KV store and one that is an event stream.

Interested in comments.

@ianoc
Collaborator

ianoc commented Mar 10, 2014

Do we have any other instances of Watchable? I agree that the contract of sumByKey is different from everything else, and it would be great to call this out somehow. Are there any operations we would not want to allow, or some way this would change users' behavior or expectations? Or are we just aiming for correctness in our typing? (Both seem like valid reasons.)

There is also the other issue you mentioned before, which we should perhaps consider how to deal with, or at least how to communicate to users: the Option[V] has some pretty strange properties in online processing right now when using batching. At the top of every batch it reverts to None.
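The batching behavior described above can be imitated in a few lines of plain Scala. This is only a sketch of the symptom, not of how Summingbird implements online batching: the object `BatchResetSketch` and the function `watchWithReset` are hypothetical, a plain `plus` function stands in for the semigroup, and the pair `(Option[V], V)` is assumed to mean (previous value, increment). Each batch starts from an empty view, so the first write to a key in a batch reports None as its previous value even if the key was written in an earlier batch.

```scala
object BatchResetSketch {
  // Emits (key, (previous value, increment)) for every event, but the
  // "previous value" only reflects writes seen within the current batch.
  def watchWithReset[K, V](events: Seq[(K, V)], batchSize: Int)(plus: (V, V) => V): Vector[(K, (Option[V], V))] =
    events.grouped(batchSize).toVector.flatMap { batch =>
      // Fresh, empty view at the top of every batch: earlier batches are not consulted.
      val (_, out) = batch.foldLeft((Map.empty[K, V], Vector.empty[(K, (Option[V], V))])) {
        case ((seen, acc), (k, delta)) =>
          val prev = seen.get(k)
          val next = prev.fold(delta)(plus(_, delta))
          (seen.updated(k, next), acc :+ ((k, (prev, delta))))
      }
      out
    }

  def main(args: Array[String]): Unit = {
    val events = Seq("a" -> 1, "a" -> 2, "a" -> 3, "a" -> 4)
    watchWithReset(events, 2)(_ + _).foreach(println)
  }
}
```

With four writes to the same key and batches of two, the third update reports None as its previous value, which is the surprise a user watching the stream would run into.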

snoble pushed a commit to snoble/summingbird that referenced this issue Sep 8, 2017