
Support Vector results #5

Open · zzFluke opened this issue Jan 24, 2020 · 12 comments

zzFluke commented Jan 24, 2020

At the moment the supported evaluation result type is Result<f64, Error>, and it will be enhanced with Advanced Number Types (Complex Numbers, Rational Numbers, Arbitrary-Precision Numbers, Big Integers + Crypto). But is it even possible to have something like Result<Vec<T>, Error>? I imagine having custom functions that can perform map or iteration operations over some data, not just reductions.

likebike (Owner) commented Jan 25, 2020

It's an interesting idea. I'll think about it.

I'm afraid of blowing up the complexity of the language... I really don't want argument-list expansion (like Python's *args and **kwargs) or vector slicing, because I feel it would push the language a bit too far beyond its typical use case.

likebike (Owner) commented Jan 25, 2020

I also feel like this kind of capability is already "sort-of supported", because custom functions can do anything. It would be easy to define a custom function "A" that stores vector results in an external storage location and returns NaN (or maybe an ID pointing into that external storage); a custom function "B" could then operate on that external location and produce the reduction as its return value.

It's quite a hack and doesn't compose well, but it might be "good enough".
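For concreteness, here's a minimal sketch of that hack using the closure-style namespace, where `store` plays the role of "A" and `sum_of` the role of "B" (both names, and the `RefCell` side-storage, are made up for illustration):

```rust
use std::cell::RefCell;

fn main() -> Result<(), fasteval::Error> {
    // Side storage for vector intermediates; the evaluator itself only
    // ever sees f64 IDs, never the vectors.
    let storage: RefCell<Vec<Vec<f64>>> = RefCell::new(Vec::new());

    let mut cb = |name: &str, args: Vec<f64>| -> Option<f64> {
        match name {
            // "A": stash the argument list as a vector, return its ID.
            "store" => {
                let mut s = storage.borrow_mut();
                s.push(args);
                Some((s.len() - 1) as f64)
            }
            // "B": look up a stored vector by ID and reduce it to a scalar.
            "sum_of" => {
                let id = *args.get(0)? as usize;
                storage.borrow().get(id).map(|v| v.iter().sum())
            }
            _ => None,
        }
    };

    // The vector [1,2,3,4] never flows through the expression; only its ID does.
    assert_eq!(fasteval::ez_eval("sum_of(store(1, 2, 3, 4))", &mut cb)?, 10.0);
    Ok(())
}
```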

If I see real-life use cases for this, I will probably consider a general implementation, like what you're suggesting.

zzFluke (Author) commented Jan 25, 2020

@likebike, thanks a lot for your comments and suggestions. Your idea of returning a reference to the vector makes sense. What I'm going to do is take your lib as a basis and try to implement a higher-level language and evaluation engine inspired by the Excel formula language, and in particular by Apache POI, but without the spreadsheet burden. I have little experience with Rust, though, so it may take a while.

likebike (Owner) commented

I added a unit test, custom_vector_funcs(), which performs some vector operations with some custom functions. You might be able to use it as an example. It might be a bit cryptic; let me know if you have questions.

zzFluke (Author) commented Jan 30, 2020

Great, thanks @likebike !

arnodb commented Jan 30, 2020

Wouldn't the following work? Add a generic param T: Add + Sub + Mul + Div and let Rust do the rest.

For the Vec case (or whatever complex type you can think of), I suppose a newtype that implements the required traits would allow matrix computations.
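Roughly like this, I mean (a sketch; `eval_binop` and `VecN` are hypothetical names, not fasteval API):

```rust
use std::ops::{Add, Div, Mul, Sub};

// Hypothetical: the evaluator's arithmetic core written against trait
// bounds instead of a concrete f64.
fn eval_binop<T>(op: char, a: T, b: T) -> T
where
    T: Add<Output = T> + Sub<Output = T> + Mul<Output = T> + Div<Output = T>,
{
    match op {
        '+' => a + b,
        '-' => a - b,
        '*' => a * b,
        _ => a / b,
    }
}

// Newtype over Vec<f64> implementing the operators element-wise.
#[derive(Clone, Debug, PartialEq)]
struct VecN(Vec<f64>);

impl Add for VecN {
    type Output = VecN;
    fn add(self, rhs: VecN) -> VecN {
        VecN(self.0.iter().zip(&rhs.0).map(|(a, b)| a + b).collect())
    }
}
// Sub, Mul, and Div would follow the same element-wise pattern.

fn main() {
    assert_eq!(eval_binop('*', 6.0, 7.0), 42.0);
    assert_eq!(VecN(vec![1.0, 2.0]) + VecN(vec![3.0, 4.0]), VecN(vec![4.0, 6.0]));
}
```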

likebike (Owner) commented Feb 1, 2020

@arnodb I thought about doing that, but I assumed it wouldn't be adequate, since I do more than those four arithmetic operations. I also do exponentiation, modulo, comparison, logarithms, rounding, trigonometry, absolute value, sign, etc., etc.

Do you know a way that I could achieve all of that just by adding more trait bounds? I guess I could define new traits for all those extra capabilities and then implement them for all the advanced number types; I just thought it was more direct to avoid that.
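For illustration, this is the kind of extra trait I mean (a hypothetical EvalNum, implemented here only for f64 as a placeholder):

```rust
// Hypothetical trait covering operations that std::ops doesn't; each
// advanced number type (complex, rational, ...) would implement it.
trait EvalNum: Sized {
    fn pow(self, exp: Self) -> Self;
    fn abs(self) -> Self;
    fn signum(self) -> Self;
}

impl EvalNum for f64 {
    fn pow(self, exp: f64) -> f64 { self.powf(exp) }
    fn abs(self) -> f64 { f64::abs(self) }
    fn signum(self) -> f64 { f64::signum(self) }
}

fn main() {
    assert_eq!(2.0_f64.pow(10.0), 1024.0); // resolved via the trait
}
```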

arnodb commented Feb 3, 2020

That gives a broader view of the problem to solve. I confess I integrated fasteval very quickly as a replacement for meval, which works but is rather old and constrained (with regard to how the context is populated with variables and functions).

First of all, exponentiation, logarithms, and trigonometry are functions that take Ts and return a T. If fasteval provides a very small core (i.e. no preset functions), then they are not a problem: another layer of code can provide the user with functions that fit the T they are working with.

I suppose the modulus falls into the same bucket as the +-*/ operations: https://doc.rust-lang.org/std/ops/trait.Rem.html. However, if one wants to work with a T that has a modulus operator and also with another T that has none, I'm not sure it is viable to maintain different parsers with different generic bounds. The other way around is rather simple but might give an impression of laziness: everything can be turned into a function. The only issue I can see is that the parser may produce an AST containing functions that do not exist for the type at eval time.
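A tiny sketch of that "everything is a function" idea (entirely hypothetical, not fasteval's API). Unsupported operations are simply absent from the table for a given T, so the failure surfaces at eval time:

```rust
use std::collections::HashMap;

// The parser emits plain function calls; whether an operation exists for a
// given T is only discovered at eval time, via a per-type function table.
type Func<T> = fn(&[T]) -> Option<T>;

fn eval_call<T>(table: &HashMap<&str, Func<T>>, name: &str, args: &[T]) -> Result<T, String> {
    match table.get(name) {
        Some(f) => f(args).ok_or_else(|| format!("bad arguments for '{}'", name)),
        None => Err(format!("'{}' is not available for this type", name)),
    }
}

fn main() {
    let mut table: HashMap<&str, Func<f64>> = HashMap::new();
    table.insert("mod", |args| Some(args.get(0)? % args.get(1)?));
    // A table for a type without a modulus operator would simply omit "mod".
    assert_eq!(eval_call(&table, "mod", &[7.0, 3.0]).unwrap(), 1.0);
    assert!(eval_call(&table, "gcd", &[8.0, 12.0]).is_err());
}
```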

Maybe the first question to ask yourself is at which point you know the type you are working with. I mean, in simple arithmetic, everything is a function (a trait to implement, if you prefer). Like Rust, the point is to check, at the very moment you know the T, that every operation involved in the parsed expression is available for that T, or to fail as early as possible.

Whether this check could be static is another big design question :-). If you want it to be static, then you need to know T before you parse expressions (so that the compiler issues the error for you) and disable parts of the parser based on that knowledge. That could turn out to be very complex to implement, because it is probably impossible to implement "something" for a type "not implementing a given trait".

This comment is too long already; I hope it gives you ideas.

likebike (Owner) commented Feb 3, 2020

> I integrated fasteval very quickly as a replacement for meval

If you have some spare time, can you please tell me if you found any part of the transition from meval to fasteval difficult? Are there things you miss that meval does better than fasteval? Have you been able to notice any performance difference from the switch?

arnodb commented Feb 3, 2020

That was awesomely easy: one single commit.

Thanks to a slightly better design, I removed an unnecessary RefCell and a context allocation.

The only thing that puzzled me was the fixed capacity of the slab. I would have expected something like a vector that can grow until there is nothing left to parse. In my case everything is parsed at startup, but I worked around that by exposing different constructors. The only issue is that it can be hard to determine what capacity is necessary.
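For reference, the fixed-capacity pattern looks roughly like this (assuming the `Slab::with_capacity` constructor; the capacity value is arbitrary):

```rust
use fasteval::Evaler; // provides .eval() on parsed expressions

fn main() -> Result<(), fasteval::Error> {
    let parser = fasteval::Parser::new();
    // Capacity is fixed at construction; an expression that needs more
    // slab entries than this fails to parse instead of reallocating.
    let mut slab = fasteval::Slab::with_capacity(16);

    let val = parser
        .parse("3 * (4 + 5)", &mut slab.ps)?
        .from(&slab.ps)
        .eval(&slab, &mut fasteval::EmptyNamespace)?;
    assert_eq!(val, 27.0);
    Ok(())
}
```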

likebike (Owner) commented Feb 3, 2020

I'm happy to see that you were able to migrate such a large project so easily.

A few notes:

> The only thing that puzzled me was the fixed capacity of the slab. I would have expected something like a vector that can grow until there is nothing left to parse.

Right, fasteval achieves most of its performance advantage by pre-allocating all of its stuff, rather than performing many small allocations. That's why the size needs to be known ahead of time. It also helps with safety: https://docs.rs/fasteval/0.2.4/fasteval/#safety

...But I can understand that this requirement is sometimes a hassle and confusing. I'll consider adding a 'GrowableSlab' or something similar, which would be slower than a pre-allocated slab but would behave more intuitively.

Thanks so much for your feedback!

arnodb commented Feb 3, 2020

HashMap vs. BTreeMap: good point. Although, in theory, HashMap is supposed to be faster for some operations at the (rather high) cost of memory consumption, whereas BTreeMap is more memory-efficient. But I think I need to refresh my knowledge about that.

Edit: https://doc.rust-lang.org/std/collections/index.html shows, as expected, O(1) for almost all operations on HashMap and O(log(n)) for almost all operations on BTreeMap. So choosing the latter, at the cost of operation times, is a matter of either ordering/determinism or memory footprint. It's not mentioned there, but a B-tree has a lower memory footprint than hashed entries.
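A quick illustration of the determinism point (plain std, nothing fasteval-specific):

```rust
use std::collections::{BTreeMap, HashMap};

fn main() {
    // BTreeMap iterates in sorted key order, so behavior is deterministic...
    let btree: BTreeMap<&str, f64> = [("y", 2.0), ("x", 1.0)].into_iter().collect();
    assert_eq!(btree.keys().collect::<Vec<_>>(), vec![&"x", &"y"]);

    // ...while HashMap's iteration order is unspecified and can vary per run.
    let hash: HashMap<&str, f64> = [("y", 2.0), ("x", 1.0)].into_iter().collect();
    let _keys: Vec<_> = hash.keys().collect(); // order not guaranteed
}
```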

parse_noclear: I read the documentation and saw that the other function (the one in the example) always clears the slab, so I looked for a good way to share the slab. It took a few minutes, thanks to IntelliJ.
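Roughly what the shared-slab pattern ended up looking like (a sketch from my reading of the docs; details may differ):

```rust
use fasteval::Evaler;

fn main() -> Result<(), fasteval::Error> {
    let parser = fasteval::Parser::new();
    let mut slab = fasteval::Slab::new();

    // parse() wipes the slab first; parse_noclear() appends, so several
    // expressions can share one slab as long as its capacity allows.
    let e1 = parser.parse_noclear("1 + 2", &mut slab.ps)?;
    let e2 = parser.parse_noclear("3 * 4", &mut slab.ps)?;

    let mut ns = fasteval::EmptyNamespace;
    assert_eq!(e1.from(&slab.ps).eval(&slab, &mut ns)?, 3.0);
    assert_eq!(e2.from(&slab.ps).eval(&slab, &mut ns)?, 12.0);
    Ok(())
}
```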

ExprNamespace vs. closure: again, IntelliJ might have made me miss something this time. I just looked at the signature of the evaluation function and started implementing the trait. I'll have a closer look soon.

None vs. panic: interesting, I need to look at that too.

Is it that much slower to use Vecs that never grow (after some point, if need be) instead of fixed-size arrays?
