Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

mejrpete · 2020-10-02T13:57:11Z

NOTE: this is submitted as a PR to devel, but given the scope of the changes (and some of the existing hacks) it would probably be better for this to be submitted as a PR to a new feature branch dedicated to this refactor (which I think would have to be added to the upstream and then this PR would have to be re-submitted against that branch).

As discussed in #91. This implementation is not the full SMT-like representation that we eventually want. Rather, I implemented a wrapper class FormulaTerm which allows Formula objects to be wrapped as Terms in such a way that Boolean-valued Formula objects can be automatically cast as 0/1 numeric values for arithmetic, etc.

The primary motivation for this refactor is to allow for the common patterns used in RDDL (which commonly includes numeric arithmetic over Boolean-valued state and action fluents).

This implementation

Changes the Boolean sort to be 0/1 valued and a child of the Naturals sort when the Arithmetic theory is included in a language (and a standalone child of Object otherwise)
Adds the FormulaTerm wrapper class, which is used as a container for Formula objects that must be treated as Term objects for arithmetic, etc.
All of Predicate, Formula, Term, Function objects are now associated with a FirstOrderLanguage (previously only Function and Term objects had a language property). Languages are inherited up from "subterms", when, for example, a CompoundFormula is constructed. This is essential for being able to construct FormulaTerm wrappers when needed, since Term objects always need an associated language.
Implements a set of tests (primarily focused on RDDL use cases) that adds to the existing test suite.

While the FormulaTerm wrapper strategy for interop between Term and Formula objects falls short of the full SMT-like representation that we want, I think it is progress in the right direction. Additionally, this strategy, along with a few notable hacks that need to be addressed allowed me to almost entirely preserve the public tarski API (which also allowed for use of the existing test suite during the refactor without a ton of changes).

The following are the key "hacks" that absolutely need to be addressed

Contradiction and Tautology objects now require a FirstOrderLanguage as a parameter when they are constructed. This is due to the fact that Formula objects with Contradiction and Tautology objects as subterms must be able to extract their own FirstOrderLanguage from their subterms. This breaks the common direct use of Tautology and Contradiction where they are not associated with a language. From a mathematical perspective, it is a bit weird to force Tautology and Contradiction to be associated with a FOL, since there is no need formally. This is something that bears more discussion.
As a result of the changes to Tautology, it was no longer possible to construct non-conditional FSTRIPS *Effect objects with Tautology as their condition, as *Effect objects are built outside the scope of a FOL. As a quick hack to make things work, I introduced a Pass class, which behaves exactly as Tautology is supposed to in this case, but without a language. This is a pretty terrible hack, but allowed me to work around the issue without making major API changes to FSTRIPS *Effect construction before this PR was filed.
In order to avoid breaking anything in testing, etc I retained the Boolean theory. It no longer does anything, since the Boolean sort is always attached to a FOL, either as a child of Naturals or Object depending on whether or not there are numeric sorts attached to the language. This is more minor than the above two issues, but is included to motivate a discussion on which API components are flexible in the full SMT-like representation moving forward.
Formula (and related) are no longer hashable, and must be treated the same way as Term objects where you use symref for use in associative containers.

These hacks were included specifically because I wanted to avoid arbitrarily changing the public API for tarski without discussion. The following are the API components that caused difficulty, and I believe (although I may be wrong) will require modification to avoid these kinds of hacks moving forward.

FSTRIPS *Effect construction outside the context of a FOL (while using something like Tautology as a condition if they are not conditional effects)
Tautology and Contradiction objects needing a FOL passed in at construction. (This is a change that was made, but is a little gross).

Finally, I have also included tests that demonstrate appropriate interoperability between Predicate/Formula objects and Function/Term objects through the FormulaTerm wrapper class --- meaning that (as currently tested) numeric interpretation of Boolean-valued objects works as expected for all basic arithmetic, if then else, and sumterm operators. I've also added a large integration test in tests/io/test_rddl_writer.py that uses a significant chunk of the new functionality to generate one of the academic-advising RDDL domain instances used in IPPC-2018 (instance 1).

This refactor is passing all tests that I was able to get working in my local configuration (which includes the base tests and all the benchmark tests). I have not been able to run tests associated with the sdd extra or which require the gringo binary for ASP reachability testing. Finally, I disabled one test related to shorthands for Contradiction and Tautology that no longer makes sense in context of the changes to those classes.

I fully admit that portions of this are really ugly! My hope is that we can discuss what is allowed to change in the API (or find better workarounds than I found) that will allow me to cut out the hacks that were used to make all of this work. Additionally, I still think that moving to a full SMT-like representation is strictly better than the FormulaTerm wrapper strategy used here (although I hope that this version could be used as the starting point for that refactor, since some of the changes --- including those made to the Boolean sort, would need to be made regardless!)

…akes the Boolean types a core builtin

…with wrappers

… to reflect API changes to FSTRIPS *Effect(s)

… domain lists in rddl instance

…test (passing)

…y`" workaround This is a bit of an ugly workaround, but it will work for now. we have a special `Pass` type that is EXACTLY only ever used when we need to construct or check against an FSTRIPS effect that is not a conditional effect. Previously, the condition had been set as a default parameter to `Tautology` for any regular FSTRIPS effect. However, since *Effects are built outside of the context of a language, this did not work after we needed a language for Tautology and Contradiction (so that other Formula types could inherit them). There are a few more permanent approaches to this. We could either break the API and build Effects in the context of a language, or we could figure out another way to have a universal Tautology that somehow does not need to be in the context of a language. Notably this problem will STILL BE AN ISSUE if we go to a fully boolean-valued Function refactor (eliminating Predicates, etc), rather than the current partial refactor that involves wrappers between Formula and Term.

…leans

…devel

miquelramirez · 2020-10-05T11:28:00Z

Hi @mejrpete,

cheers for the PR, mate.

Miquel.

…ult to pre-2018

…lean sort being attached to the language

codecov-io · 2020-10-19T18:14:07Z

Codecov Report

Merging #95 into devel will increase coverage by 0.15%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            devel      #95      +/-   ##
==========================================
+ Coverage   98.30%   98.46%   +0.15%     
==========================================
  Files          47       48       +1     
  Lines        3014     3257     +243     
  Branches      114      114              
==========================================
+ Hits         2963     3207     +244     
+ Misses         41       40       -1     
  Partials       10       10

Impacted Files	Coverage Δ
tests/common/gridworld.py	`100.00% <ø> (ø)`
tests/fol/test_fol_accessors.py	`100.00% <100.00%> (ø)`
tests/fol/test_sorts.py	`100.00% <100.00%> (ø)`
tests/fol/test_syntax.py	`98.76% <100.00%> (-1.24%)`	⬇️
tests/fstrips/contingent/localize.py	`100.00% <100.00%> (ø)`
tests/fstrips/contingent/test_sensors.py	`100.00% <100.00%> (ø)`
tests/fstrips/hybrid/test_differential.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_fstrips_operations.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_representation.py	`100.00% <100.00%> (ø)`
tests/fstrips/test_simplify.py	`100.00% <100.00%> (ø)`
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d57d46...b53b073. Read the comment docs.

gfrances · 2020-10-20T08:34:07Z

Hi @mejrpete, thanks for the PR!
I am wondering what's the best way for me to jump in and help a bit with this, given that I don't have a lot of time, but could perhaps put in a couple of hours here and there.
I have created a new branch against which to resubmit the PR, as you mentioned:
https://github.com/aig-upf/tarski/tree/rddl-refactor

Maybe you could give me write access to mejrpete/tarski and we both work on the same branch;
or I can give you access here and we both work on a PR against devel - whichever you prefer?

Cheers,

BTW don't worry too much about the sdd tests, it's a relatively dead part of the code which I'm not sure it's worth maintaining for much longer.

mejrpete · 2020-10-20T15:55:33Z

@gfrances Understood completely on time availability! I think I'm now generally familiar enough with the tarski codebase that direct questions and discussion on design decisions should be more than enough most of the time (with other help as bonus!).

If it's ok with you, I think my having write access to the feature branch here is probably cleaner in terms of version history, etc (and would mean that the current project I'm working on with @emilkeyder-invitae could point to the feature branch on the tarski upstream in the meantime, rather than my fork, which is a nice bonus). It also means that we can more cleanly track issues on the main tarski repo rather than my fork --- which seems nice from a centralization standpoint! That would make the history and issue relationship track more like feature development on a project, rather than a big PR from outside.

If that's ok with you, I'll resubmit this PR (no longer as a draft) to the new branch, and then we can start working there. I'll close this PR in the meantime and should have the new one submitted before tomorrow!

Thanks!

gfrances · 2020-10-20T17:40:03Z

Sure, let's go with that

mejrpete and others added 11 commits September 17, 2020 16:24

implements basic functionality for FormulaTerm wrapper function and m…

4d71d52

…akes the Boolean types a core builtin

completes partial implementation of Term and Formula type conversion …

6fbd18e

…with wrappers

implements major refactor components. includes modifications to tests…

edffd3b

… to reflect API changes to FSTRIPS *Effect(s)

fixes bug where multiple writes without reset would dump repeated obj…

6aa8c21

… domain lists in rddl instance

implements basic RDDL writer integration test & makes it pass

47a1bb5

completes implemenation of academic_advising rddl writer integration …

27ed994

…test (passing)

updates tests related to Term/Formula interop and arithmetic with Boo…

ea35670

…leans

Merge branch 'devel' into rddl-refactor

25e0c12

makes some code style fixes

527afa2

re-enables a test that is now passing again after pull from upstream …

741a7d3

…devel

miquelramirez added contribution Contributions to Tarski enhancement Enhances existing features labels Oct 5, 2020

mejrpete added 3 commits October 7, 2020 13:58

adds option for 2018-style RDDL file format writing -- maintains defa…

00ad4cc

…ult to pre-2018

adds condition to avoid adding :numeric-fluents simply due to the Boo…

5a61949

…lean sort being attached to the language

fixes stdout name issue on macos

b53b073

mejrpete closed this Oct 20, 2020

mejrpete mentioned this pull request Oct 27, 2020

Implements partial refactor to allow for Boolean & Numeric interop (via FormulaTerm wrapper) #96

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

mejrpete commented Oct 2, 2020 •

edited

Loading

miquelramirez commented Oct 5, 2020

codecov-io commented Oct 19, 2020

gfrances commented Oct 20, 2020

mejrpete commented Oct 20, 2020 •

edited

Loading

gfrances commented Oct 20, 2020

Implements partial refactor to allow for Boolean & Numeric interop (via FormulaTerm wrapper) #95

Implements partial refactor to allow for Boolean & Numeric interop (via FormulaTerm wrapper) #95

Conversation

mejrpete commented Oct 2, 2020 • edited Loading

miquelramirez commented Oct 5, 2020

codecov-io commented Oct 19, 2020

Codecov Report

gfrances commented Oct 20, 2020

mejrpete commented Oct 20, 2020 • edited Loading

gfrances commented Oct 20, 2020

Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

Implements partial refactor to allow for Boolean & Numeric interop (via `FormulaTerm` wrapper) #95

mejrpete commented Oct 2, 2020 •

edited

Loading

mejrpete commented Oct 20, 2020 •

edited

Loading