-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom rule config (+use gear_id) #961
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #961 +/- ##
==========================================
+ Coverage 90.92% 91.06% +0.13%
==========================================
Files 49 49
Lines 7055 7051 -4
==========================================
+ Hits 6415 6421 +6
+ Misses 640 630 -10 |
Go ahead & do a rebase; a large amount of gear-related activity just landed on master. |
f0e95ce
to
b6e2319
Compare
@kofalt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brush-pass LGTM
@ryansanford That should generally be the case, yes. |
api/jobs/handlers.py
Outdated
except APINotFoundException: | ||
self.abort(400, 'Cannot find gear for alg {}, alg not valid'.format(doc['alg'])) | ||
|
||
doc['project_id'] = cid | ||
doc.setdefault('config', gear['gear']['config']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retcon: this isn't correct, see out-of-band conversation for more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kofalt is referencing that the config doc on the gear stores the default value (if available) under a default
key. I added this function in jobs.gears
that, given a gear doc and a config map, will set missing keys if a default value exists.
LGTM, with change to use new function to set defaults for submitted config. The |
b6e2319
to
a9dfd46
Compare
According to further discussions since, it seems that triggered jobs using the latest available gear for a given name is the present and desired future behavior, so I opted for not storing a gear default config on new rules. Ties in nicely with @nagem's addition, where defaults are applied as late and as unified as possible. @kofalt Please sanity check manifest -> gear intended fix. @nagem @kofalt Should I add a check upon gear version addition, which makes sure that any rules that have a custom config still validate against the new gear manifest? |
if len(gear.get('manifest', {}).get('config', {})) > 0: | ||
invocation = gear_tools.derive_invocation_schema(gear['manifest']) | ||
if len(gear.get('gear', {}).get('config', {})) > 0: | ||
invocation = gear_tools.derive_invocation_schema(gear['gear']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice catch
api/jobs/handlers.py
Outdated
@@ -531,7 +539,7 @@ def post(self): | |||
gear = get_gear(gear_id) | |||
if gear.get('gear', {}).get('custom', {}).get('flywheel', {}).get('invalid', False): | |||
self.abort(400, 'Gear marked as invalid, will not run!') | |||
validate_gear_config(gear, config_) | |||
validate_gear_config(gear, fill_gear_default_values(gear, config_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The config defaults will be filled in Queue.enqueue_job
(called after batch/id/run
). Setting it at the beginning is maybe slightly more efficient, but it will still iterate over all of the gear config keys for each batch job that is enqueued. You might have already been aware of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Deliberately not filling batch config with defaults, just using fill_gear_default_values
for validation purposes. I think it's nice that the defaults only get persisted as late as possible, as you implemented it in enqueue_job
. (This is a similar situation to rules as I see it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right Megan, but I think he's doing the right thing here. We probably need fill_gear_default_values
whenever validate_gear_config
is called.
a9dfd46
to
b8028d3
Compare
api/jobs/handlers.py
Outdated
except APINotFoundException: | ||
self.abort(400, 'Cannot find gear for alg {}, alg not valid'.format(doc['alg'])) | ||
|
||
doc['project_id'] = cid | ||
|
||
if 'config' in doc: | ||
validate_gear_config(gear, fill_gear_default_values(gear, doc['config'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called unconditionally, setting doc['config']
to {}
and calling fill_gear_default_values
on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity, in what scenario can we run into a validation error if there is no custom config set for the new rule?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kofalt Do you think unconditional validation would be preferred here, too? https://github.com/scitran/core/pull/961/files#diff-8ae88fa922271b27b6066aae46fda7abR239
api/jobs/handlers.py
Outdated
if 'alg' in updates or 'config' in updates: | ||
gear = get_gear_by_name(updates.get('alg', doc['alg'])) | ||
config_ = fill_gear_default_values(gear, updates.get('config', doc.get('config', {}))) | ||
validate_gear_config(gear, config_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think validate_gear_config
will throw an exception; maybe catch that and throw a @nagem -friendly exception for status codes & such?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kofalt The current (already friendly) exc thrown by it is APIValidationException
(422). Wondering if that's appropriate or maybe InputValidationException
is more fitting?
api/jobs/handlers.py
Outdated
@@ -531,7 +539,7 @@ def post(self): | |||
gear = get_gear(gear_id) | |||
if gear.get('gear', {}).get('custom', {}).get('flywheel', {}).get('invalid', False): | |||
self.abort(400, 'Gear marked as invalid, will not run!') | |||
validate_gear_config(gear, config_) | |||
validate_gear_config(gear, fill_gear_default_values(gear, config_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right Megan, but I think he's doing the right thing here. We probably need fill_gear_default_values
whenever validate_gear_config
is called.
api/jobs/rules.py
Outdated
@@ -199,6 +199,9 @@ def create_potential_jobs(db, container, container_type, file_): | |||
gear = gears.get_gear_by_name(alg_name) | |||
job = Job(str(gear['_id']), inputs, tags=['auto', alg_name]) | |||
|
|||
if 'config' in rule: | |||
job.config = rule['config'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the rule config have the defaults filled out, or will it need to be pushed through fill_gear_default_values
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rule config only ever contains custom values, so this still relies on fill_gear_default_values
happening later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The defaults will get filled in Queue.enqueue_job()
. Let's not use fill_defaults
when creating a job that will get enqueued through that codepath.
raml/schemas/definitions/rule.json
Outdated
"project_id": { "type": "string" }, | ||
"alg": { "type": "string" }, | ||
"name": { "type": "string" }, | ||
"config": { "type": "object" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove these five definitions, and instead just use their { "type": "string" }
/ etc directly.
We probably do this a lot in other parts of the schema, but it's a bit overboard.
raml/schemas/input/rule-update.json
Outdated
} | ||
}, | ||
"additionalProperties": false | ||
"allOf": [{"$ref": "../definitions/rule.json#/definitions/rule-input"}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nagem Confirm for me that this is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
raml/schemas/output/rule-list.json
Outdated
{ | ||
"$schema": "http://json-schema.org/draft-04/schema#", | ||
"type": "array", | ||
"items": { "$ref":"../definitions/rule.json#/definitions/rule-output" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nagem Same here?
3d3b322
to
e126ebf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Diff is a little rough, but LGTM to the best of my ability
@@ -111,7 +111,7 @@ def validate_gear_config(gear, config_): | |||
validator = Draft4Validator(ci) | |||
|
|||
try: | |||
validator.validate(config_) | |||
validator.validate(fill_gear_default_values(gear, config_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defaults will be filled when the job is spawned, so I would think we wouldn't fill them here. What is the expected behavior if a gear is updated and the update has a new default that the rule didn't explicitly set? If we fill defaults here, it won't use the updated default. It is possible they wouldn't want to use an updated default without any user action.
If I missed a discussion on this, let me know. @kofalt @ambrussimon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that it's a job rule, I would expect valid defaults to be used, rather than invalid ones from a different gear document. Really the core issue here is that fill_gear_default_values
modifies the map, it's reasonable to call this in places that you won't want to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I would either not call fill_gear_default_values
in places where you don't want to modify the document (quick fix), or (better fix) change fill_gear_default_values
to use copy.deepcopy
, then check the various callees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nice catch, that was the original intent... I totally agree, fixing right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, but I was asking if we wanted defaults persisted to the database for rule config if the user didn't explicitly send them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nagem I thought it would be best not to store the default config on the rule, and have new jobs simply use potentially updated/new defaults, filled at spawn time. I gathered that's the most useful/expected behavior for now ~ in line with how it works currently. Proper gear versioning is partially in conflict with the "auto-update" feature... At least that's what I thought during implementation. I guess auto-selecting the latest gear on the UI in a dropdown wouldn't be too complicated, either.
Anyhow, gear versioning or not, I think not storing the defaults is better. Assuming gears have sane defaults for every version, it just working out of the box after an update (at least if no custom config was set!) seems nice, and that's exactly what Thad wants to hold on to. If the user wants to hardcode the defaults, they are still free to, using the new rule config...
To be clear: we still haven't decided and acted on a solution to "what happens when someone uploads a new version of a gear whose config conflicts with an existing gear rule config", so this PR cannot be merged. |
0381acf
to
8bc87c5
Compare
I think we've at least come to the conclusion that, when attempting to upload a gear, the user should be warned of project rules that would now be in conflict with with the gear's new config. I believe most clients use |
{
"disables_project_rules": [
{
"path": "scitran/testdata",
"id": "aex"
},
{
"path": "your-group/whatever",
"id": "aex2"
},
]
} If the array is non-empty, frontends can print that as a warning before/after an upload. |
This is on hold until gear rules referencing gear id (version) is implemented |
8bc87c5
to
c163d43
Compare
@nagem Bump. Freshly rebased on top of
(Can PR separately if desired.) I feel like I'm missing something... So is there any auto-disabling/auto-version-bump feature required on the API side, or is that now completely left for the UI/clients to sort out? (Now that rules use gear_id) |
c163d43
to
837e00e
Compare
837e00e
to
e8f6d18
Compare
e8f6d18
to
71a8138
Compare
71a8138
to
da9e2b3
Compare
da9e2b3
to
56fa85e
Compare
Resolves #922
Resolves #959
Could someone with more JSONSchema experience reassure me that this looks OK?
https://github.com/scitran/core/compare/rule-config-and-validation?expand=1#diff-31c3baaff2a2d926bb13668c2bbd23bdR5
Review Checklist