-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change override to override_sole_init in alf.config #1705
Conversation
cbb6a84
to
7fe31a7
Compare
7fe31a7
to
bca872b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the much cleaner changes and tests, Andrew. A few questions regarding alf config behavior.
Le
alf/config_util_test.py
Outdated
@@ -182,6 +182,63 @@ def sole_init_test_env(x): | |||
self.assertEqual(alf.get_config_value("sole_init_test_twice.x"), 1) | |||
self.assertEqual(alf.get_config_value("sole_init_test_env.x"), 1) | |||
|
|||
# Test override_config doesn't doesn't overwrite for immutable values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove one "doesn't"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Done.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_on_immutable", x=0, mutable=False) | ||
alf.override_config("override_on_immutable", x=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to raise an error on overriding immutable config? Silently ignoring override can be hard to debug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to my reply above for why this cannot be done currently without adding more state parameters. There is a logger warning that gets printed in this scenario, though it's easy for users to not see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let's color code the warning message a bright yellow, so it's harder to miss?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now it prints as red via absl.logging.warning
. Do you think yellow is better?
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this also throw error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't throw an error since override_config
allows us to set a config's value despite an earlier sole_init=True
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Maybe rename the function to alf.override_sole_config()? Just as how you renamed override to override_sole_init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
only when absolutely necessary (e.g., a teacher-student training loop, | ||
where the student must override certain configs inherited from the | ||
teacher). If the config is immutable, a warning will be declared with | ||
no changes made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of quietly ignoring the config, should we just throw error for this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason we cannot do this is because alf.pre_config
and --conf_param
set values with mutable=False
(that is how those values avoid getting overwritten). Therefore, there may be times we want to override our conf files using one of the two options and raising an error in this situation is problematic. A workaround for this was to possibly introduce a pre_configged
state variable for configs, but I thought that may be unnecessary. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. You are right. Let's go with current way for the PR.
Separately, are there some examples of these cases? It would be good to understand a bit more, and see if we can design better solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to examples of the teacher-student setup?
One is our own agent training. Let's say you train expert with ALF_SOLE_CONFIG=1
. When we train the agent with ALF_SOLE_CONFIG=1
as well, it will inevitably raise errors as agent often overwrites some of the expert's configs. Here, we can use the alf.override_config
function. Not only does this minimize config calls, but it also explicitly tells the coder which inherited calls are being overwritten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the really speedy review Le. Addressed your comments. Let me know if there are any questions or clarifications I can make.
only when absolutely necessary (e.g., a teacher-student training loop, | ||
where the student must override certain configs inherited from the | ||
teacher). If the config is immutable, a warning will be declared with | ||
no changes made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason we cannot do this is because alf.pre_config
and --conf_param
set values with mutable=False
(that is how those values avoid getting overwritten). Therefore, there may be times we want to override our conf files using one of the two options and raising an error in this situation is problematic. A workaround for this was to possibly introduce a pre_configged
state variable for configs, but I thought that may be unnecessary. Let me know what you think.
alf/config_util_test.py
Outdated
@@ -182,6 +182,63 @@ def sole_init_test_env(x): | |||
self.assertEqual(alf.get_config_value("sole_init_test_twice.x"), 1) | |||
self.assertEqual(alf.get_config_value("sole_init_test_env.x"), 1) | |||
|
|||
# Test override_config doesn't doesn't overwrite for immutable values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Done.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_on_immutable", x=0, mutable=False) | ||
alf.override_config("override_on_immutable", x=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to my reply above for why this cannot be done currently without adding more state parameters. There is a logger warning that gets printed in this scenario, though it's easy for users to not see it.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't throw an error since override_config
allows us to set a config's value despite an earlier sole_init=True
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for answering the questions. Only a few nits.
Thanks,
Le
only when absolutely necessary (e.g., a teacher-student training loop, | ||
where the student must override certain configs inherited from the | ||
teacher). If the config is immutable, a warning will be declared with | ||
no changes made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. You are right. Let's go with current way for the PR.
Separately, are there some examples of these cases? It would be good to understand a bit more, and see if we can design better solutions.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_on_immutable", x=0, mutable=False) | ||
alf.override_config("override_on_immutable", x=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe let's color code the warning message a bright yellow, so it's harder to miss?
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) | ||
with self.assertRaises(RuntimeError) as context: | ||
alf.config("override_no_affect_sole_init", x=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test that x's value is 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Maybe rename the function to alf.override_sole_config()? Just as how you renamed override to override_sole_init.
@@ -285,7 +285,11 @@ def get_env(): | |||
# A 'None' random seed won't set a deterministic torch behavior. | |||
for _ in range(PerProcessContext().ddp_rank): | |||
random_seed = random.randint(0, 2**32) | |||
config1("TrainerConfig.random_seed", random_seed, raise_if_used=False) | |||
config1( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are several other config1 calls in config_helpers.py that may need fix too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. This has been fixed for the other calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments. PTAL
@@ -285,7 +285,11 @@ def get_env(): | |||
# A 'None' random seed won't set a deterministic torch behavior. | |||
for _ in range(PerProcessContext().ddp_rank): | |||
random_seed = random.randint(0, 2**32) | |||
config1("TrainerConfig.random_seed", random_seed, raise_if_used=False) | |||
config1( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. This has been fixed for the other calls.
only when absolutely necessary (e.g., a teacher-student training loop, | ||
where the student must override certain configs inherited from the | ||
teacher). If the config is immutable, a warning will be declared with | ||
no changes made. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you referring to examples of the teacher-student setup?
One is our own agent training. Let's say you train expert with ALF_SOLE_CONFIG=1
. When we train the agent with ALF_SOLE_CONFIG=1
as well, it will inevitably raise errors as agent often overwrites some of the expert's configs. Here, we can use the alf.override_config
function. Not only does this minimize config calls, but it also explicitly tells the coder which inherited calls are being overwritten.
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_on_immutable", x=0, mutable=False) | ||
alf.override_config("override_on_immutable", x=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now it prints as red via absl.logging.warning
. Do you think yellow is better?
alf/config_util_test.py
Outdated
pass | ||
|
||
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
alf.config("override_no_affect_sole_init", x=1, sole_init=True) | ||
alf.override_config("override_no_affect_sole_init", x=2) | ||
with self.assertRaises(RuntimeError) as context: | ||
alf.config("override_no_affect_sole_init", x=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Done.
edb31e1
to
0da269c
Compare
* Rename alf.override_config to alf.override_sole_config
This PR changes the
override
flag ofalf.config
tooverride_sole_init
. The implementation before allowedalf.override_config
to override a value of a config regardless of themutable
flag. Unfortunately, this breaks the logic of setting parameters via--conf_params
oralf.pre_config
. It also causes issues in evaluator processes that may set certain config values withmutable=False
and then read from the conf file as it overwrites the desired eval setting.Therefore, this PR now changes
alf.override_config
toalf.override_sole_config
, which respects immutability. On our local hobot code, I've now verified that all the above problems have been remedied and singular config calls can now be enforced when usingALF_SOLE_CONFIG=1
, greatly reducing chance for bugs via nested / unintentional overwrites.Additional test cases for edge cases has also been provided in
config_utils_test.py
.Once this gets merged, I can submit PRs for
ALF_SOLE_CONFIG=1
compatible hobot confs.