Currently, the parameters of the optimizer (i.e. learning_rate, momentum, etc.) are copied and tied to each trainable parameter in the model.
E.g. if the optimizer has a learning rate parameter named lr and the compiled graph has a parameter named l1.weight, the optimizer parameter in the graph will be named input_opt_l1.weight_0.lr and in the optimizer we will store this parameter in a Dict[Tuple[str, str], Tensor]. For the example case, that will be:
{
("l1.weight", "lr"): value
}
This means that when we want to prepare the optimizer graph for execution, we need to manually match the name of the optimizer parameter in the graph with its entry in the optimizer's parameter dictionary. E.g. match "input_opt_l1.weight_0.lr" with ("l1.weight", "lr").
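For illustration, here is a minimal sketch of the kind of name matching this forces on us. The graph-side name format and the helper are assumptions for the example, not the actual implementation:

```python
import re
from typing import Tuple

# Assumed graph-side naming scheme:
# "input_opt_<model_param_name>_<index>.<opt_param_name>",
# e.g. "input_opt_l1.weight_0.lr".
_GRAPH_NAME_RE = re.compile(r"^input_opt_(?P<param>.+)_\d+\.(?P<opt>\w+)$")

def graph_name_to_key(graph_name: str) -> Tuple[str, str]:
    """Recover the (model_param_name, opt_param_name) dict key from a
    graph-side optimizer parameter name."""
    m = _GRAPH_NAME_RE.match(graph_name)
    if m is None:
        raise ValueError(f"Unrecognized optimizer parameter name: {graph_name}")
    return m.group("param"), m.group("opt")

# e.g. graph_name_to_key("input_opt_l1.weight_0.lr") -> ("l1.weight", "lr"),
# which is then looked up in the optimizer's Dict[Tuple[str, str], Tensor].
```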
Additionally, each trainable parameter gets its own copy of the learning rate optimizer parameter, which is also not ideal.
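To make the duplication concrete, a hypothetical model with two trainable parameters ends up with one lr entry (and one graph input) per parameter, even though the value is identical:

```python
# Hypothetical: optimizer created with lr=0.01 for a model with
# trainable parameters l1.weight and l1.bias.
opt_params = {
    ("l1.weight", "lr"): 0.01,  # tied to input_opt_l1.weight_0.lr
    ("l1.bias", "lr"): 0.01,    # tied to input_opt_l1.bias_0.lr
}
```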