Generalising the training module to accommodate varying input #20
Conversation
Assuming that the number of inputs would be different for all three sources, we need to generalise the code for …
We can include the normalisation of the variables in the neural network architecture instead of …
This is looking much better @surbhigoel77, with a few comments below.
There also appear to be a number of conflicts with the main branch, so perhaps a rebase is needed (though I suggest testing this on a new branch).
The big thing that is missing is documentation on how to use it, which would be nice to allow handing over to collaborators for future re-use. Perhaps #14 can help?
The other thing to consider is some CI, though this requires our collaborators understanding how to run formatters and linting. And I don't think there are any tests yet...?
Was there any way to check that the same results are being generated after the refactor?
# if model is not None:
#     # torch.save(model.state_dict(), 'conv_torch.pth')
#     torch.save(model.state_dict(), 'trained_models/weights_conv')
Is this dead code, or should it be wrapped in a conditional of some sort, or replaced by a comment?
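If it is meant to be live, a minimal sketch of the conditional option (reusing the model variable and save path from the commented-out lines; the message is illustrative):

import torch

# Sketch: only persist weights when training actually produced a model.
if model is not None:
    torch.save(model.state_dict(), "trained_models/weights_conv")
else:
    print("No trained model to save; skipping checkpoint.")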
Unsure as to its purpose; ask @yqsun91.
layers = []
input_size = in_ver * ilev + in_nover
for _ in range(hidden_layers):
    layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
    layers.append(nn.SiLU())
    input_size = hidden_size
layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
self.linear_stack = nn.Sequential(*layers)
A brief comment to summarise what this code is doing for those used to writing it in the previous layout might be useful here.
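For instance, the block could carry a summary comment along these lines (comments only; the logic is unchanged):

# Build the hidden stack dynamically: the first Linear maps the flattened
# input (in_ver vertical profiles of ilev levels each, plus in_nover
# scalar inputs) to hidden_size; every subsequent layer maps
# hidden_size -> hidden_size, each followed by a SiLU activation.
layers = []
input_size = in_ver * ilev + in_nover
for _ in range(hidden_layers):
    layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
    layers.append(nn.SiLU())
    input_size = hidden_size
# Final projection back to out_ver output profiles of ilev levels each.
layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
self.linear_stack = nn.Sequential(*layers)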
This file is looking much better; the net is much cleaner and the docstrings really help explain what is going on.
class EarlyStopper:
It's not immediately clear to me why this should be a class rather than a function, and whether it belongs here with the Model, or if it would be better moved to train.py with the training loop.
@surbhigoel77 agrees this should be a single function with inputs, moved to train.py.
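A minimal sketch of that refactor (the signature and defaults are assumptions, not the repository's API):

def early_stop(val_loss, best_loss, counter, patience=5, min_delta=0.0):
    """Return (stop, best_loss, counter) for the latest validation loss.

    Hypothetical stateless replacement for the EarlyStopper class; the
    training loop in train.py threads best_loss and counter through.
    """
    if val_loss < best_loss - min_delta:
        return False, val_loss, 0  # improvement: reset the patience counter
    counter += 1
    return counter >= patience, best_loss, counter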
dim_NNout = int(out_ver * ilev)
x_train = np.zeros([dim_NN, Ncol])
y_train = np.zeros([dim_NNout, Ncol])
target_var = ['UTGWSPEC', 'VTGWSPEC']
You have made the input names a variable input, but hardcoded the output variables.
Is it worth making both variable?
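One way to make the outputs configurable too (a sketch; the helper name and data layout are illustrative, not part of the current code):

import numpy as np

def stack_variables(data, var_names, ncol):
    """Hypothetical helper: stack the named variables into a
    (features, ncol) array, so inputs and targets are built the
    same configurable way."""
    return np.concatenate(
        [np.asarray(data[v]).reshape(-1, ncol) for v in var_names]
    )

# Both sides then come from caller-supplied lists:
# x_train = stack_variables(data, input_vars, Ncol)
# y_train = stack_variables(data, target_vars, Ncol)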
newCAM_emulation/main.py
I think this presents a much cleaner overview of the development, with details abstracted away into modules.
for t in range(epochs):
    if t % 2 == 0:
        print(f"Epoch {t+1}\n-------------------------------")

def train_loop(dataloader, model, loss_fn, optimizer):
Ambiguous name?
This seems to be training a single epoch i.e. a single iteration of the full training loop?
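If that reading is right, a clearer shape might be (names are suggestions only):

def train_one_epoch(dataloader, model, loss_fn, optimizer):
    """Run a single pass over the training data (one epoch)."""
    model.train()
    for X, y in dataloader:
        pred = model(X)
        loss = loss_fn(pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# The caller in train.py then reads naturally:
# for t in range(epochs):
#     train_one_epoch(train_dataloader, model, loss_fn, optimizer)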
if early_stopper.early_stop(val_loss):
    print("BREAK!")
if early_stopper.early_stop(val_loss, model):
    # print("BREAK!")
Dead code?
Or, more likely, should we be writing some output to tell the user that training has finished because the early stopping criterion was met?
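A sketch of the second option (assuming EarlyStopper exposes its patience setting as an attribute, which is a guess):

if early_stopper.early_stop(val_loss, model):
    # Tell the user why training ended rather than breaking silently.
    print(
        f"Early stopping at epoch {t + 1}: validation loss has not "
        f"improved for {early_stopper.patience} epochs."
    )
    break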
ilev : int
    Number of vertical levels.
Missing some docs for input variables.
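For example, the Parameters section could cover every constructor argument in the same numpydoc style (descriptions here are guesses from the names, to be checked):

ilev : int
    Number of vertical levels.
in_ver : int
    Number of vertically resolved input variables.
in_nover : int
    Number of scalar (non-vertical) input variables.
out_ver : int
    Number of vertically resolved output variables.
hidden_layers : int
    Number of hidden Linear + SiLU blocks in the network.
hidden_size : int
    Width of each hidden layer.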
…iles to loaddata from train.py
…train and deleted commented out code in model
Closes #19
In order to accommodate data coming from multiple sources, we need to replace the hardcoded inputs with variables whose values are extracted from the dataset during the code run.
Files that need updating:
loaddata.py
Model.py
train.py
In train.py, the .nc data files are read and the variables are extracted and normalised. This part should be taken out of train.py and kept in a separate preprocessing.py file, to keep train.py common across different data sources. train.py can then be converted into a module instead. We can also have a test notebook in the repo. A rough sketch of the proposed split is given below.
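Sketch of the proposed preprocessing.py (module and function names are placeholders, not the final design):

# preprocessing.py (proposed) -- isolate the data-source-specific steps.
import xarray as xr

def load_and_normalise(nc_files, variable_names):
    """Hypothetical entry point: read the .nc files, extract the named
    variables, and hand normalised arrays to train.py."""
    ds = xr.open_mfdataset(nc_files)
    data = {}
    for name in variable_names:
        arr = ds[name].values
        # Per-variable normalisation lives here, so train.py stays
        # identical across data sources.
        data[name] = (arr - arr.mean()) / arr.std()
    return data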