
model optimization #3

Open · BioLaoXu opened this issue Jun 3, 2021 · 2 comments

BioLaoXu commented Jun 3, 2021

According to my literature research, GraphSmote is probably the only toolkit that can train graph neural networks on imbalanced data, so it is a great privilege to use it.
My learning task has the same shape as the Cora dataset: a simple binary node classification (0 = negative, 1 = positive). I have ~13,000 nodes (~550 positive, the rest negative), each with 350 features. I rewrote the Cora data_loader function for my dataset, and it works successfully (a sketch of this kind of adaptation is shown below).
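For reference, here is a minimal sketch of such an adaptation, assuming the features, edges, and labels are stored as .npy files. The function name, file names, and split ratios are hypothetical; the return signature (adj, features, labels, idx_train, idx_val, idx_test) follows the common Cora-loader convention, so check it against the repository's utils.py before reuse.

import numpy as np
import scipy.sparse as sp
import torch

def load_my_data(feat_path='features.npy', edge_path='edges.npy', label_path='labels.npy'):
    features = np.load(feat_path)   # (num_nodes, 350) float feature matrix
    labels = np.load(label_path)    # (num_nodes,) with 0 = negative, 1 = positive
    edges = np.load(edge_path)      # (num_edges, 2) integer node-id pairs

    n = features.shape[0]
    adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                        shape=(n, n), dtype=np.float32)
    # symmetrize the adjacency matrix, as the classic Cora loader does
    adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

    # plain random 60/20/20 split; with only ~550 positives out of ~13,000
    # nodes, a stratified split would guarantee positives in every subset
    idx = np.random.permutation(n)
    idx_train = torch.LongTensor(idx[:int(0.6 * n)])
    idx_val = torch.LongTensor(idx[int(0.6 * n):int(0.8 * n)])
    idx_test = torch.LongTensor(idx[int(0.8 * n):])

    features = torch.FloatTensor(features)
    labels = torch.LongTensor(labels)
    return adj, features, labels, idx_train, idx_val, idx_test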
I then train the model with GraphSmote's main.py:

nohup python3 -u main.py --imbalance --no-cuda --dataset=cora --setting='embed_up' --epochs=1000 --model=GAT --up_scale=0.95 --nhid=8 --batch_size=64 --lr=0.001 --dropout=0.5 &
It works! But the model's accuracy on both the training and validation sets is very low and fluctuates below 0.6, like this:
Loading cora dataset...
edges between class 0 and class 0: 63682080.000000
edges between class 0 and class 1: 2235402.000000
edges between class 1 and class 0: 2235402.000000
edges between class 1 and class 1: 99624.000000
0-th class sample number: 12353
1-th class sample number: 500
valid current auc-roc score: 0.643200, current macro_F score: 0.333333
Epoch: 00001 loss_train: 0.7087 loss_rec: 0.7087 acc_train: 0.5000 loss_val: 0.7085 acc_val: 0.5000 time: 125.3990s
Test set results: loss= 0.7085 accuracy= 0.5000
test current auc-roc score: 0.452893, current macro_F score: 0.333333
valid current auc-roc score: 0.459200, current macro_F score: 0.333333
Epoch: 00002 loss_train: 0.7085 loss_rec: 0.7085 acc_train: 0.5000 loss_val: 0.7087 acc_val: 0.5000 time: 125.1620s
valid current auc-roc score: 0.481600, current macro_F score: 0.333333
Epoch: 00003 loss_train: 0.7083 loss_rec: 0.7083 acc_train: 0.5000 loss_val: 0.7083 acc_val: 0.5000 time: 124.7956s
valid current auc-roc score: 0.561600, current macro_F score: 0.333333
......
Epoch: 00820 loss_train: 0.6924 loss_rec: 0.6924 acc_train: 0.5500 loss_val: 0.6953 acc_val: 0.3400 time: 174.9777s
The detailed results are in the attached nohup.txt.

I have tried resetting nhid, dropout, and the learning rate, but both acc_train and acc_val stay very low and keep fluctuating. Do you have any suggestions for optimization? Looking forward to your reply, thanks!
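For anyone repeating this kind of tuning by hand, a small sweep script could automate it. This is only a sketch: it assumes main.py sits in the working directory and accepts exactly the flags used in the command above, and the value grids and log-file names are arbitrary.

import itertools
import subprocess

# try each combination of nhid, dropout, and learning rate, logging each run
for nhid, dropout, lr in itertools.product([8, 16, 32], [0.1, 0.5], [0.001, 0.005]):
    cmd = ['python3', '-u', 'main.py', '--imbalance', '--no-cuda',
           '--dataset=cora', '--setting=embed_up', '--epochs=1000',
           '--model=GAT', '--up_scale=0.95', '--batch_size=64',
           f'--nhid={nhid}', f'--dropout={dropout}', f'--lr={lr}']
    with open(f'sweep_nhid{nhid}_drop{dropout}_lr{lr}.log', 'w') as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)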

TianxiangZhao (Owner) commented:

How do you set the im_ratio? You could try setting it to 0.5 and testing the performance.
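For example, assuming im_ratio is exposed as an --im_ratio flag in main.py's argument parser (worth verifying against the code), the original command would become:

nohup python3 -u main.py --imbalance --no-cuda --dataset=cora --setting='embed_up' --epochs=1000 --model=GAT --up_scale=0.95 --im_ratio=0.5 --nhid=8 --batch_size=64 --lr=0.001 --dropout=0.5 &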

neeraja1504 commented:

I am working on a similar problem involving GATs. However, when I keep nhid at its default of 64, I get a tcmalloc error. The GAT model also takes many hours to train compared with the GCN and GraphSage models, and the accuracy stays around 55% even after more than a thousand epochs. Any thoughts on what can be done to improve the performance and reduce the training time?
