
model optimization #3

Open · BioLaoXu opened this issue Jun 3, 2021 · 2 comments

BioLaoXu commented Jun 3, 2021

According to my literature research, GraphSmote is probably the only toolkit that can train graph neural networks on imbalanced data, so it is a great privilege to use it.
My learning task has the same shape as the Cora dataset: a simple binary node classification (0 = negative, 1 = positive). I have ~13,000 nodes (~550 positive, the rest negative), each with 350 features. I rewrote the Cora data_loader function for my dataset, and it works successfully (a sketch of this kind of adaptation is shown below).
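For reference, here is a minimal sketch of such an adaptation, assuming the features, edges, and labels are stored as .npy files. The function name, file names, and split ratios are hypothetical; the return signature (adj, features, labels, idx_train, idx_val, idx_test) follows the common Cora-loader convention, so check it against the repository's utils.py before reuse.

import numpy as np
import scipy.sparse as sp
import torch

def load_my_data(feat_path='features.npy', edge_path='edges.npy', label_path='labels.npy'):
    features = np.load(feat_path)   # (num_nodes, 350) float feature matrix
    labels = np.load(label_path)    # (num_nodes,) with 0 = negative, 1 = positive
    edges = np.load(edge_path)      # (num_edges, 2) integer node-id pairs

    n = features.shape[0]
    adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                        shape=(n, n), dtype=np.float32)
    # symmetrize the adjacency matrix, as the classic Cora loader does
    adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

    # plain random 60/20/20 split; with only ~550 positives out of ~13,000
    # nodes, a stratified split would guarantee positives in every subset
    idx = np.random.permutation(n)
    idx_train = torch.LongTensor(idx[:int(0.6 * n)])
    idx_val = torch.LongTensor(idx[int(0.6 * n):int(0.8 * n)])
    idx_test = torch.LongTensor(idx[int(0.8 * n):])

    features = torch.FloatTensor(features)
    labels = torch.LongTensor(labels)
    return adj, features, labels, idx_train, idx_val, idx_test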
I then train the model with GraphSmote's main.py:

nohup python3 -u main.py --imbalance --no-cuda --dataset=cora --setting='embed_up' --epochs=1000 --model=GAT --up_scale=0.95 --nhid=8 --batch_size=64 --lr=0.001 --dropout=0.5 &
It works! But the model's accuracy on both the training and validation sets is very low and fluctuates below 0.6, like this:
Loading cora dataset...
edges between class 0 and class 0: 63682080.000000
edges between class 0 and class 1: 2235402.000000
edges between class 1 and class 0: 2235402.000000
edges between class 1 and class 1: 99624.000000
0-th class sample number: 12353
1-th class sample number: 500
valid current auc-roc score: 0.643200, current macro_F score: 0.333333
Epoch: 00001 loss_train: 0.7087 loss_rec: 0.7087 acc_train: 0.5000 loss_val: 0.7085 acc_val: 0.5000 time: 125.3990s
Test set results: loss= 0.7085 accuracy= 0.5000
test current auc-roc score: 0.452893, current macro_F score: 0.333333
valid current auc-roc score: 0.459200, current macro_F score: 0.333333
Epoch: 00002 loss_train: 0.7085 loss_rec: 0.7085 acc_train: 0.5000 loss_val: 0.7087 acc_val: 0.5000 time: 125.1620s
valid current auc-roc score: 0.481600, current macro_F score: 0.333333
Epoch: 00003 loss_train: 0.7083 loss_rec: 0.7083 acc_train: 0.5000 loss_val: 0.7083 acc_val: 0.5000 time: 124.7956s
valid current auc-roc score: 0.561600, current macro_F score: 0.333333
......
Epoch: 00820 loss_train: 0.6924 loss_rec: 0.6924 acc_train: 0.5500 loss_val: 0.6953 acc_val: 0.3400 time: 174.9777s
The detailed results are in the attached nohup.txt.

I have tried resetting nhid, dropout, and the learning rate, but both acc_train and acc_val stay very low and keep fluctuating. Do you have any suggestions for optimization? Looking forward to your reply, thanks!
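For anyone repeating this kind of tuning by hand, a small sweep script could automate it. This is only a sketch: it assumes main.py sits in the working directory and accepts exactly the flags used in the command above, and the value grids and log-file names are arbitrary.

import itertools
import subprocess

# try each combination of nhid, dropout, and learning rate, logging each run
for nhid, dropout, lr in itertools.product([8, 16, 32], [0.1, 0.5], [0.001, 0.005]):
    cmd = ['python3', '-u', 'main.py', '--imbalance', '--no-cuda',
           '--dataset=cora', '--setting=embed_up', '--epochs=1000',
           '--model=GAT', '--up_scale=0.95', '--batch_size=64',
           f'--nhid={nhid}', f'--dropout={dropout}', f'--lr={lr}']
    with open(f'sweep_nhid{nhid}_drop{dropout}_lr{lr}.log', 'w') as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)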

TianxiangZhao (Owner) commented:

How do you set the im_ratio? You could try setting it to 0.5 and testing the performance.
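For example, assuming im_ratio is exposed as an --im_ratio flag in main.py's argument parser (worth verifying against the code), the original command would become:

nohup python3 -u main.py --imbalance --no-cuda --dataset=cora --setting='embed_up' --epochs=1000 --model=GAT --up_scale=0.95 --im_ratio=0.5 --nhid=8 --batch_size=64 --lr=0.001 --dropout=0.5 &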

neeraja1504 commented:

I am working on a similar problem involving GATs. However, when I keep nhid at its default of 64, I get a tcmalloc error. The GAT model also takes many hours to train compared with the GCN and GraphSage models, and the accuracy stays around 55% even after more than a thousand epochs. Any thoughts on what can be done to improve the performance and reduce the training time?
