Inconsistent results on the ml-1m dataset #9

enoche · 2021-07-20T04:00:49Z

Hi, thanks for sharing the code.
With your source code unmodified (dropout: 0.5, neg-weight: 0.5), I ran it on ml-1m and got the following results:
First column: Recall
Second column: NDCG

loss,loss_no_reg,loss_reg -20357.158033288044 -20357.158033288044 0.0
TopK: [10, 20, 50]
Recall@10: 0.1 NDCG@10: 0.04893161658667473
Recall@20: 0.16258278145695365 NDCG@20: 0.06466930639306495
Recall@50: 0.29817880794701984 NDCG@50: 0.09132394256584671

This is much lower than the numbers in the README:
NDCG@5, 10, 20
0.2457 0.2475 0.2656

Could you help me reproduce your results on ml-1m? Thanks.
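
For readers comparing numbers like the ones above, here is a minimal sketch of how Recall@K and NDCG@K are commonly computed for a single user with binary relevance. It is an assumption that the repository's evaluation follows this standard definition; the function names and the toy data are hypothetical.

```python
import math

def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the relevant items that appear in the top-k ranking."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)

def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example (hypothetical data): one user's ranking and held-out items.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "z"}
print(recall_at_k(ranked, relevant, 2), ndcg_at_k(ranked, relevant, 2))
```

Per-user values are usually averaged over all test users, so differences in the test split or in how users without test items are handled can shift the reported numbers.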

chenchongthu · 2021-07-20T05:10:59Z

The results
NDCG@5, 10, 20
0.2457 0.2475 0.2656
are on the ml-lcfn dataset.

To compare with LCFN, you need to use the ml-lcfn dataset, which is the same data used in Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters.

enoche · 2021-07-20T05:16:51Z

Noted with thanks. I will try ml-lcfn.

enoche · 2021-07-20T05:32:28Z

After running on ml-lcfn (with dropout: 0.5, neg-weight: 0.5), I got:

499
Updating: time=0.42
loss,loss_no_reg,loss_reg -14270.39457370924 -14270.39457370924 0.0
TopK: [10, 20, 50]
0.17474698281831363 0.24197656885939905
0.26642073600099603 0.2590238286022946
0.42833159703000406 0.30848090465690464

NDCG@10: 0.24197656885939905 < 0.2475
NDCG@20: 0.2590238286022946 < 0.2656

Not the same as reported in the README, but acceptable.
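
To put a number on "acceptable", the shortfall relative to the README figures can be computed directly. This is only a small arithmetic sketch using the values quoted in this thread; the helper name is hypothetical.

```python
# Values copied from the run above and from the README figures quoted earlier.
reported = {"NDCG@10": 0.2475, "NDCG@20": 0.2656}
reproduced = {"NDCG@10": 0.24197656885939905, "NDCG@20": 0.2590238286022946}

def relative_gap(reported_value, reproduced_value):
    """Shortfall of the reproduced value as a fraction of the reported one."""
    return (reported_value - reproduced_value) / reported_value

gaps = {name: relative_gap(reported[name], reproduced[name]) for name in reported}
for name, gap in gaps.items():
    print(f"{name}: {gap:.2%} below the README value")
```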

chenchongthu · 2021-07-20T05:34:38Z

Have you read the README carefully? What embedding size did you use? For a fair comparison, we also set the embedding size to 128, as used in the LCFN work.
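
The settings discussed so far (dropout 0.5, negative weight 0.5, embedding size 128) could be captured in an argparse configuration along these lines. This is a sketch, not the repository's actual script: `--dropout` and `--negative_weight` appear later in this thread, but the `--embed_size` flag name and the help strings for the last two options are assumptions.

```python
import argparse

def build_parser():
    """Collect the run settings mentioned in this thread in one place."""
    parser = argparse.ArgumentParser(description="Hypothetical run settings")
    parser.add_argument('--dropout', type=float, default=0.5,
                        help='dropout keep_prob')
    parser.add_argument('--negative_weight', type=float, default=0.5,
                        help='weight on non-observed entries (assumed wording)')
    # Assumed flag name; check the actual script for the real option.
    parser.add_argument('--embed_size', type=int, default=128,
                        help='embedding size, 128 to match the LCFN setting')
    return parser

# Parse an empty argument list so the defaults can be inspected anywhere.
args = build_parser().parse_args([])
print(vars(args))
```

Printing the parsed settings at the start of a run makes it easier to spot a mismatch such as a leftover default embedding size.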

enoche · 2021-07-20T05:52:11Z

Noted! Here are the results:

499
Updating: time=0.73
loss,loss_no_reg,loss_reg -15902.035453464674 -15902.035453464674 0.0
TopK: [10, 20, 50]
0.18050760514556305 0.24640283029068843
0.27622081694706263 0.26572967980513473
0.44025214134898943 0.31696308408475754

NDCG@10: 0.24640283029068843 < 0.2475
NDCG@20: 0.26572967980513473 > 0.2656

Much better!

chenchongthu · 2021-07-21T12:47:33Z

So I'm curious how the results differ from SelfCF. Could you send me a copy of your data, already split into training and test sets?

enoche · 2021-07-22T01:21:23Z

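Attached is the amazon-games data after 5-core processing.
The x_label column marks train/valid/test (0/1/2). The split follows global chronological order (the same as SelfCF). You can test it on your side and see how it performs.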

games_processed.csv
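
As an illustration of the x_label convention described above, the following sketch separates rows into train/valid/test by that column using only the standard library. Only `x_label` and its 0/1/2 meaning come from this thread; the other column names and the inline sample rows are hypothetical stand-ins for the attached file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample; the real file's other column names may differ.
SAMPLE = """user_id,item_id,timestamp,x_label
u1,i1,100,0
u1,i2,200,0
u2,i1,300,1
u2,i3,400,2
"""

# Mapping stated in the thread: 0 = train, 1 = valid, 2 = test.
SPLIT_NAMES = {"0": "train", "1": "valid", "2": "test"}

def split_by_label(csv_file):
    """Group (user, item) pairs by the split that x_label assigns them to."""
    splits = defaultdict(list)
    for row in csv.DictReader(csv_file):
        split = SPLIT_NAMES[row["x_label"]]
        splits[split].append((row["user_id"], row["item_id"]))
    return dict(splits)

splits = split_by_label(io.StringIO(SAMPLE))
for name, pairs in splits.items():
    print(name, len(pairs))
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle, for example `open(path, newline="")`.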

chenchongthu · 2021-07-22T02:05:46Z

Thanks. I did a quick run with these settings:
parser.add_argument('--dropout', type=float, default=0.5,
help='dropout keep_prob')
parser.add_argument('--negative_weight', type=float, default=0.05,

Results at epoch 100:
R@20=0.0764
R@50=0.1323
N@20=0.0367
N@50=0.0511

That looks much better than SelfCF? 🤔
R@20=0.0509
R@50=0.0913
N@20=0.0250
N@50=0.0350

enoche · 2021-07-22T02:25:15Z

Yes, that result is indeed good. Could you share the code? Thanks! Are you still using TensorFlow on your side?

chenchongthu · 2021-07-22T02:27:08Z

The code is just the code on GitHub.

enoche · 2021-07-22T02:28:35Z

OK, understood. Your earlier data used a last-one split, while this data is split along a global timeline. Does that not affect the program?
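
For reference, the two splitting schemes mentioned here can be sketched as follows. This is a generic illustration, not the preprocessing used by either project: the function names, the `(user, item, timestamp)` tuple layout, and the 0.8 training ratio are all assumptions.

```python
from collections import defaultdict

def last_one_split(interactions):
    """Hold out each user's most recent interaction as the test item."""
    by_user = defaultdict(list)
    for user, item, ts in interactions:
        by_user[user].append((ts, item))
    train, test = [], []
    for user, events in by_user.items():
        events.sort()  # oldest first
        for ts, item in events[:-1]:
            train.append((user, item, ts))
        ts, item = events[-1]
        test.append((user, item, ts))
    return train, test

def global_time_split(interactions, train_ratio=0.8):
    """Sort all interactions by time and cut once; later events go to test."""
    ordered = sorted(interactions, key=lambda event: event[2])
    cut = int(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]

# Hypothetical toy log of (user, item, timestamp) events.
log = [("u1", "i1", 1), ("u1", "i2", 5), ("u2", "i1", 2),
       ("u2", "i3", 3), ("u2", "i4", 4)]
print(last_one_split(log)[1])
print(global_time_split(log)[1])
```

With a last-one split every user has exactly one test item, whereas a global timeline split can leave some users with no test items and others with several, which is one reason metrics from the two protocols are not directly comparable.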

chenchongthu · 2021-07-22T02:29:54Z

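No, it has no effect. As long as you put the training and test sets in the directory, it runs directly.
I have also converted your data into a format I can use directly and uploaded it: https://github.com/chenchongthu/ENMF/tree/master/data/game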

enoche · 2021-07-22T02:37:28Z

OK, thank you very much!

chenchongthu · 2021-07-22T02:38:14Z

You're welcome. Feel free to reach out anytime.

enoche · 2021-07-22T03:45:10Z

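Thank you very much for your patient answers. I just checked and found that I gave you the wrong data: the file above was a small sample for testing. The full data is below.
By the way, do you have WeChat (mine is 303432874)? Could we add each other to make it easier to stay in touch? Thanks!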
games5_processed.csv

chenchongthu mentioned this issue Jul 20, 2021

Results of EHCF_Sin on ML1M chenchongthu/EHCF#13

Open
