value overwrite #20

LanceKnight · 2021-01-05T03:55:24Z

Hi, I don't quite understand the following code excerpt from get_bin_feature(r, max_natoms) in ioutils_direct.py. From importing mol_graph.py, the bond_fdim = 6, so f[1:1+bond_fdim] is f[1:7], which will be overwritten by f[-4] and f[-3], so I'm not sure what this is doing. Also, f[-4] and f[-3] seems to be checking whether two atoms are in the same molecule. I'm not sure why we need both f[-4] and f[-3] to check this. Isn't using one of them enough?

`for i in range(max_natoms):

for j in range(max_natoms):

    f = np.zeros((binary_fdim,)) # binary_fdim = 4+bond_fdim, which equals to 10 in this case

    if i >= n_atoms or j >= n_atoms or i == j:

        features.append(f)

        continue

    if (i,j) in bond_map:

        bond = bond_map[(i,j)]

        f[1:1+bond_fdim] = bond_features(bond)

    else:

        f[0] = 1.0

    f[-4] = 1.0 if comp[i] != comp[j] else 0.0

    f[-3] = 1.0 if comp[i] == comp[j] else 0.0

    f[-2] = 1.0 if n_comp == 1 else 0.0

    f[-1] = 1.0 if n_comp > 1 else 0.0

    features.append(f)`

The text was updated successfully, but these errors were encountered:

connorcoley · 2021-01-05T15:19:23Z

There is a small indexing bug that I've left in so the implementation matches what was used in the paper. See #17 (comment)

f[-4] and f[-3] carry the same information; it's just a one hot encoding of a boolean. I suspect it makes almost no practical difference in terms of model training and performance. Using one should be sufficient!

LanceKnight · 2021-01-05T16:13:40Z

I see. Would you mind pointing out which part of the paper it tries to match? I checked the main script and supplement about the attention mechanism but I didn't find texts related to the get_bin_feature(r, max_natoms) other than a formula that gives attention score. ( In the docstring of get_bin_feature(r, max_natoms), it says it is used in attention mechanism).

Speaking of the paper, it seems there are duplicate 24's in Fig1.B and I guess one of them should be 34. And I guess Fig1.C, I guess the dimension of new bond order should be 5 according to the codes?

Sorry I'm a new student and I've never published in a journal so I'm not sure if I should point those out since they're very trivial things anyway. It is a great paper by all means.

connorcoley · 2021-01-10T23:15:25Z

The precise details of the implementation (i.e., the exact indexing of the binary features) wasn't described in the text. By wanting it to match, I more meant that I didn't want the quantitative results to change at all (e.g., even by 0.01%) since then the code wouldn't match what I ran for the experiments.

You're right that in the schematic in Figure 1B, one of the atom labels should be 34. Nobody, including myself, has noticed that! In the dataset, if my memory serves me correctly, there are no new aromatic bonds created. The code is implemented slightly more generally, however, which would give an extra dimension there.

There's no harm pointing these things out! The indexing of f[-4] and f[-3] is certainly a good catch, and it shows that you're digging deep into the implementation details and doing a great job following the logic in the code. Figure schematics/cartoons like Figure 1 are less important than the technical correctness of the methods and results, but no need to apologize. I wish I had caught these things myself :)

LanceKnight · 2021-01-20T21:11:36Z

Thank you professor for the answer and encouraging words :)

Another thing I noticed was that in Fig.2, the legend says the "cross in a circle" means outer product, but in the codes, it seems the operation is more like an element-wise multiplication?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

value overwrite #20

value overwrite #20

LanceKnight commented Jan 5, 2021 •

edited

Loading

connorcoley commented Jan 5, 2021

LanceKnight commented Jan 5, 2021

connorcoley commented Jan 10, 2021

LanceKnight commented Jan 20, 2021

value overwrite #20

value overwrite #20

Comments

LanceKnight commented Jan 5, 2021 • edited Loading

connorcoley commented Jan 5, 2021

LanceKnight commented Jan 5, 2021

connorcoley commented Jan 10, 2021

LanceKnight commented Jan 20, 2021

LanceKnight commented Jan 5, 2021 •

edited

Loading