Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate and fix random NodeGraph test failures #274

Closed
Tracked by #310
joshuakarp opened this issue Oct 26, 2021 · 9 comments · Fixed by #310
Closed
Tracked by #310

Investigate and fix random NodeGraph test failures #274

joshuakarp opened this issue Oct 26, 2021 · 9 comments · Fixed by #310
Assignees
Labels
bug Something isn't working epic Big issue with multiple subissues r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices

Comments

@joshuakarp
Copy link
Contributor

Describe the bug

The tests in NodeGraph can intermittently fail across runs. We'd experienced some similar failures in the past, and I thought I'd resolved them. The past failures were to do with some test utility functions that generate node IDs for a specific bucket index, such that they can be used for dynamic testing. They were previously causing failures by generating node IDs that overflowed into a different bucket index.

If this is still the cause of failure, now that we can use the 32 byte array as a node ID, I can potentially make these test utilities a bit more robust. Previously, they were just incrementing a character in the node ID to force an expected XOR value.

To Reproduce

The test failures only occur randomly. It took me 5 runs of NodeGraph.test.ts to experience a failure in the tests. Some instances of these:

[nix-shell:~/Documents/MatrixAI/js-polykey]$ npm test -- tests/nodes/NodeGraph.test.ts 

> @matrixai/[email protected] test /home/josh/Documents/MatrixAI/js-polykey
> jest "tests/nodes/NodeGraph.test.ts"

Determining test suites to run...
GLOBAL SETUP
 FAIL  tests/nodes/NodeGraph.test.ts (5.69 s)
  NodeGraph
    ✓ finds correct node address (11 ms)
    ✓ unable to find node address (6 ms)
    ✓ adds a single node into a bucket (5 ms)
    ✕ adds multiple nodes into the same bucket (16 ms)
    ✓ adds a single node into different buckets (6 ms)
    ✓ deletes a single node (and removes bucket) (6 ms)
    ✕ deletes a single node (and retains remainder of bucket) (14 ms)
    ✓ enforces k-bucket size, removing least active node when a new node is discovered (83 ms)
    ✓ retrieves all buckets (in expected lexicographic order) (13 ms)
    ✓ refreshes buckets (2050 ms)
    ✓ finds a single closest node (3 ms)
    ✓ finds 3 closest nodes (8 ms)
    ✓ finds the 20 closest nodes (92 ms)
    ✓ updates node (6 ms)

  ● NodeGraph › adds multiple nodes into the same bucket

    expect(received).toEqual(expected) // deep equality

    Expected: {"address": {"ip": "6.6.6.6", "port": 6666}, "lastUpdated": Any<Date>}
    Received: undefined

      222 |         lastUpdated: expect.any(Date),
      223 |       });
    > 224 |       expect(bucket[newNode3Id]).toEqual({
          |                                  ^
      225 |         address: { ip: '6.6.6.6', port: 6666 },
      226 |         lastUpdated: expect.any(Date),
      227 |       });

      at Object.<anonymous> (tests/nodes/NodeGraph.test.ts:224:34)

  ● NodeGraph › deletes a single node (and retains remainder of bucket)

    expect(received).toEqual(expected) // deep equality

    Expected: {"address": {"ip": "6.6.6.6", "port": 6666}, "lastUpdated": Any<Date>}
    Received: undefined

      305 |         lastUpdated: expect.any(Date),
      306 |       });
    > 307 |       expect(bucket[newNode3Id]).toEqual({
          |                                  ^
      308 |         address: { ip: '6.6.6.6', port: 6666 },
      309 |         lastUpdated: expect.any(Date),
      310 |       });

      at Object.<anonymous> (tests/nodes/NodeGraph.test.ts:307:34)

Test Suites: 1 failed, 1 total
Tests:       2 failed, 12 passed, 14 total
Snapshots:   0 total
Time:        5.726 s, estimated 6 s
Ran all test suites matching /tests\/nodes\/NodeGraph.test.ts/i.
GLOBAL TEARDOWN
npm ERR! Test failed.  See above for more details.


Determining test suites to run...
GLOBAL SETUP
 FAIL  tests/nodes/NodeGraph.test.ts (6.165 s)
  NodeGraph
    ✓ finds correct node address (12 ms)
    ✓ unable to find node address (6 ms)
    ✓ adds a single node into a bucket (5 ms)
    ✕ adds multiple nodes into the same bucket (16 ms)
    ✓ adds a single node into different buckets (8 ms)
    ✓ deletes a single node (and removes bucket) (6 ms)
    ✕ deletes a single node (and retains remainder of bucket) (13 ms)
    ✓ enforces k-bucket size, removing least active node when a new node is discovered (85 ms)
    ✕ retrieves all buckets (in expected lexicographic order) (14 ms)
    ✓ refreshes buckets (1996 ms)
    ✓ finds a single closest node (3 ms)
    ✓ finds 3 closest nodes (8 ms)
    ✓ finds the 20 closest nodes (93 ms)
    ✓ updates node (6 ms)

  ● NodeGraph › adds multiple nodes into the same bucket

    expect(received).toEqual(expected) // deep equality

    Expected: {"address": {"ip": "5.5.5.5", "port": 5555}, "lastUpdated": Any<Date>}
    Received: undefined

      218 |         lastUpdated: expect.any(Date),
      219 |       });
    > 220 |       expect(bucket[newNode2Id]).toEqual({
          |                                  ^
      221 |         address: { ip: '5.5.5.5', port: 5555 },
      222 |         lastUpdated: expect.any(Date),
      223 |       });

      at Object.<anonymous> (tests/nodes/NodeGraph.test.ts:220:34)

  ● NodeGraph › deletes a single node (and retains remainder of bucket)

    expect(received).toEqual(expected) // deep equality

    Expected: {"address": {"ip": "5.5.5.5", "port": 5555}, "lastUpdated": Any<Date>}
    Received: undefined

      301 |         lastUpdated: expect.any(Date),
      302 |       });
    > 303 |       expect(bucket[newNode2Id]).toEqual({
          |                                  ^
      304 |         address: { ip: '5.5.5.5', port: 5555 },
      305 |         lastUpdated: expect.any(Date),
      306 |       });

      at Object.<anonymous> (tests/nodes/NodeGraph.test.ts:303:34)

  ● NodeGraph › retrieves all buckets (in expected lexicographic order)

    expect(received).toBe(expected) // Object.is equality

    Expected: 4
    Received: 5

      412 |
      413 |     const buckets = await nodeGraph.getAllBuckets();
    > 414 |     expect(buckets.length).toBe(4);
          |                            ^
      415 |     // Buckets should be returned in lexicographic ordering (using hex keys to
      416 |     // ensure the bucket indexes are in numberical order)
      417 |     expect(buckets).toEqual([

      at Object.<anonymous> (tests/nodes/NodeGraph.test.ts:414:28)

Test Suites: 1 failed, 1 total
Tests:       3 failed, 11 passed, 14 total
Snapshots:   0 total
Time:        6.199 s
Ran all test suites matching /tests\/nodes\/NodeGraph.test.ts/i.
GLOBAL TEARDOWN
npm ERR! Test failed.  See above for more details.

I've found it useful to use the !# in the terminal to repeat the command (repeats everything prior). e.g. npm test -- tests/nodes/NodeGraph.test.ts; !# !# to run the tests 4 times.

Additional context

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Oct 27, 2021

Multiple suggestions to fix this:

Mock documentation is on jest, but because we are using TS, we need to ensure that TS doesn't complain about types. So you would want to mock the nodeid/rootkey that is being provided by the KeyManager.

But you should probably need to fix the node id operations, since char code increment is not correct anyway.

@CMCDragonkai
Copy link
Member

Consider changing the timeouts in the code when doing these tests too. Jest also supports timeout mocking that can help too: https://jestjs.io/blog/2020/05/05/jest-26#new-fake-timers

@CMCDragonkai CMCDragonkai mentioned this issue Nov 1, 2021
11 tasks
@joshuakarp joshuakarp added the epic Big issue with multiple subissues label Nov 29, 2021
@joshuakarp
Copy link
Contributor Author

I've added a note that this would be blocked by #261. This is because the current NodeGraph tests are failing on account of the node ID being used in the tests. Depending on the representation to be used in #261, this may have consequences for how this issue should be resolved (however, this is probably unlikely - the tests themselves are already using the 32-byte representation of the node ID). Basically, if there's a choice in which issue should be completed first, #261 should be completed first.

@joshuakarp
Copy link
Contributor Author

Regarding mocking from this earlier comment #274 (comment), can look into the following for mocking only a single function: https://stackoverflow.com/questions/59312671/mock-only-one-function-from-module-but-leave-rest-with-original-functionality (originally brought up in #283 (comment))

@joshuakarp
Copy link
Contributor Author

There's some discussion of function mocking in #283:

@joshuakarp
Copy link
Contributor Author

Seems that mocking with a static keypair has also created problems for the nodes tests, as per #283 (comment). Off the top of my head, not too sure what could be causing this - I thought I was mutating node IDs directly, as opposed to creating new keypairs. But will investigate this when looking into this issue too.

@joshuakarp
Copy link
Contributor Author

I should also add an exception thrown for when we attempt to calculate the bucket index with a target node ID that matches the current node ID. This currently (albeit, incorrectly) returns -1 for the bucket index, which causes some hard-to-trace problems when debugging (most recently here #278 (comment)).

Alternatively, find the calls to this function, and prevent this calculation from occurring at all.

@CMCDragonkai
Copy link
Member

What was the resolution to this issue? @joshuakarp @tegefaulkes can you describe how #310 solved this?

@tegefaulkes
Copy link
Contributor

The problems was caused by incrementNodeId creating a nodeId that fell outside of the expected bucket randomly. I removed it and updated generateNodeIdForBucketto take an alt number to make an alternative NodeId that will always fit within the expected bucket.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices label Jul 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working epic Big issue with multiple subissues r&d:polykey:core activity 4 End to End Networking behind Consumer NAT Devices
Development

Successfully merging a pull request may close this issue.

3 participants