API for localstore scanning #4566

Closed
nikipapadatou opened this issue Feb 5, 2024 · 3 comments · Fixed by #4573 · May be fixed by #4587
Comments

@nikipapadatou
Collaborator

nikipapadatou commented Feb 5, 2024

A new tool has been implemented that scans a localstore folder and reports, for each file belonging to pinned content, the number of corrupted and invalid chunks against the total number of chunks.

The tool is here: https://github.com/ethersphere/bee/tree/feat-integrity-cmd

We need an API that exposes this tool to users, along with the relevant documentation:


  • go run $(pwd)/cmd/bee db validate-pin --data-dir /path/to/localstore
    This produces a CSV file listing every pin together with its stats.

To select the problematic ones, run:

  • awk '$1 >0 || $2>0' address.csv > invalid.csv
    The resulting invalid.csv contains the addresses of pins that the user might want to unpin or re-upload. The original files can be found by their hash.
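
For anyone who would rather do this filtering in Go than in awk, here is a minimal sketch of the same idea. It assumes, as the awk one-liner does, that the first two whitespace-separated fields of each row are counts and that a row is problematic when either is greater than zero. The sample rows and the function names are hypothetical; the real column layout of address.csv is not shown in this issue.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// problematicRows returns the lines whose first or second
// whitespace-separated field parses as a number greater than zero.
// The column layout is an assumption, not taken from the tool's output.
func problematicRows(input string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := sc.Text()
		fields := strings.Fields(line)
		if positive(fields, 0) || positive(fields, 1) {
			out = append(out, line)
		}
	}
	return out
}

// positive reports whether field i exists and is a number above zero.
func positive(fields []string, i int) bool {
	if i >= len(fields) {
		return false
	}
	v, err := strconv.ParseFloat(fields[i], 64)
	return err == nil && v > 0
}

func main() {
	// Hypothetical rows: two counts, a total, and a pin address.
	sample := "0 0 10 aaaa\n2 0 10 bbbb\n0 1 7 cccc\n"
	for _, l := range problematicRows(sample) {
		fmt.Println(l)
	}
}
```

Unlike awk, this sketch skips rows whose first two fields are not numeric, so a header line would be dropped rather than compared as a string.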

bee-runner bot added the issue label Feb 5, 2024
@ldeffenb
Collaborator

ldeffenb commented Feb 5, 2024

I simply expanded the /pins/{reference} endpoint to include detailed information on the chunks involved in the pin set.

curl http://192.168.10.36:11633/pins/0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc | jq
{
  "reference": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
  "chunkCount": 5,
  "chunks": [
    {
      "address": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "4bd95ce93becc5e9fc1bf3f2304c7b2863bbbc9479fe248f319a809faf166a60",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "57b2c67a935478cbd5b72332a942751a9536931e51e73a00a90e009b9ea959c1",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "9b7f230858aabf6cf3e7b24effe56f821fcbe4989d3b12652865a83e5ad65d85",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "ec38e1764ec149a409a5bd1d4cd4e8c63d6148b00a899f481e954f0c230e94ac",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    }
  ]
}
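
A client could decode this response with types along the following lines. This is only a sketch: the JSON field names are taken from the output above, but the Go type and function names (PinInfo, ChunkInfo, decodePin) are made up for illustration and are not the structures used in the branch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChunkInfo mirrors one entry of the "chunks" array shown above.
// The type name is hypothetical; only the JSON keys come from the output.
type ChunkInfo struct {
	Address  string `json:"address"`
	Err      bool   `json:"err"`
	RefCnt   int    `json:"refCnt"`
	Local    bool   `json:"local"`
	Cached   bool   `json:"cached"`
	Reserve  int    `json:"reserve"`
	Upload   int    `json:"upload"`
	Repaired bool   `json:"repaired"`
}

// PinInfo mirrors the top-level object of the expanded response.
type PinInfo struct {
	Reference  string      `json:"reference"`
	ChunkCount int         `json:"chunkCount"`
	Chunks     []ChunkInfo `json:"chunks"`
}

// decodePin parses a response body into a PinInfo.
func decodePin(data []byte) (PinInfo, error) {
	var p PinInfo
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	// A trimmed copy of the response above, keeping only the first chunk.
	body := []byte(`{
	  "reference": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
	  "chunkCount": 5,
	  "chunks": [{
	    "address": "0000001bf5eff96586a0ed6fe34fc8d859c69b0316a29912f14c4edcbbe732dc",
	    "err": false, "refCnt": 100, "local": true, "cached": false,
	    "reserve": 0, "upload": 0, "repaired": false
	  }]
	}`)
	p, err := decodePin(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.ChunkCount, p.Chunks[0].RefCnt, p.Chunks[0].Local)
}
```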

My hacked API also accepts a ?repair=true option, which attempts to repair the refCnt for local chunks. The code is here:
https://github.com/ldeffenb/bee/blob/a3437b45cf0dfcad18a6817c22d352a17557a217/pkg/api/pin.go#L188
The response structures are defined just before that. It also requires various hacks in the database layers to get the information, but they're all in my 1.18.2-cumulative-hacks branch.

@ldeffenb
Collaborator

ldeffenb commented Feb 5, 2024

Here's an example of a multi-chunk pinned reference with a chunk missing in the middle. Notice the err: true and local: false on the next-to-last chunk. If a chunk is pinned, it should obviously be local.

curl http://192.168.10.36:11633/pins/79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e | jq
{
  "reference": "79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e",
  "chunkCount": 4,
  "chunks": [
    {
      "address": "29e7f4854c447ea7a86fa161e6a59c06f82f2fa1a9b75cdf375c26c7689de3f3",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "da99078b366a86b5845e82388cdbc4fb31003155a6cc74c8d3f3afb8984583f4",
      "err": true,
      "refCnt": 0,
      "local": false,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    },
    {
      "address": "eccaf183984bb9b4f75367048c39de4716da62c68f4ca491aea650ae8ba53486",
      "err": false,
      "refCnt": 100,
      "local": true,
      "cached": false,
      "reserve": 0,
      "upload": 0,
      "repaired": false
    }
  ]
}
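
Spotting the bad chunk by eye gets tedious for larger pins, so a client might flag them automatically. The sketch below treats a chunk as broken when err is true or local is false, which is the reading suggested by the example above; the function name brokenChunks is hypothetical, and only the JSON keys it reads come from the output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// brokenChunks returns the addresses of chunks that report an error
// or are not stored locally. Only the fields needed for that check
// are decoded; the rest of the response is ignored.
func brokenChunks(data []byte) ([]string, error) {
	var resp struct {
		Chunks []struct {
			Address string `json:"address"`
			Err     bool   `json:"err"`
			Local   bool   `json:"local"`
		} `json:"chunks"`
	}
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	var bad []string
	for _, c := range resp.Chunks {
		if c.Err || !c.Local {
			bad = append(bad, c.Address)
		}
	}
	return bad, nil
}

func main() {
	// Two chunks taken from the response above: one healthy, one missing.
	body := []byte(`{"chunks": [
	  {"address": "79fc9211763894014cb73fa9a3a05210ed07813c5b85bd292745b4bd90de217e", "err": false, "local": true},
	  {"address": "da99078b366a86b5845e82388cdbc4fb31003155a6cc74c8d3f3afb8984583f4", "err": true, "local": false}
	]}`)
	bad, err := brokenChunks(body)
	if err != nil {
		panic(err)
	}
	for _, a := range bad {
		fmt.Println(a)
	}
}
```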

@ldeffenb
Collaborator

ldeffenb commented Feb 5, 2024

Oh, and I have also hacked the /pins API itself to handle ?limit=L&offset=O, similar to what /tags has. Otherwise I can't even query all of my pins: the node runs out of memory and panics.
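
On the client side, walking such a paginated endpoint could look roughly like this. It is a sketch under assumptions: the limit and offset parameter names come from the comment above, but the stop condition (a page shorter than the limit means the end), the base URL, and the helper names pinsPageURL and collectAll are all hypothetical. The page fetcher is passed in as a function so that no particular response format is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// pinsPageURL adds limit and offset query parameters to an endpoint URL.
func pinsPageURL(endpoint string, limit, offset int) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("limit", strconv.Itoa(limit))
	q.Set("offset", strconv.Itoa(offset))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// collectAll requests pages of at most limit items until a short page
// signals the end. fetch is supplied by the caller, for example an
// HTTP GET against the URL built by pinsPageURL.
func collectAll(limit int, fetch func(limit, offset int) ([]string, error)) ([]string, error) {
	if limit <= 0 {
		return nil, errors.New("limit must be positive")
	}
	var all []string
	for offset := 0; ; offset += limit {
		page, err := fetch(limit, offset)
		if err != nil {
			return nil, err
		}
		all = append(all, page...)
		if len(page) < limit {
			return all, nil
		}
	}
}

func main() {
	// Hypothetical local node address, for illustration only.
	u, _ := pinsPageURL("http://localhost:1633/pins", 100, 200)
	fmt.Println(u)

	// An in-memory stand-in for the node, holding five fake references.
	refs := []string{"a", "b", "c", "d", "e"}
	fake := func(limit, offset int) ([]string, error) {
		if offset >= len(refs) {
			return nil, nil
		}
		end := offset + limit
		if end > len(refs) {
			end = len(refs)
		}
		return refs[offset:end], nil
	}
	all, _ := collectAll(2, fake)
	fmt.Println(len(all))
}
```

Keeping each request small is the point here: the node only has to assemble one page at a time instead of the full pin list.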
