This repository has been archived by the owner on Aug 23, 2020. It is now read-only.

Investigate and fix rescan and revalidate on LS and non-LS nodes #1391

Open
jakubcech opened this issue Mar 28, 2019 · 3 comments
Assignees: karimodm
Labels: L-Groom (This issue needs to be groomed), T-Bug

Comments

jakubcech (Contributor) commented Mar 28, 2019

Description

Investigate and fix the rescan & revalidate options failing when milestones aren't in consecutive order (in situations where the milestones are skipped). This happens when you rescan an older database, or even a new DB with older data, e.g., from 1.5.6.

What happens if you do a rescan with local snapshots and pruning enabled? Does this work correctly with the local snapshot builds? The suspicion here is that we don't go back to the first transaction in this case.

This then needs to be covered with tests so that we know it keeps working in the future (#1466).

Motivation

Ability to revalidate and rescan DBs on both LS and non-LS nodes.

Requirements

Scenario 1

  1. I run rescan on an LS disabled node.
  2. Tables are dropped and restored from the latest global snapshot.

Scenario 2

  1. I run rescan on an LS enabled node.
  2. Tables are dropped and restored from the last (oldest?) available snapshot.

Scenario 3

  1. I run revalidate on an LS disabled node.
  2. Solid milestone data is dropped and restored from the latest global snapshot.

Scenario 4

  1. I run revalidate on an LS enabled node.
  2. Solid milestone data is dropped and restored from the last (oldest?) available snapshot.
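For reference, the scenarios above map to IRI start-up invocations roughly like the ones below. Only --revalidate is quoted later in this thread; --rescan, the -p port flag, and the local-snapshot flag names are assumptions about the CLI of this era and may need adjusting.

    # Scenarios 1 and 3: LS disabled node
    java -jar iri-1.8.2.jar -p 14265 --rescan
    java -jar iri-1.8.2.jar -p 14265 --revalidate

    # Scenarios 2 and 4: LS enabled node with pruning (local-snapshot flag names are assumptions)
    java -jar iri-1.8.2.jar -p 14265 --local-snapshots-enabled true --local-snapshots-pruning-enabled true --rescan
    java -jar iri-1.8.2.jar -p 14265 --local-snapshots-enabled true --local-snapshots-pruning-enabled true --revalidate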
jakubcech added the C-Tests (This PR adds testing functionality), L-Groom (This issue needs to be groomed), and T-Bug labels on Mar 28, 2019
jakubcech removed the C-Tests (This PR adds testing functionality) label on May 28, 2019
jakubcech changed the title from "Investigate rescan failures" to "Investigate rescan and revalidate on LS and non-LS nodes" on May 28, 2019
jakubcech changed the title from "Investigate rescan and revalidate on LS and non-LS nodes" to "Investigate and fix rescan and revalidate on LS and non-LS nodes" on May 28, 2019
GalRogozinski (Contributor) commented

When this is done, #371 should be closed as well.

GalRogozinski (Contributor) commented

Old databases are available at https://dbfiles.iota.org/

karimodm (Contributor) commented Dec 21, 2019

With #1682 we are introducing a database corruption upon '--revalidate'.
I've used the DB::VerifyChecksum() RocksDB API call to verify the integrity of databases before and after the revalidate process with #1682 applied. After the deletion of objects, the checksums of database blocks appear to get corrupted. I am going to investigate this further to understand exactly which specific action is corrupting the DB.
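For reference, a minimal standalone sketch of that check, assuming the RocksJava bindings IRI ships with; the class name and argument handling are illustrative only, and the column-family handling is an assumption about how the IRI database has to be opened. The startup log below shows the same verification failing inside Iota.validateChecksums:

    import java.util.ArrayList;
    import java.util.List;

    import org.rocksdb.ColumnFamilyDescriptor;
    import org.rocksdb.ColumnFamilyHandle;
    import org.rocksdb.ColumnFamilyOptions;
    import org.rocksdb.DBOptions;
    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    public class VerifyDbChecksum {
        public static void main(String[] args) throws RocksDBException {
            RocksDB.loadLibrary();
            // Path to the RocksDB directory to check, e.g. IRI's "mainnetdb".
            final String dbPath = args.length > 0 ? args[0] : "mainnetdb";

            // IRI stores its models in several column families, so list them
            // first and open the database with all of them (assumption).
            final List<ColumnFamilyDescriptor> descriptors = new ArrayList<>();
            try (Options listOptions = new Options()) {
                for (byte[] name : RocksDB.listColumnFamilies(listOptions, dbPath)) {
                    descriptors.add(new ColumnFamilyDescriptor(name, new ColumnFamilyOptions()));
                }
            }

            final List<ColumnFamilyHandle> handles = new ArrayList<>();
            try (DBOptions dbOptions = new DBOptions();
                 RocksDB db = RocksDB.open(dbOptions, dbPath, descriptors, handles)) {
                // Re-reads every block of every SST file and compares the stored
                // checksum against a freshly computed one; throws RocksDBException
                // ("block checksum mismatch ...") if any block is corrupted.
                db.verifyChecksum();
                System.out.println("Checksum verification passed for " + dbPath);
            } finally {
                handles.forEach(ColumnFamilyHandle::close);
            }
        }
    }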

12/20 17:58:20.204 [main] INFO  RocksDBPersistenceProvider:427 - Deleted: 109230000
12/20 17:58:20.317 [main] INFO  RocksDBPersistenceProvider:427 - Deleted: 109240000                                                                                                                                 
12/20 17:58:20.440 [main] INFO  RocksDBPersistenceProvider:427 - Deleted: 109250000
12/20 17:58:20.568 [main] INFO  RocksDBPersistenceProvider:427 - Deleted: 109260000
12/20 17:58:39.810 [main] INFO  RocksDBPersistenceProvider:435 - Deleted 109266772 entries in total
12/20 17:58:39.812 [main] INFO  Iota:201 - AFTER DROPPING
12/20 17:58:39.817 [main] INFO  Iota:175 - Validating persistance providers length 1
12/20 17:58:39.818 [main] INFO  Iota:177 - com.iota.iri.storage.rocksDB.RocksDBPersistenceProvider@383790cf                                                                                                        
12/20 17:58:39.818 [main] INFO  Iota:179 - Validating checksum of org.rocksdb.RocksDB@74971ed9
12/20 18:18:59.209 [main] ERROR IRI$IRILauncher:141 - Exception during IOTA node initialisation:
org.rocksdb.RocksDBException: block checksum mismatch: expected 3373650606, got 2469076533  in mainnetdb/1153200.sst offset 22390842 size 3269                                                                     
        at org.rocksdb.RocksDB.verifyChecksum(Native Method) ~[iri-1.8.2.jar:1.8.2]
        at org.rocksdb.RocksDB.verifyChecksum(RocksDB.java:3691) ~[iri-1.8.2.jar:1.8.2]
        at com.iota.iri.Iota.validateChecksums(Iota.java:180) ~[iri-1.8.2.jar:1.8.2]
        at com.iota.iri.Iota.init(Iota.java:202) ~[iri-1.8.2.jar:1.8.2]
        at com.iota.iri.IRI$IRILauncher.main(IRI.java:135) ~[iri-1.8.2.jar:1.8.2]
        at com.iota.iri.IRI.main(IRI.java:64) [iri-1.8.2.jar:1.8.2]
Exception in thread "main" org.rocksdb.RocksDBException: block checksum mismatch: expected 3373650606, got 2469076533  in mainnetdb/1153200.sst offset 22390842 size 3269                                          
        at org.rocksdb.RocksDB.verifyChecksum(Native Method)
        at org.rocksdb.RocksDB.verifyChecksum(RocksDB.java:3691)
        at com.iota.iri.Iota.validateChecksums(Iota.java:180)
        at com.iota.iri.Iota.init(Iota.java:202)
        at com.iota.iri.IRI$IRILauncher.main(IRI.java:135)
        at com.iota.iri.IRI.main(IRI.java:64)
12/20 18:18:59.257 [Shutdown Hook] INFO  IRI$IRILauncher:152 - Shutting down IOTA node, please hold tight...                                                                                                       
12/20 18:18:59.267 [Shutdown Hook] INFO  Tangle:78 - Shutting down Tangle Persistence Providers...
12/20 18:18:59.359 [Shutdown Hook] INFO  Tangle:81 - Shutting down Tangle MessageQueue Providers...

References:

karimodm self-assigned this on Dec 21, 2019