Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-5148 Introduce "soft fail" for corrupt ValueStore #5150

Conversation

ebner
Copy link
Contributor

@ebner ebner commented Oct 10, 2024

GitHub issue resolved: #5148

Briefly describe the changes proposed in this PR:

  • Improved use of a corrupted ValueStore by handling error situations in ValueStore.data2value() with SailException (previously it was possible to get an ArrayIndexOutOfBoundsException and IllegalArgumentException).
  • Introduction of a new system property org.eclipse.rdf4j.sail.nativerdf.softFailOnCorruptData to enable "soft failure mode" and to avoid exceptions as described above. Instead, an instance of the newly introduced CorruptValue will be returned.
  • This solves the problem of not being able to iterate (and in consequence, backup) over the NativeStore in order to save uncorrupted statements. Without "soft failure mode" the RepositoryConnection would be forcefully closed when hitting the first corrupted value.

PR Author Checklist (see the contributor guidelines for more details):

  • my pull request is self-contained
  • I've added tests for the changes I made
  • I've applied code formatting (you can use mvn process-resources to format from the command line)
  • I've squashed my commits where necessary
  • every commit message starts with the issue number (GH-xxxx) followed by a meaningful description of the change

@@ -534,7 +555,10 @@ private NativeValue data2value(int id, byte[] data) throws IOException {
case LITERAL_VALUE:
return data2literal(id, data);
default:
throw new IllegalArgumentException("Invalid type " + data[0] + " for value with id " + id);
if (softFailOnCorruptData) {
return new CorruptValue(revision, id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could potentially include data[0] at this point.

@@ -0,0 +1,91 @@
/*******************************************************************************
* Copyright (c) 2024 Eclipse RDF4J contributors, Aduna, and others.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hmottestad
Copy link
Contributor

Any chance you would be interested in adding some info about this to our documentation?

I'm not sure where, but maybe at the end of the NativeStore section here:

The NativeStore saves data to disk in a binary format which is optimized for compact storage and fast retrieval. If there is sufficient physical memory, the Native store will act like the MemoryStore on most operating systems because the read/write commands will be cached by the OS.

Could we also try to create a test? Maybe create a NativeStore, add data, verify that the data is there. Do a shutdown() and then modify the files on disk to make them corrupted. Then init() the store again and try to read the data.

@hmottestad hmottestad changed the base branch from main to GH-5148-soft-fail-native-store October 23, 2024 09:24
@hmottestad hmottestad merged commit 34807e1 into eclipse-rdf4j:GH-5148-soft-fail-native-store Oct 23, 2024
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make ValueStore more resilient against data corruption
2 participants