Query: Is there any way to get all keys (not the entire record) from the DB #103

Open
PD-Pramila opened this issue Sep 13, 2023 · 5 comments

@PD-Pramila

I checked the code and saw that store.Find() can return all the records (with all fields). But if there are billions of records, holding all of them with all fields will consume a lot of memory.

There is also store.ForEach(), which calls a function for every record, but the same problem applies: if there are billions of records, it may be slow to process every record just to build the list of all keys.

Is there any other way to get all keys only?

I opened an issue for stream read. Is there any plan to add that feature to badgerhold?

@timshannon
Owner

You're going to run into the same issue as #94

If you want to query any of the fields that aren't the Key, then they need to be retrieved from the DB. If everything you need to loop through is in the key, then you can iterate on the key yourself directly against the Badger DB.

@PD-Pramila
Author

PD-Pramila commented Sep 13, 2023

@timshannon Thanks for the reply.
When we insert a record (func (s *Store) Insert(key, data interface{})),
we specify a key and the data. Badgerhold encodes them using gob.
I want to retrieve only that key (which is badgerhold.Key), which is not part of the data, from the DB.

Just to get the key, why do I need to fetch the entire data part for that?
Also, with the Badger DB iterator, how will it decode the gob value, which is specific to badgerhold?

@timshannon
Owner

> Just to get the key, why do I need to fetch the entire data part for that?

You don't have to. Just use Badger directly.

> how will it decode the gob value, which is specific to badgerhold?

With the core Gob libraries: https://pkg.go.dev/encoding/gob

The Key value is just the Gob encoded value of whatever key object you pass in.

Based on all of the other issues you submitted, I'm not sure BadgerHold is a good fit for your project at all. You're requesting re-implementing features that are already built into Badger. I'm guessing you should just use Badger.

I'd recommend taking a look at the Badger documentation: https://dgraph.io/docs/badger/get-started/#iterating-over-keys

@PD-Pramila
Author

I understood your point.
One of the reasons we are using badgerhold rather than Badger DB directly is aggregate queries with group-by, which Badger does not provide and which are our main use case.

Also, it provides indexing and gob encoding (though we could use gob on our own).

I got your point about using the Badger DB iterator from the start, but I don't see how it is any different from badgerhold's ForEach. Neither iterates over just the keys; both fetch the data along with them.

What I was looking for is that the DB should not fetch the value/data part and should just return the keys, if those are stored in memory.
That's why I asked whether there is any way to get only the keys from the DB, which could be faster than getting key+value and then extracting the key from the results.

Thanks for your replies. I will see what can be the better option.

@timshannon
Owner

One last recommendation. If you're honestly talking about billions of records, you absolutely should start looking at a real database, even if it's just something like SQLite. You're going to run into issues constantly otherwise.

2 participants