Split fingerprinting #87

thesunlover · 2015-08-05T17:23:09Z

test_fingerprint_by_splitting.py creates a long file from the existing files in mp3/.

To fingerprint with a length check, use
djv.fingerprint_with_duration_check(long_song, song_name="Concatenates_test")
as shown in the split-test file.

thesunlover · 2015-08-05T17:25:08Z

This is the updated version of the previous pull request.
It should handle the verification checks in the wavio case,
but I have no way to test that myself.

thesunlover · 2015-08-05T17:26:43Z

repost of PR #75
for issue #18

wangzhengyi · 2015-08-19T08:11:48Z

You don't take offset_seconds into account.

thesunlover · 2015-09-01T13:36:46Z

@wangzhengyi 👍
I would welcome code comments and recommendations.
Edit1:
~~the new function is based on all existing~~
~~For me it is enough the use of offset_seconds to happen in there~~
Got it. What is needed is to calculate the lengths of the previous parts and add them to the offset...
Any proper suggestions are welcome.
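
A minimal sketch of the "add the previous lengths" idea, not code from this PR: if the long file is cut into consecutive parts, each part starts at the sum of the lengths of the parts before it. The function name `chunk_start_seconds` and the lengths used in the example are hypothetical.

```python
from itertools import accumulate


def chunk_start_seconds(chunk_lengths):
    """Return the start time (in seconds) of each consecutive chunk.

    chunk_lengths: durations of the chunks, in seconds, in playback order.
    """
    # Running sum with an initial 0.0: the first chunk starts at 0.0 and
    # every later chunk starts where the earlier chunks end. The final
    # running total is the length of the whole file, not a start time,
    # so it is dropped.
    return list(accumulate(chunk_lengths, initial=0.0))[:-1]


if __name__ == "__main__":
    # Three parts of 60 s, 60 s and 45.5 s.
    print(chunk_start_seconds([60.0, 60.0, 45.5]))
```

Each start time would then have to be added to the offsets found in the corresponding part. `accumulate(..., initial=...)` needs Python 3.8 or later.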

sheffieldnikki · 2016-08-03T20:22:25Z

Any news on merging this with the master branch? dejavu is almost unusable on low memory machines - even the example mp3 files give out of memory errors when trying to fingerprint on a 512MB machine :( (and relying on swap is a disaster on this machine - its only storage is a memory card). Thanks

NathanielCustom · 2019-03-31T18:27:11Z

The solution that got the offsets correct for me was to (A) extract the offset in seconds from the split file name (e.g. start_sec60_end_sec120.mp3), (B) convert that value to the equivalent sampling offset, and (C) add the result to the offset that the fingerprinting process reports for that file.

Note: I am using a different fork, so some of the smaller details may differ, e.g. the naming in database.py.

(A) Extract Offset Data & (B) Convert to Sampling Offset

# __init__.py

def _fingerprint_worker(filename, limit=None, song_name=None):
    ...
    channel_amount = len(channels)

    # Get the split offset from the name, e.g. "start_sec60_end_sec120".
    try:
        first_split = song_name.split("start_sec", 1)
        select_second = first_split[1]
        second_split = select_second.split("_end_sec", 1)

        # Convert the start time in seconds (second_split[0]) to a sampling offset.
        split_offset = round(
            int(second_split[0]) * fingerprint.DEFAULT_FS /
            fingerprint.DEFAULT_WINDOW_SIZE / fingerprint.DEFAULT_OVERLAP_RATIO,
            5
        )
    except (AttributeError, IndexError, ValueError):
        # No song name, no "start_sec" marker, or a non-numeric start time:
        # treat the file as unsplit.
        split_offset = 0
    ...
    return song_name, result, file_hash, split_offset

Iterate and Pass the Value

# __init__.py

while True:
    try:
        song_name, hashes, file_hash, split_offset = next(iterator)
    ...
    else:
        # sid = self.db.insert_song(song_name, file_hash)  # REMOVE
        if treat_as_split and song_name_for_the_split:
            sid = self.db.insert_song(song_name_for_the_split, file_hash)
        elif not treat_as_split:
            sid = self.db.insert_song(song_name, file_hash)

        self.db.insert_hashes(sid, hashes, split_offset)
        ...

(C) Apply Offset

# database.py

    def insert_hashes(self, sid, hashes, split_offset=0):
        ...
        for hash, offset in set(hashes):
            fingerprints.append(
                Fingerprint(
                    hash=binascii.unhexlify(hash),
                    song_id=sid,
                    offset=int(offset+split_offset)
                )
            )
        self.session.bulk_save_objects(fingerprints)
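
The three steps can also be tried in isolation. The sketch below is self-contained and only illustrative: it uses a regular expression in place of the `str.split` calls above, the helper names `split_offset_from_name` and `shift_hashes` are hypothetical, and the three constants are assumed values standing in for `fingerprint.DEFAULT_FS`, `fingerprint.DEFAULT_WINDOW_SIZE` and `fingerprint.DEFAULT_OVERLAP_RATIO`, which may differ in a given fork.

```python
import re

# Assumed values; the code above reads these from the fingerprint module.
DEFAULT_FS = 44100
DEFAULT_WINDOW_SIZE = 4096
DEFAULT_OVERLAP_RATIO = 0.5

# Matches the start time in names such as "start_sec60_end_sec120.mp3".
_SPLIT_NAME = re.compile(r"start_sec(\d+)_end_sec")


def split_offset_from_name(song_name):
    """Steps (A) and (B): start time in the name -> sampling offset."""
    match = _SPLIT_NAME.search(song_name or "")
    if match is None:
        return 0  # not a split file, so nothing to add
    seconds = int(match.group(1))
    return round(
        seconds * DEFAULT_FS / DEFAULT_WINDOW_SIZE / DEFAULT_OVERLAP_RATIO, 5
    )


def shift_hashes(hashes, split_offset):
    """Step (C): add the split offset to every (hash, offset) pair."""
    # int() truncates the fractional part, as in insert_hashes above.
    return [(h, int(offset + split_offset)) for h, offset in hashes]


if __name__ == "__main__":
    offset = split_offset_from_name("start_sec60_end_sec120.mp3")
    print(offset)
    print(shift_hashes([("ab12", 10), ("cd34", 25)], offset))
```

A name without the `start_sec..._end_sec` marker yields an offset of 0, so unsplit files keep the offsets the fingerprinting step produced.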

thesunlover added 2 commits April 20, 2015 17:55

port branch

0a0df7e

fix to match with the master branch

d15a90a

thesunlover mentioned this pull request Aug 5, 2015

Split big files #75

Closed
