downsampling causes errors in calibration #153

mshallow · 2024-06-13T21:56:51Z

I have been trying to use the downsampling built into your codebase. The downsampling itself seems to succeed (the initial lines for temporal downsampling don't error), but when I run the calibration step, it loads ~30-50% of the frames needed and then errors. The error message is attached here.
I am currently trying to downsample from 200 fps to 40 fps, so I set the downsampling factor to 5. I also tried the example downsampling factor of 2 and ran into the same error. It seems to be indexing something incorrectly after the downsampling occurs.

Code:
`# load data (e.g. from DeepLabCut)
keypoint_data_path = '/Users/mollyshallow/Desktop/new_demo_project/data' # can be a file, a directory, or a list of files
coordinates, confidences, bodyparts = kpms.load_keypoints(keypoint_data_path, 'deeplabcut', exclude_individuals='single')

#downsample data
downsample_rate = 5 # keep every 5th frame
coordinates = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences = kpms.downsample_timepoints(confidences, downsample_rate)

# format data for modeling

data, metadata = kpms.format_data(coordinates, confidences, **config())

kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)`

Error Message:
`Loading sample frames: 49%|█████▊ | 40/82 [00:02<00:02, 16.49it/s]

ValueError Traceback (most recent call last)
Cell In[11], line 1
----> 1 kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:528, in noise_calibration(project_dir, coordinates, confidences, bodyparts, use_bodyparts, video_dir, video_extension, conf_pseudocount, downsample_rate, **kwargs)
525 annotations = load_annotations(project_dir)
526 sample_keys.extend(annotations.keys())
--> 528 sample_images = load_sampled_frames(
529 sample_keys, video_dir, video_extension, downsample_rate
530 )
532 return _noise_calibration_widget(
533 project_dir,
534 coordinates,
(...)
540 **kwargs,
541 )

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:110, in load_sampled_frames(sample_keys, video_dir, video_extension, downsample_rate)
102 readers = {key: OpenCVReader(video) for key, video in zip(keys, videos)}
103 pbar = tqdm.tqdm(
104 sample_keys,
105 desc="Loading sample frames",
(...)
108 ncols=72,
109 )
--> 110 return {
111 (key, frame, bodypart): readers[key][frame * downsample_rate]
112 for key, frame, bodypart in pbar
113 }

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/keypoint_moseq/calibration.py:111, in <dictcomp>(.0)
102 readers = {key: OpenCVReader(video) for key, video in zip(keys, videos)}
103 pbar = tqdm.tqdm(
104 sample_keys,
105 desc="Loading sample frames",
(...)
108 ncols=72,
109 )
110 return {
--> 111 (key, frame, bodypart): readers[key][frame * downsample_rate]
112 for key, frame, bodypart in pbar
113 }

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:70, in BaseReader.__getitem__(self, *args, **kwargs)
62 def __getitem__(self, *args, **kwargs) -> Union[np.ndarray, list]:
63 """Wrapper around read
64
65 Args:
(...)
68 frame = reader[10]
69 """
---> 70 return self.read(*args, **kwargs)

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:107, in OpenCVReader.read(self, framenum)
102 """Read the frame indicated in framenum from disk
103
104 Uses sequential reads where possible if using OpenCV to read
105 """
106 # does checks. if framenum is a slice, calls read recursively. In that case, just return
--> 107 output = super().read(framenum)
108 if output is not None:
109 return output

File /opt/anaconda3/envs/keypoint_moseq/lib/python3.9/site-packages/vidio/read.py:39, in BaseReader.read(self, framenum)
37 return [self.read(i) for i in self.slice_to_list(framenum)]
38 if framenum < 0 or framenum > self.nframes:
---> 39 raise ValueError('frame number requested outside video bounds: {}'.format(framenum))

ValueError: frame number requested outside video bounds: 270590
Loading sample frames: 49%|█████▊ | 40/82 [00:19<00:02, 16.49it/s]`

calebweinreb · 2024-06-13T23:49:59Z

Hmm weird. To diagnose, could you pick a recording and then tell me the shape of the corresponding array in coordinates before and after downsampling, and also the number of frames in the corresponding video?

mshallow · 2024-06-14T02:17:21Z

Yea I can do that! Just using the first video in my list of videos as a test, here are the results of testing that out.

full_array = coordinates['Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse']
np.shape(full_array)
(7687, 6, 2)

downsample_rate = 5 # keep every 5th frame
coordinates_down = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences_down = kpms.downsample_timepoints(confidences, downsample_rate)
downsamp_array = coordinates_down['Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse']
np.shape(downsamp_array)
(308, 6, 2)

The length of the original coordinates array should be the full length of frames of the video. The video is ~38s long at 200fps which is around 7600 frames.

calebweinreb · 2024-06-14T13:34:12Z

can you check this systematically?

from vidio.read import OpenCVReader

keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    print(len(coordinates[key]), len(OpenCVReader(video)), key)

mshallow · 2024-06-14T17:07:05Z

Using that method, I get these values:
1538 7687 Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse
The first seems like it's the downsampled frames, and the second the full length, which doesn't really match the output that I got using the other method, so not quite sure what is going on there.

calebweinreb · 2024-06-14T17:09:45Z

"which doesn't really match the output that I got using the other method".. can you elaborate on that?

also can you do this for all your data? Given the frame number in the error "270590" it seems like the short video you've been testing isn't the one that caused the error.

mshallow · 2024-06-14T17:23:41Z

Here is the output from all the videos:
731 3653 Sky_mouse-0893_2022-07-11T08_36_51DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1216 6077 Sky_mouse-0893_2022-07-12T08_47_30DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1404 7017 Sky_mouse-0893_2022-07-19T09_45_45DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1844 9220 Sky_mouse-0893_2022-07-21T11_59_16DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1368 6838 Sky_mouse-0893_2022-07-31T11_31_56DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1198 5988 Sky_mouse-0893_2022-08-22T10_55_14DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1080 5397 Sky_mouse-0893_2022-08-22T10_56_56DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
3350 16746 Sky_mouse-0895_2022-07-12T10_42_42DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2272 11360 Sky_mouse-0895_2022-07-26T10_45_21DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1972 9859 Sky_mouse-0895_2022-07-29T10_27_36DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
3626 18126 Sky_mouse-0895_2022-07-31T10_26_26DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2195 10974 Sky_mouse-0895_2022-08-15T11_55_18DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1820 9098 Sky_mouse-0895_2022-08-22T10_16_09DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1359 6794 Sky_mouse-0895_2022-08-23T10_49_34DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
12870 64347 Sky_mouse-0896_2022-04-06T10_01_10DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2074 10367 Sky_mouse-0896_2022-04-13T09_03_40DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2255 11271 Sky_mouse-0896_2022-04-27T08_35_14DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1518 7590 Sky_mouse-0896_2022-05-04T09_33_27DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2437 12182 Sky_mouse-0897_2022-04-04T09_15_51DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1404 7017 Sky_mouse-0897_2022-04-08T09_03_27DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
344 1720 Sky_mouse-0897_2022-04-14T08_39_05DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
1538 7687 Sky_mouse-0897_2022-05-06T08_16_08DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_filtered_mouse
1589 7943 Sky_mouse-0898_2022-04-12T12_42_28DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
2168 10838 Sky_mouse-0898_2022-04-29T09_16_40DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
897 4485 Sky_mouse-0898_2022-05-03T09_52_50DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
11883 59412 Sky_mouse-1337_2023-01-05T11_37_41DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2656 13277 Sky_mouse-1337_2023-01-13T12_25_28DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
12309 61545 Sky_mouse-1337_2023-01-15T16_46_51DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
13194 65969 Sky_mouse-1337_2023-01-20T11_02_26DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1518 7588 Sky_mouse-1429_2023-01-04T15_51_08DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1572 7858 Sky_mouse-1429_2023-01-15T16_15_26DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1441 7204 Sky_mouse-1429_2023-01-16T18_00_32DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2027 10131 Sky_mouse-1429_2023-01-24T11_32_56DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1929 9642 Sky_mouse-1429_2023-01-29T12_54_54DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2496 12479 Sky_mouse-1430_2023-01-16T16_46_28DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1190 5949 Sky_mouse-1430_2023-01-17T10_39_22DLC_dlcrnetms5_optopreycapFeb16shuffle1_150000_el_mouse
5683 28411 Sky_mouse-1430_2023-01-23T10_37_18DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
2122 10607 Sky_mouse-1430_2023-01-24T12_15_10DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1472 7360 Sky_mouse-1430_2023-01-24T12_23_54DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse
1699 8494 Sky_mouse-1430_2023-01-29T13_25_16DLC_resnet50_optopreycapFeb16shuffle1_300000_el_mouse

Previously, to figure out the length of the downsampled array, I had just looked at one entry in the dictionary of coordinates. After downsampling, its shape was (308, 6, 2), which doesn't match taking every 5th frame as the downsampling factor should do. The value of 1538 is the correct downsampled length for the coordinates.
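
One possible reading of these two lengths, sketched under the assumption that `kpms.downsample_timepoints` behaves like strided slicing (keeping every `downsample_rate`-th frame). That behavior is assumed here, not shown in this thread. The sketch uses a plain list with the same number of frames as the recording above:

```python
n_frames = 7687          # length of the full-rate array reported above
downsample_rate = 5

# Assumption: downsampling keeps every `downsample_rate`-th frame (strided slicing).
frames = list(range(n_frames))
once = frames[::downsample_rate]    # one pass of downsampling
twice = once[::downsample_rate]     # the same pass applied a second time

print(len(once))   # 1538
print(len(twice))  # 308
```

Under that assumption, 1538 is what a single pass produces and 308 is what two passes produce, so re-running a downsampling cell on an already downsampled variable would give the shorter array.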

calebweinreb · 2024-06-14T17:35:37Z

Hmm it's also confusing where "270590" came from since none of the coordinates are even 1/5 that long...

My guess is that there is some accidental failure to downsample or double-downsampling happening here. Can you run the code from a clean starting point and do the following?

Load coordinates fresh
Run this code block before downsampling

from vidio.read import OpenCVReader

keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    if len(coordinates[key]) != len(OpenCVReader(video)):
        print(len(coordinates[key]), len(OpenCVReader(video)), key)

assuming nothing prints, try running calibration.

BTW calibration itself is kind of buggy lately but this troubleshooting will also be useful for the subsequent viz steps

mshallow · 2024-06-14T18:04:47Z

There are no steps in this that downsample at all, were you suggesting just to try this to see if it was a bug with calibration even without the downsampling?
When I tried this, there was no output from that code block, and everything for the calibration loaded completely fine.
Is there a possibility that something is being concatenated to load the frames for the calibration and that is where the 270590 comes from? I've messed around with a couple of different downsampling rates, and if I set the downsampling rate to 2, the same error occurs, but the value error is then "ValueError: frame number requested outside video bounds: 108236"
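
A small arithmetic check on the two error values may help here. The traceback above shows the video reader being indexed with `frame * downsample_rate`, so dividing each reported frame number by the rate that was in use recovers the frame index that was stored. This is only a sketch of that arithmetic, not a confirmed diagnosis:

```python
# (downsample_rate, out-of-bounds frame number) pairs reported in this thread.
reported_errors = [(5, 270590), (2, 108236)]

stored_frames = []
for downsample_rate, requested_frame in reported_errors:
    # Undo the multiplication seen in the traceback: frame * downsample_rate.
    stored_frame, remainder = divmod(requested_frame, downsample_rate)
    stored_frames.append(stored_frame)
    print(downsample_rate, stored_frame, remainder)
# prints:
# 5 54118 0
# 2 54118 0
```

Both errors trace back to the same stored index, 54118. That index is larger than any downsampled length in the listing above but smaller than several of the full-rate video lengths, which would fit a frame index that was recorded against full-rate data and then multiplied by the downsampling factor.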

calebweinreb · 2024-06-14T18:36:11Z

O sorry yeah I forgot the downsampling. Can you try the following?

keypoint_data_path = '/Users/mollyshallow/Desktop/new_demo_project/data' # can be a file, a directory, or a list of files
coordinates, confidences, bodyparts = kpms.load_keypoints(keypoint_data_path, 'deeplabcut', exclude_individuals='single')

downsample_rate = 5 # keep every 5th frame
coordinates = kpms.downsample_timepoints(coordinates, downsample_rate)
confidences = kpms.downsample_timepoints(confidences, downsample_rate)

from vidio.read import OpenCVReader
keys = sorted(coordinates.keys())
videos = kpms.find_matching_videos(keys, video_dir)
for key,video in zip(keys,videos):
    if (len(coordinates[key])-1)*downsample_rate >= len(OpenCVReader(video)):
        print(len(coordinates[key]), len(OpenCVReader(video)), key)

kpms.noise_calibration(project_dir, coordinates, confidences, **config(), downsample_rate=downsample_rate)
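
To make the bounds check in the loop above concrete, here is a small sketch using three of the length pairs posted earlier in the thread. `max_requested_frame` is a hypothetical helper named for illustration and is not part of keypoint_moseq:

```python
def max_requested_frame(n_downsampled, downsample_rate):
    """Largest video frame index needed for a downsampled array of this length."""
    return (n_downsampled - 1) * downsample_rate

downsample_rate = 5
# (downsampled length, video length) pairs copied from the listing earlier in the thread.
examples = [(1538, 7687), (731, 3653), (12870, 64347)]

results = []
for n_downsampled, n_video in examples:
    last_frame = max_requested_frame(n_downsampled, downsample_rate)
    results.append(last_frame < n_video)
    print(last_frame, n_video, last_frame < n_video)
# prints:
# 7685 7687 True
# 3650 3653 True
# 64345 64347 True
```

For these three recordings the last downsampled frame maps back to a valid video frame, so the coordinates alone would not request a frame outside the video.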

mshallow · 2024-06-14T19:37:41Z

That still gave the same error.

calebweinreb · 2024-06-14T19:50:46Z

Have you run calibration previously? If so, it may be looking for frames from a non-downsampled instance of calibration. Look for a file called error_annotations.csv in the project directory, delete it if present, and try again?
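
If you would rather keep the old annotations than delete them, a minimal sketch along these lines renames the file instead. The helper name and the `.bak` suffix are hypothetical choices for illustration; only the filename `error_annotations.csv` comes from the comment above:

```python
from pathlib import Path

def set_aside_annotations(project_dir, filename="error_annotations.csv"):
    """Rename a leftover annotations file, if present, and return its new path."""
    path = Path(project_dir) / filename
    if not path.exists():
        return None
    backup = path.with_name(path.name + ".bak")
    # Path.replace overwrites any earlier backup with the same name.
    path.replace(backup)
    return backup
```

Calling `set_aside_annotations(project_dir)` before re-running calibration should leave the project directory without an annotations file while keeping the old one available for inspection.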

mshallow · 2024-06-14T19:55:35Z

Ok that seems to be what it was doing, deleting the file fixed the error! Thanks!
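
For anyone hitting the same error, here is a hedged sketch of how stale entries could be spotted before loading frames. It relies only on what the traceback shows: sample keys are `(key, frame, bodypart)` tuples and the reader is indexed with `frame * downsample_rate`. The function name, the recording name, the bodypart, and the video length are all hypothetical:

```python
def find_out_of_bounds(sample_keys, video_lengths, downsample_rate):
    """Return the (key, frame, bodypart) entries whose video frame would not exist."""
    stale = []
    for key, frame, bodypart in sample_keys:
        if frame * downsample_rate >= video_lengths[key]:
            stale.append((key, frame, bodypart))
    return stale

# Hypothetical annotation saved at full frame rate, then reused with 5X downsampling.
video_lengths = {"recording_a": 60000}
sample_keys = [("recording_a", 54118, "nose"), ("recording_a", 100, "nose")]

stale = find_out_of_bounds(sample_keys, video_lengths, downsample_rate=5)
print(stale)
# [('recording_a', 54118, 'nose')]
```

An entry that was valid at full frame rate (54118 < 60000) becomes invalid once its index is multiplied by the downsampling factor.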

mshallow · 2024-06-14T20:12:24Z

I also don't know if anyone else has encountered general bugginess with the calibration, but I feel like in more recent times I've tried to use it, the loading of the frames seems really glitchy. It'll load the first couple for me to click through smoothly and then after that you have to advance two or three at a time to get it to change frames or load the image and not just the skeleton, unless I wait a long time (around 10+ seconds), before I try to advance.

calebweinreb · 2024-06-14T22:01:46Z

That's different from other bug reports but glitchiness is the consensus. We're planning to change the backend for calibration in the next release.

mshallow · 2024-06-18T22:12:17Z

Following up on downsampling bugs: initial calibration step is no longer erroring, but after downsampling, there appear to be strange issues with the trajectory plots and grid videos. From what I can tell, it seems like something with the scaling for generating these plots and videos is off.
The trajectory plots that are generated after the PCA to initialize the model look like the correct scale (see attached photo) but after training the model, all of the points are overlaid on top of each other (second attached photo). The same issue seems to apply to the grid movies, where everything is super zoomed out rather than cropping and zooming in on the mouse like it did previously. None of these lines of code throw actual errors, just user warnings, but I was curious if solving these warnings would change the output or if there is something else going on.
Before training model plots from PCA:

Trajectory plots after training:

Grid movies after training:

I tried running this with two different downsampling rates as well as two different kappa values and encountered the same issues.

calebweinreb · 2024-06-19T21:35:06Z

Hi,

Hmm that's weird! But it doesn't strike me as related to downsampling per se. Have you ever run kpms without downsampling? Did it work in that case?

mshallow · 2024-06-19T21:37:02Z

Yea it has always worked without downsampling. I just tried it again on the same dataset without downsampling since previous attempts without downsampling were run on a different computer, and did not encounter this issue.

mshallow · 2024-06-19T21:41:18Z

This is the same set of outputs from a run without downsampling:

calebweinreb · 2024-06-19T21:41:35Z

Hmm maybe the fitting got wonky for some reason. Can you try exporting the inferred coordinates and see if they're weird?

https://keypoint-moseq.readthedocs.io/en/latest/advanced.html#exporting-pose-estimates

mshallow · 2024-06-19T21:52:39Z

The first half of that code runs fine, and then when it gets to making the video and overlaying the coordinates, it gives me a similar issue to what I was encountering with the calibration earlier.
Error message:
IndexError: index 1495 is out of bounds for axis 0 with size 1495
Just by eye, there are a few coordinates that look a little weird (a bunch of negative values or really high values) but I'm not quite sure how to systematically check that.
These are the coordinate outputs from the first video in the dictionary for the downsampled data and not downsampled data.
Downsampled:

Not downsampled:
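
One way to check the exported coordinates systematically is to summarize each recording's NaN fraction, value range, and the share of values outside a plausible pixel range. This is a minimal sketch, assuming the coordinates are a dictionary of NumPy arrays as elsewhere in this thread. The function name and the default bounds (0 to 2000 pixels) are hypothetical and should be replaced with the real frame dimensions:

```python
import numpy as np

def summarize_coordinates(coordinates, lower=0.0, upper=2000.0):
    """Per-recording NaN fraction, value range, and fraction outside [lower, upper]."""
    summary = {}
    for key, array in coordinates.items():
        array = np.asarray(array, dtype=float)
        valid = array[~np.isnan(array)]           # drop NaNs before taking min/max
        outside = (valid < lower) | (valid > upper)
        summary[key] = {
            "nan_fraction": float(np.isnan(array).mean()),
            "min": float(valid.min()) if valid.size else float("nan"),
            "max": float(valid.max()) if valid.size else float("nan"),
            "outside_fraction": float(outside.mean()) if valid.size else float("nan"),
        }
    return summary

# Tiny made-up example: 3 frames, 1 keypoint, (x, y).
example = {"recording_a": np.array([[[10.0, 20.0]], [[-5.0, 3000.0]], [[np.nan, 40.0]]])}
summary = summarize_coordinates(example)
for key, stats in summary.items():
    print(key, stats)
```

Recordings with a high `nan_fraction` or `outside_fraction` would be the first candidates to inspect by eye.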

mshallow · 2024-08-08T21:56:29Z

I just wanted to check back in and see if you had any more insight as to what could be going wrong with the downsampling here. I have tried this a couple more times with different kappa values and clearing all of the outputs from the directory before starting with the downsampling and the same issue arises every time where it appears to have trained fine and have decent median syllable lengths, but it only finds about 2 syllables and the trajectory plots look very wrong.

calebweinreb · 2024-08-12T20:05:03Z

So just to confirm:

without downsampling, everything runs fine and the syllables look normal
with downsampling by 5X, only 2 syllables are found (based on trajectory plots)
also there's an additional issue of an IndexError when making grid movies?

I'm not totally sure what the problem is... how much total data do you have? Are there still very few syllables (+ weird trajectory plots) when you only downsample 2X?

mshallow · 2024-08-12T22:32:06Z

Yes correct. Without downsampling, everything looks completely fine and the syllables look normal, but with downsampling it finds very few states and the plotting/ grid movies look really weird.
I have been running this on a test dataset of ~40 videos that are 1-5min in length at 200fps so should contain a couple hundred thousand frames as was suggested in your documentation. I just tried running the downsampling with 2x downsampling factor and it found ~10 syllables, but the same issue still occurs with the trajectory plots and grid movies.

calebweinreb · 2024-08-12T23:54:17Z

Hmm maybe at this point the easiest thing would be to send me the dataset and the code you have been running

mshallow · 2024-08-13T00:03:03Z

I'd be happy to do that! What is the easiest format to get the data to you? I've just been working in your demo jupyter notebooks for the majority of the code base, but I can share the versions I'm using for any of the changes I've made over time.

calebweinreb · 2024-08-13T01:12:52Z

Google drive or dropbox to [email protected]

mshallow · 2024-08-13T22:19:25Z

Data and notebooks are still uploading, but I shared a dropbox folder that should have what I have been working with. Do you need any of the model checkpoints or outputs as well?

mshallow · 2024-08-26T18:29:47Z

Any insights into what might be happening with the downsampling?

calebweinreb · 2024-08-26T19:28:33Z

I haven't confirmed it yet but I think the problem is coming from two recordings where the keypoint tracking went screwy. You can see which ones in the y-axis labels of the attached screenshot

Can you try modeling with those recordings excluded and let me know how it goes?

mshallow · 2024-08-26T19:32:40Z

I’ll try that right now!

mshallow · 2024-08-26T20:38:34Z

That doesn't seem to have changed it. The training of the model looked a little better, longer and more states, but the trajectory plots still are weirdly small.

calebweinreb · 2024-08-26T20:45:51Z

OK I'll take another look.

mshallow · 2024-08-27T18:33:57Z

I also tried running the modeling with the data at its full frame rate (no downsampling) and now the trajectory plots for that run also look the same.

calebweinreb · 2024-08-28T02:28:41Z

Hi,

So I tried modeling your data in a bunch of different configurations:

latent_dim=2 vs. latent_dim=4
with or without excluding the bad session mentioned above
with or without location-aware modeling
with 5X downsampling or no downsampling

In every case, the trajectory plots looked very reasonable and didn't look like the screenshot you posted above. I'm not entirely sure what's going on. You can see all the notebooks I used and their full output including checkpoints and everything here: https://www.dropbox.com/scl/fo/pqrgdrymulcdxnhrieelw/AMh3rkkkB8vSYM8U_X2Qeug?rlkey=bb8jhe32t2hprxcp51ok430aa&dl=0

Separate from the trajectory plot issue, I had some suggestions based on your modeling notebook:

Only using the right ear for anterior_keypoints leads to the mouse always being slightly tilted. It would be better to set anterior_keypoints=["Rear", "Lear"]
latent_dim=2 is very low (even if it does explain 90% of variance). I would suggest latent_dim=3 or 4 instead.
In the kappa scan, kappa is decreased by a factor of 10 between the ar_only and full model steps. You should recapitulate this when you run those steps yourself.
If you plan to set "location_aware=True" in your final modeling then you should also do so for the kappa scan.
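
The kappa relationship in the third suggestion can be written out as a small sketch. The scan range below (five log-spaced values from 1e3 to 1e7) is an assumption chosen for illustration, and `decrease_kappa_factor` is just a variable name used here. Only the factor of 10 comes from the suggestion above:

```python
# Assumed scan range: five log-spaced kappa values from 1e3 to 1e7.
kappas = [10.0 ** exponent for exponent in range(3, 8)]
decrease_kappa_factor = 10  # factor of 10 between the ar_only and full model steps

schedule = []
for ar_only_kappa in kappas:
    full_model_kappa = ar_only_kappa / decrease_kappa_factor
    schedule.append((ar_only_kappa, full_model_kappa))
    print(f"ar_only kappa={ar_only_kappa:.0e}  full model kappa={full_model_kappa:.0e}")
```

When fitting manually, the same pairing would apply: whatever kappa is used for the ar_only step, the full model step uses that value divided by 10.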

mshallow · 2024-08-28T17:58:58Z

Thank you for all the help! I now think it maybe is just some weird bug with jupyter lab holding onto some sort of incorrect coordinates because restarting things at least fixed the trajectory plots for the non downsampled data. I’ll see if starting over from scratch or using the notebooks that worked for you will solve the issue for me. Also, thanks for the tips about the kappa scan and values for the different steps of the model, I hadn’t noticed that I forgot to use location aware in the scan. I noticed that the kappa scan only goes up to 1E7, for different HMMs I’ve run in the past we occasionally have used much higher kappa values. Is the 1E7 the max that you would suggest for this model or was that just the highest value that the scan reaches up to? Thanks again for all your help with the troubleshooting!

Best,
Molly

calebweinreb · 2024-08-28T19:05:38Z

The maximum of 1e7 is just because that's typically high enough. But higher values could definitely be necessary, especially for high speed video where the syllable durations are a higher number of frames.
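
To illustrate the point about frame counts, the sketch below converts one duration into frames at the two frame rates discussed in this thread. The 0.4 s duration is a hypothetical value picked for the example, not a recommendation, and the helper name is made up:

```python
def duration_in_frames(duration_seconds, fps):
    """Convert a duration in seconds to a whole number of frames."""
    return round(duration_seconds * fps)

# Hypothetical syllable duration of 0.4 s; the frame rates are the ones in this thread.
frame_counts = {fps: duration_in_frames(0.4, fps) for fps in (200, 40)}
for fps, n_frames in frame_counts.items():
    print(fps, n_frames)
# prints:
# 200 80
# 40 16
```

The same behavior lasts five times as many frames at 200 fps as at 40 fps, which is the sense in which high speed video has syllable durations spanning a higher number of frames.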

mshallow · 2024-08-28T19:07:30Z

Ok that makes sense. I also noticed that you didn’t run the error calibration in any of the notebooks that you made. Is there a reason for that? I am wondering if that is the step that is generating the weirdness with the plots.

Molly

calebweinreb · 2024-08-28T19:39:36Z

The only way error calibration would affect things is via the "slope" and "intercept" parameters under "error_estimator" in the config. It's possible you were using different values of those parameters, which could have caused the problem.

mshallow · 2024-08-28T20:08:40Z

The slope and intercept values I had were slightly different than yours, but also about the same for the downsampled and full data and didn’t run into that issue all the time.

mshallow · 2024-08-30T02:39:17Z

Ok as far as I can tell, the only difference between how you had the notebooks you created running and how I had been working with the data is the jitter value. Due to NaNs in the dataset, previously you had suggested that I set that value to the maximum value of 1e-1. When I do this, that is when I get the strange coordinates and the trajectory plots don't look correct. If I leave it at the default value, the training of the model terminates early after finding too many NaNs in the data, but the trajectory plots are not messed up. Without jitter=1e-1, it generally gets through about 50% of the iterations, sometimes 60-65%, before it terminates. Would you say that is enough iterations, or should I figure out a way to get it to run for the full 500?

calebweinreb · 2024-08-30T18:09:46Z

I'm going to test a few things to get rid of the NaNs. Also I have an idea to make your trajectory plots render more clearly but I'd like to test it on a real example. Would you be willing to send me the results.h5 file from which the weird looking trajectory plots were generated?

mshallow · 2024-08-30T18:38:59Z

Yes I’ll put one in the same dropbox file once I find one where they were generated weirdly. I’ve run so many tests on this in the past 48hrs I need to find the right one.

Molly

mshallow · 2024-08-30T20:56:02Z

I uploaded several folders from different times that I tried to train the model, two with the weird trajectory plots, and one without. The two models that were trained today (8/30) are identical except for the jitter value; the one with normal trajectory plots (11_51_33) was run with a jitter=1e-2 and the one with weird plots (12_35_27) was run with 1e-1.
I uploaded everything including checkpoints, training progress and the plots and grid videos as well as the results.h5 just in case you needed anything else.
Neither of these trajectory plots is the full size that the non-downsampled data give, but the one without jitter=1e-1 is definitely better.

mshallow · 2024-09-17T22:20:56Z

Did you ever figure out a way to eliminate some of the NaNs or get the trajectory plots to render more clearly? I tested a couple more things and without the jitter=1E-1 the plots are definitely better but still not the full size that they were without downsampling. Does that have something to do with the amount of data or is it just a visualization change? I'm trying to figure out how important the size of those plots is or if the syllable data are fine and could be used for further data analysis.

calebweinreb · 2024-09-20T14:57:40Z

Hey! So sorry for the late response. A bunch of stuff came up and I lost the thread on this. Overall, regarding NaNs, I don't have a full solution but in general would recommend cleaning up the keypoint data as much as possible and removing any sessions that seem like they contain a lot of errors. But it seems like you can get pretty decent results before the NaNs happen so it might be fine to just go with that.
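
One rough way to spot sessions worth removing is to measure how many frames in each session contain a NaN or low-confidence keypoint. The sketch below is only an illustration: `bad_frame_fraction` and `drop_noisy_sessions` are hypothetical helpers (not part of keypoint_moseq), the 0.5 confidence cutoff and 20% threshold are arbitrary, and it assumes `coordinates` and `confidences` are dicts of non-empty arrays shaped (frames, keypoints, 2) and (frames, keypoints).

```python
import numpy as np

def bad_frame_fraction(coordinates, confidences=None, min_confidence=0.5):
    """Per-session fraction of frames with a NaN or low-confidence keypoint.

    Hypothetical helper; thresholds are assumptions, not library defaults.
    """
    fractions = {}
    for name, xy in coordinates.items():
        xy = np.asarray(xy, dtype=float)
        # A keypoint is "bad" if either of its coordinates is NaN
        bad = np.isnan(xy).any(axis=-1)
        if confidences is not None:
            # ... or if its confidence falls below the chosen cutoff
            bad |= np.asarray(confidences[name]) < min_confidence
        # A frame is "bad" if any of its keypoints is bad
        fractions[name] = float(bad.any(axis=1).mean())
    return fractions

def drop_noisy_sessions(coordinates, confidences, max_bad_fraction=0.2):
    """Keep only sessions whose bad-frame fraction is at or below the threshold."""
    fractions = bad_frame_fraction(coordinates, confidences)
    keep = [name for name, frac in fractions.items() if frac <= max_bad_fraction]
    return ({name: coordinates[name] for name in keep},
            {name: confidences[name] for name in keep})
```

Printing the output of `bad_frame_fraction` before dropping anything is probably the safer first step, since the right threshold depends on the dataset.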

Regarding the trajectory plots, I was hoping to implement a more general solution, but for now you can use the `lims` parameter of `generate_trajectory_plots`. Basically what's currently happening is that the limits are being set too large, so all the keypoints get smooshed together. You can avoid that by setting the lims manually. Here's the relevant part of the docstring:

lims: ndarray of shape (2,2), default=None
        Axis limits used for all the trajectory plots with format
        `[[xmin,ymin],[xmax,ymax]]`. If None, the limits are determined
        automatically based on the coordinates of the keypoints using
        :py:func:`keypoint_moseq.viz.get_limits`.

I would recommend experimenting until you get a good size where the keypoints aren't getting cropped but you can also see them clearly.
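
As a minimal sketch of what setting the limits manually could look like: `square_lims` below is a hypothetical helper (not part of keypoint_moseq) that builds an array in the `[[xmin,ymin],[xmax,ymax]]` format from the docstring. It assumes the plotted trajectories are roughly centered on the origin, and the half-width of 50 is just a placeholder to tune. The commented-out call shows how the result might be passed in; the other arguments there are assumed from the usual notebook workflow.

```python
import numpy as np

def square_lims(half_width, center=(0.0, 0.0)):
    """Build axis limits as [[xmin, ymin], [xmax, ymax]] around `center`.

    Hypothetical helper for illustration only.
    """
    center = np.asarray(center, dtype=float)
    return np.stack([center - half_width, center + half_width])

# Placeholder value; adjust until keypoints are neither cropped nor tiny
lims = square_lims(50)
print(lims)

# Assumed usage, with variables from the standard notebook workflow:
# kpms.generate_trajectory_plots(
#     coordinates, results, project_dir, model_name, lims=lims, **config()
# )
```

Trying a few half-widths and comparing the output plots is probably the quickest way to find a size that works.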
