Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect PTS after 12 hours of process operation #197

Open
restorepoint1 opened this issue Nov 16, 2024 · 29 comments
Open

Incorrect PTS after 12 hours of process operation #197

restorepoint1 opened this issue Nov 16, 2024 · 29 comments
Labels
bug Something isn't working Unable to Reproduce Developers are unable to reproduce the issue

Comments

@restorepoint1
Copy link

Did you read the guide?

Yes

What is the base version number of MotionPlus being used?

0.2.x

What was the install method?

Built from source code

What is base architecture?

x86/x64

What is the distro being used?

Ubuntu

Disto version number

24.04

Camera/Sound type(s) being used?

Network camera with RTSP

Describe the issue/problem and steps to reproduce

Motionplus d79e5c5 commit

I have 2 cameras with different configs: CAM1 and CAM2
CAM1 config:

device_name CAM1
device_id 1
pause on

netcam_url rtsp://x.x.x.x
width 1920
height 1080

timelapse_interval 1
timelapse_filename laps_%Y-%m-%d
timelapse_container mpeg4

picture_output off
picture_filename pic_%Y-%m-%d_%H-%M-%S

movie_output on
movie_filename mov_%Y-%m-%d_%H-%M-%S
movie_passthrough on

So it is 1 fps 24/7 daily laps recording and external api calls (eventstart and eventend) for 5 fps movies. Everything was ok but today after 12:00AM movies become broken until sudo service motionplus restart
All files attached

image_2024-11-16_18-48-54

journalctl.zip
mov_ok.zip
mov_broken.zip

Relevant MotionPlus log output (at log_level 8)

Attached
@restorepoint1 restorepoint1 added the bug Something isn't working label Nov 16, 2024
@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 16, 2024

Looks like somehow the PTS got messed up. Can you run the following command and tell me the value of the first PTS in the resulting file?

ffprobe -v 0 -show_entries packet=pts,duration -select_streams v ./mov_2024-11-16_17-39-35.mkv >probetest.txt
(I want to see if the PTS reset to a different number after Motionplus was restarted.)

Also, do you know if the camera rebooted at 12?

@restorepoint1
Copy link
Author

Autoreboot is configured every tuesday and there is no any logs about linkdown (Last Link Up Time: Nov/12/2024 04:20:05). Today after midnight the situation is same, new movies after 0:00 are broken (0:00 is time of creation new laps file). I can keep it broken for you

image
probetest_ok.txt
probetest_broken.txt

@restorepoint1
Copy link
Author

I used old motion (not motionplus) for years and there was no such issue.
Also i can switch timelapse to hourly to check movies

@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 17, 2024

Just to confirm. This is passthrough recording correct?

If so, when it is broken, can you use the web control to change the fflog_level to something higher so that those messages appear. Also validate that the camera is still feeding live images and that the problem is with the recording instead of the capture.

@restorepoint1
Copy link
Author

restorepoint1 commented Nov 17, 2024

This is passthrough recording correct?

Yes, movie_passthrough on

validate that the camera is still feeding live

Camera is working correcly, i can watch it via vlc rtsp, motion web and also timelapse has correct recording during broken movies time

I have changed fflog level but there is no changes. So i changed it in conf, switched timelapse to hourly and restarted service.

Where is fflog has to be stored? I am not observing any changes in journalctl.

@restorepoint1
Copy link
Author

After i switched timelapse to hourly the movie file between two hourly files are ok yet

@restorepoint1
Copy link
Author

Today broken videos started after 08:00
image

I will try to switch it back to daily

@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 18, 2024

I've still not been able to recreate this but I think that the time of the broken MKVs being correlated to the timelapse option provides some clues.

In Motionplus the movie and timelapse are separate objects but perhaps there are some interactions inside the ffmpeg libs. If the broken MKVs go back to being corrupted past midnight once you change it back to daily, I think we can confirm that something behind the scenes is occurring.

As for the resolution, my first thought would be to change this line from retcd = av_interleaved_write_frame(oc, pkt); to retcd = av_write_frame(oc, pkt); and see if that was the cause of the cross contamination.

Let me know if you'd be able to test that change.

@restorepoint1
Copy link
Author

Today (when hourly timelapse configured) issue reverted:
Screenshot_2

let me know if you still need test with movie.cpp

@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 19, 2024

You can try it. I'm at a loss as to what could be causing this. From the ffprobe of the broken MKVs, it appears that something this wrong with the PTS values. But there isn't much that Motionplus does with those. It just copies from the source, rescales to the container and writes them.

It could be interesting to know whether the issue persists when there is not any timelapse running. Or whether it happens when the mov or mp4 containers are used. After that, I'd think that a lot of debugging messages would have to be added to determine when the PTS values were getting messed up. e.g. Are they coming in from the camera wrong, rescaled wrong or only after they are written out. (That would be a massive amount of output since the issue is intermittent.)

@Mr-Dave Mr-Dave added the Unable to Reproduce Developers are unable to reproduce the issue label Nov 22, 2024
@restorepoint1
Copy link
Author

is there any little manual to build it locally using buildplus.sh?
or should i read source code of .sh?

@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 22, 2024

You could use the buildplus.sh but that could be a bit much for testing purposes. (I'd only do that for long term use). I would just create a directory in your home and follow the directions for building except do not do the make install.

Once it is built, use the command /path/to/build/of/motionplus/src/motionplus -c /path/to/your/motionplus.conf That will use your newly built version of Motionplus without changing the existing install.

If you want to use the changed version long term, then you could use the buildplus.sh. If you put the buildplus.sh script into the motionplus directory and run it from there, it will use that local copy of the source code rather then cloning a fresh version from this repo.

@restorepoint1
Copy link
Author

I didnt build yet, but maybe it is important info to reproduce that i have second camera (almost the same) with config:

emulate_motion on

device_name CAM2
device_id 2

netcam_url x.x.x
width 800
height 448

timelapse_interval 0
timelapse_filename laps_%Y-%m-%d
timelapse_container mpeg4

picture_output off
picture_filename pic_%Y-%m-%d_%H-%M-%S

movie_output on
movie_filename mov_%Y-%m-%d_%H-%M-%S
movie_passthrough on
movie_max_time 3600

target_dir /mnt/share/camera/cam2

@Mr-Dave
Copy link
Member

Mr-Dave commented Nov 29, 2024

You could also turn off the timelapse and see if issue persists

@restorepoint1
Copy link
Author

restorepoint1 commented Nov 30, 2024

Yep, issue still perisist with timelapse disabled, that means the reason may be CAM2. I will disable it today (pause on, emulate_motion off)

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 2, 2024

Issue still persist with timelapse disabled and pause on on CAM2
But i have noticed that break time is not 0:00 but between 09:16 and 11:13, i have restarted service at 21:49 day before (maybe it is 12-hour countdown or some bug with 12 and 24-hour clock?).

proccess restart at Dec 01 21:11:56 - broke between Dec 02 09:16 and 11:13
proccess restart at Nov 30 12:24:04 - broke between Dec 01 01:15 and 10:04
proccess restart at Nov 24 21:57:14 - broke between Nov 25 9:26 and 11:40, fixed by itself between Nov 25 19:24 and Nov 26 06:37, broke again between Nov 26 17:32 and 17:45
proccess restart at Nov 20 07:01:03 - broke between Nov 20 18:39 and 20:01
proccess restart at Nov 17 20:42:21 - broke between Nov 18 8:12 and 10:02, fixed by itself between Nov 18 20:35 and Nov 19 06:32, broke again between Nov 19 16:42 and Nov 19 17:45
proccess restart at Nov 16 17:43:10 - broke between Nov 16 23:13 and Nov 17 11:14

@Mr-Dave
Copy link
Member

Mr-Dave commented Dec 4, 2024

Were you able to compile it and run with the test version?

@restorepoint1
Copy link
Author

Yes i can, but i have no free time next few days

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 8, 2024

i have built noble_motionplus_0.2.0-dirty-20241121-3e28491-1_amd64.deb, there is no any changes, pts is broken after 12 hours
service restart at Dec 07 19:10, broke between Dec 08 07:18 and 10:00

Also my cpu was 100% for a long time (till i restarted process):
CPU%▽MEM% TIME+ Command
100.7 1.4 18h46:05 /usr/bin/motionplus

UPD: i cant reproduce 100% cpu utilization with CAM1 config only. I'll try to return CAM2 config.
UPD2: i successfully reproduced 100% util with CAM1+CAM2 configs

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 160759 motion    20   0 1173664 272428  64256 S 111,7   1,7     11,50 mp00   

@Mr-Dave
Copy link
Member

Mr-Dave commented Dec 10, 2024

Using the deb creation script probably isn't practical for debugging what may be occurring with your cameras. You'll probably need to compile and run separately to understand what is occurring. Probably also want to review the logs and/or thread cpu usage via htop to understand what is occurring.

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 10, 2024

htop.txt
journalctl.txt

UPD:
Just noticed even clean git commit 3e28491 causes 100% cpu utilization
MotionPlus version 0.2.0-git-20241121-3e28491
Should i create new issue for cpu util?

@restorepoint1 restorepoint1 changed the title Broken .mkv movies after midnight Incorrect PTS after 12 hours of process operation Dec 11, 2024
@Mr-Dave
Copy link
Member

Mr-Dave commented Dec 11, 2024

You can try to change this line to retcd = av_write_frame(oc, pkt); and see if that fixes the broken mkvs.

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 12, 2024

You can try to change this line to retcd = av_write_frame(oc, pkt); and see if that fixes the broken mkvs.

I have built it already at #197 (comment) and it didnt solve the issue

@restorepoint1
Copy link
Author

What kind of debug info can i provide for you?

@Mr-Dave
Copy link
Member

Mr-Dave commented Dec 15, 2024

This one is still confusing. There isn't a whole lot happening to the PTS/DTS between the camera and writing it out. All I can suggest are various things to try in order to determine the cause.

After implementing the change to av_interleaved_write_frame, wasn't there any sort of change in log messages? I expected that this message would have flooded the log when the bad files were created.

You can try to put
MOTPLS_LOG(INF, TYPE_ENCODER, NO_ERRNO, "invalid base_pddts");
in here and here to see if somehow the base pts is getting messed up. Maybe change the wording slightly for each so we know which one it is failing upon.

You can try to put
return -1;
in here and here (same spots as above) to not write out the packets if the base pts is messed up.

You can try to put

    av_dict_set(&opts,"allowed_media_types","video", 0);
    av_dict_set(&opts,"allowed_media_types","audio", 0);

In here (right before last brace) and we can see whether it is some other stream from the camera causing the problem.

You can try using
netcam_params allowed_media_types=video
in the configuration to see if it is a non video stream causing the problem.

@restorepoint1
Copy link
Author

Step 1/4
This is not working at all

    av_dict_set(&opts,"allowed_media_types","video", 0);
    av_dict_set(&opts,"allowed_media_types","audio", 0);

av_dict_set.txt

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 20, 2024

Step 2-3/4
I have applied both fixes. No any changes, no new logs. Service started at 2024-12-19 12:19:34, video broke between 2024-12-20 01:30 and 01:40

        if (pkt->pts < base_pdts) {
            MOTPLS_LOG(INF, TYPE_ENCODER, NO_ERRNO, "invalid base_pddts");
            return -1;
        }

motpls_return.txt

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 21, 2024

Step 4/4
This also didn't help (in global motionplus.conf). Service started at 2024-12-20 7:24, broke between 20:30 and 20:40

netcam_params allowed_media_types=video

Any idea?
Idk why, but second camera works properly

@restorepoint1
Copy link
Author

restorepoint1 commented Dec 24, 2024

Interesting.

What i had before:
cam1.conf:
passthrough netcam_url rtsp://[hi-res outdoor url] + timelapse 1 fps + events 5 fps triggered by api
cam2.conf:
passthrough netcam_url rtsp://[low-res indoor url] + emulate_motion on + split 1h

cam1 (outdoor) events broke after ~13h15min, cam2 is ok.

I swapped urls only, it become:
cam1.conf:
passthrough netcam_url rtsp://[low-res indoor url] + timelapse 1 fps + events 5 fps triggered by api
cam2.conf:
passthrough netcam_url rtsp://[low-res outdoor url] + emulate_motion on + split hourly

cam2 videos (that now are low-res hourly videos from outdoor camera) broke after 13h15min

I have changed both urls to hi-res (fhd) and everything seems to be ok

Hi-res is 1920x1080, low-res is 800x448

UPD: cannot reproduce, issue still persist

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Unable to Reproduce Developers are unable to reproduce the issue
Projects
None yet
Development

No branches or pull requests

2 participants