
Snapshot should write to a different file every time it is triggered #1099

Open
bijoua29 opened this issue Sep 25, 2022 · 26 comments · Fixed by #1842
Labels
enhancement New feature or request

Comments

@bijoua29

Description

When you trigger multiple snapshots through the service call, each snapshot's data is written to the same file. This is not very useful when taking multiple snapshots.

The typical use case for the snapshot feature is to save a snapshot when the user sees something "interesting". This might happen multiple times; triggering a snapshot is usually not a one-time event.

Related Issues

N/A

Completion Criteria

  • Save to a different file when a snapshot is triggered
  • Add ability to specify filename for each snapshot or allow option for timestamped file

Implementation Notes / Suggestions

N/A

Testing Notes / Suggestions

N/A

@bijoua29 bijoua29 added the enhancement New feature or request label Sep 25, 2022
@swiz23

swiz23 commented Feb 10, 2023

I second this feature request. This seems helpful.

@wvergouwenavular

+1

@emersonknapp
Collaborator

Note - the recorder does provide the service split_bagfile - you could call that immediately after the snapshot to split. Though, I think it's not unreasonable to provide a split option in Snapshot.srv so that it can happen atomically. PRs welcome!
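For reference, a minimal sketch of that workaround as an rclcpp client. The service names assume the recorder runs under its default node name (rosbag2_recorder); adjust them to your setup:

    #include <chrono>
    #include <memory>

    #include "rclcpp/rclcpp.hpp"
    #include "rosbag2_interfaces/srv/snapshot.hpp"
    #include "rosbag2_interfaces/srv/split_bagfile.hpp"

    int main(int argc, char ** argv)
    {
      rclcpp::init(argc, argv);
      auto node = rclcpp::Node::make_shared("snapshot_then_split");

      // Service names below assume the default recorder node name.
      auto snapshot_client = node->create_client<rosbag2_interfaces::srv::Snapshot>(
        "/rosbag2_recorder/snapshot");
      auto split_client = node->create_client<rosbag2_interfaces::srv::SplitBagfile>(
        "/rosbag2_recorder/split_bagfile");

      snapshot_client->wait_for_service(std::chrono::seconds(5));
      split_client->wait_for_service(std::chrono::seconds(5));

      // Flush the snapshot buffer into the currently open bag file ...
      auto snapshot_result = snapshot_client->async_send_request(
        std::make_shared<rosbag2_interfaces::srv::Snapshot::Request>());
      rclcpp::spin_until_future_complete(node, snapshot_result);

      // ... then split right away so the snapshot data ends up in a closed file of its own.
      auto split_result = split_client->async_send_request(
        std::make_shared<rosbag2_interfaces::srv::SplitBagfile::Request>());
      rclcpp::spin_until_future_complete(node, split_result);

      rclcpp::shutdown();
      return 0;
    }

This is not atomic: another snapshot could in principle be triggered between the two calls, which is why an atomic split option in Snapshot.srv would be more robust.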

@Timple

Timple commented Jan 22, 2024

Shall we extend the Snapshot.srv from this:

---
bool success

To this?

string filename   # Name of bag file to save snapshot to.
---
bool success

In line with this: https://github.com/ros/rosbag_snapshot/blob/main/rosbag_snapshot_msgs/srv/TriggerSnapshot.srv

We probably need to come up with a way to append the timestamp as well. Perhaps if we enter something like my_file_{stamp}, this would result in my_file_2024_01_22-14_18_28/?
Suggestions welcome.
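For illustration only, a minimal sketch of how such a {stamp} placeholder could be expanded; the placeholder name, the helper function, and the timestamp format are just assumptions based on the suggestion above, not an existing rosbag2 API:

    #include <ctime>
    #include <iomanip>
    #include <sstream>
    #include <string>

    // Replace a "{stamp}" placeholder with a formatted local time,
    // e.g. "my_file_{stamp}" -> "my_file_2024_01_22-14_18_28".
    std::string expand_stamp_placeholder(const std::string & pattern, std::time_t now)
    {
      std::ostringstream stamp;
      stamp << std::put_time(std::localtime(&now), "%Y_%m_%d-%H_%M_%S");

      std::string result = pattern;
      const std::string placeholder = "{stamp}";
      const auto pos = result.find(placeholder);
      if (pos != std::string::npos) {
        result.replace(pos, placeholder.size(), stamp.str());
      }
      return result;
    }

Calling expand_stamp_placeholder("my_file_{stamp}", std::time(nullptr)) at the moment the snapshot is triggered would then yield something like my_file_2024_01_22-14_18_28.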

@MichaelOrlov
Contributor

@Timple Extending Snapshot.srv with a few new parameters is not a big deal or a problem.
However, it turns out that it is not trivial to implement this feature on the lower rosbag2 layers.
There is also another request to extend the snapshot feature to be able to write the buffer after the event as well.
These two extensions overlap somewhat in implementation, and even on their own they are not trivial to implement.

@bijoua29
Author

I guess I'm surprised that just adding the ability to name the file is not trivial to implement, but I have no knowledge of the structure of the code.

@Timple

Timple commented Feb 15, 2024

Would it make sense to align on the service call syntax already?

ROS 2 Jazzy is coming up, and changing a service after the release won't be possible, I guess. Implementation can then be worked on after the interface is agreed upon.

@bijoua29
Author

Yes, it would make sense to align on the service call syntax. Was there a proposal for an updated syntax?

@Timple

Timple commented Feb 15, 2024

#1099 (comment) ?

@reinzor
Contributor

reinzor commented Apr 9, 2024

Hi all,

Thanks for all the information so far. I've read this issue and studied this PR with all its comments. Based on this info and some fiddling in the rosbag2_cpp package, I came up with the following plan:

  • Extend Snapshot.srv to include a name, as suggested here
  • This name can then be passed to the take_snapshot method as an input argument.
  • The SequentialWriter will then receive the name and implement the following logic after the notify_data_ready call:
    • Store the current storage_ path
    • Call the switch_to_next_storage method
    • Rename the last written storage file and alter the metadata (rename based on the name and the current stamp)
    • Write the metadata (as is being done in the close() method)

I quickly implemented this, and it seemed to work.

This will give the following output:

.
└── rosbag2_2024_04_09-14_59_55
    ├── metadata.yaml
    ├── rosbag2_2024_04_09-14_59_55_0_snapshot_A.mcap
    ├── rosbag2_2024_04_09-14_59_55_1_snapshot_B.mcap
    ├── rosbag2_2024_04_09-14_59_55_2_snapshot_C.mcap

@bijoua29 @dkorenkovnl @Timple @MichaelOrlov could you give your feedback on this approach before I send a PR? Thanks!

@MichaelOrlov
Contributor

@reinzor That is yet another very hacky way to implement this feature.

  1. I don't like the idea of renaming a file right after writing it, because we should get the bag_events::BagEvent::WRITE_SPLIT callback right after creating the new file and closing the old one. Essentially, we need the functionality from SequentialWriter::split_bagfile().
  2. SequentialWriter::switch_to_next_storage() will call cache_consumer_->stop();, which will try to swap the circular buffers again and flush data to disk one more time. There will be a race condition with the data dump already in progress from the buffer. If we are lucky and have a lot of messages in the buffer and a slow HDD, we might get away with it. If not, we will randomly dump extra data one more time, which we were not asked for. This is undefined behavior!

@reinzor
Contributor

reinzor commented Apr 11, 2024

@reinzor That is yet another very hacky way to implement this feature.

  1. I don't like the idea of renaming a file right after writing it, because we should get the bag_events::BagEvent::WRITE_SPLIT callback right after creating the new file and closing the old one. Essentially, we need the functionality from SequentialWriter::split_bagfile().
  2. SequentialWriter::switch_to_next_storage() will call cache_consumer_->stop();, which will try to swap the circular buffers again and flush data to disk one more time. There will be a race condition with the data dump already in progress from the buffer. If we are lucky and have a lot of messages in the buffer and a slow HDD, we might get away with it. If not, we will randomly dump extra data one more time, which we were not asked for. This is undefined behavior!

Thanks for your feedback @MichaelOrlov; could you perhaps point me in the right direction for this implementation? Regarding point 1, what would be your suggestion instead of renaming? As far as I know, the name of the file is already defined when the recording starts, so if we would like the snapshot to have a specific filename, that would not be possible. Could we maybe add some additional data to the custom_data structure in the metadata? As for point 2, it is not very clear to me how the internal consumer works, so how can I implement this properly without the undefined behavior? Thanks again!

@Timple

Timple commented Apr 11, 2024

I believe #1584 was a step in that direction. However, some workarounds were proposed there, and now it's closed.
It does contain a hint on where to start it seems: #1584 (comment)

@MichaelOrlov you seem to be the most familiar with the codebase in this thread. Would it be possible to write up a small set of steps to take to tackle this issue?

@MichaelOrlov
Contributor

I am sorry folks, I am currently kinda busy with Jazzy release preparation. A lot of PRs and other things need to wrap up.
I can get back to this task after a couple of weeks.
Need to think thoroughly about how we can implement this feature in the best way.
As for the filename, I am not sure it would be possible to enforce a new file name via the snapshot service call without renaming the file. I was thinking that maybe, in this case, we could return the saved filename in the snapshot service response message instead of enforcing a new name?

@Timple

Timple commented Apr 11, 2024

I'm not quite sure why the bagfile is already open in the first place, because that is where this problem originates.

Why not open it upon receiving a snapshot request?

@fujitatomoya
Contributor

Why not open it upon receiving a snapshot request?

Sounds reasonable; the file should be named and created upon the user's snapshot capture request. But I am not sure how we can deal with metadata.yaml. Can it include the information for multiple bags? If not, should it create a rosbag2_2024_04_15-15_19_02?snapshot folder on the user's request?

@reinzor
Contributor

reinzor commented Oct 22, 2024

@fujitatomoya @MichaelOrlov How can we best continue with this implementation? There appears to be a lot of interest in this functionality, together with a fixed duration for a bag file. We can help with the implementation, but this would require some pointing in the right direction. Let us know :).

@fujitatomoya
Contributor

@reinzor @MichaelOrlov @Timple

Just a question though: do we need to support both snapshot and split at the same time? The user expects the snapshot data for a certain cache size. I think that if the data is split into multiple files, that would not be useful for the user? I do not see the actual use case for this...

@bijoua29
Author

@fujitatomoya Yes, I would expect a snapshot for a certain cache size or duration, so I wouldn't expect or need the snapshot to be split.

@MichaelOrlov
Contributor

@reinzor Regarding:

How can we best continue with this implementation?

I see a new related PR #1839. Let me review it first, and then I will try to figure out how to continue.
Glad to hear that you can allocate resources to help with this task.

@MichaelOrlov
Contributor

@bijoua29 We need to move the completion criteria

  • Add ability to specify filename for each snapshot or allow option for timestamped file

to a separate feature-request issue.

@Timple

Timple commented Oct 27, 2024

@reinzor @MichaelOrlov @Timple

Just a question though: do we need to support both snapshot and split at the same time? The user expects the snapshot data for a certain cache size. I think that if the data is split into multiple files, that would not be useful for the user? I do not see the actual use case for this...

Our use case:

In the case of an event (e.g. a robot collision), we want to have a snapshot of (almost) all available data.

Single file

This snapshot should be a single rosbag. Having 12 snapshots in a single file still makes that file hard to offload from the robot. It is also hard to link the file to an event, as it contains multiple events.

Closing the file also allows other synchronization services to kick in. If the file stays open for the duration of the robot's uptime, we would not be able to offload it until the next reboot.

File names

Ideally, we can also control the name of the bagfile so that it can be linked to events more easily, although this can also be solved with metadata inside the bagfile.
If there is a timestamp in the filename, it should correlate either to the time of the snapshot call or to the time of the first data entry.
Having the node's start time as the timestamp in the filename does not make any sense with regard to the data the file contains.

Duration

Having a fixed duration (like 30 seconds) makes more sense for our use case than, say, a fixed size (1 GB of data), because the amount of data can vary and will most likely increase as our software evolves, making the bagfiles shorter.


Not all three points may belong to this issue, but I thought I'd fully clarify our use case.


Edit: Appendix:

We also record all data on our machines, with a cleanup script to prevent the disk from getting full. Currently this is quite a CPU hog, as all data is received twice (once per bag process).

If the standard bag recorder provided a snapshot feature alongside its standard recording, that would be awesome! Although I currently have no idea what this would look like, so it might be far-fetched.

@MichaelOrlov
Contributor

Hi @Timple, thank you for posting real use cases. Your feedback is very valuable.
First of all, I want to clarify that when we were talking about

Just a question though: do we need to support both snapshot and split at the same time?

We meant that a bag split due to the limit imposed on the maximum file size shouldn't happen while we are writing a snapshot to a new file.

As regards filenames: I looked at the code yesterday, and I can say that it is possible to implement a custom name for the snapshot file. We can correct the file name in the metadata before saving and rename the file at the right moment after closing it.
However, I am not yet convinced that a timestamp in the filename makes sense or adds any value. The file timestamp is present in the filesystem attributes and inside the metadata. A sequence number makes more sense, IMO. Please provide more details on this topic if you have them.
Also, a custom filename comes with a few caveats. What if a file with the same name already exists? Or what if the filename is too long or contains a path to an "undesirable" or protected location? I would like to avoid handling all these corner cases. Therefore, having a custom filename is an undesirable feature.

I understand that users want to somehow tag the snapshot and know the recorded filename.
There are a few possible options besides explicitly enforcing a filename for the snapshot.

  1. Provide 'custom-data' as <key, value> pairs during the snapshot service request and save them in the existing custom_data field in the metadata. See:
    std::string compression_mode;
    std::unordered_map<std::string, std::string> custom_data; // {key: value, ...}
    std::string ros_distro;
  2. Subscribe to the Topic: /events/write_split | Type: rosbag2_interfaces/msg/WriteSplitEvent on the client side.
    The Rosbag2 recorder publishes information about the closed and newly opened files on this topic, and we will also publish on it when saving a snapshot to a new file (a minimal client-side sketch follows after this list). Please see the details in:
    auto message = rosbag2_interfaces::msg::WriteSplitEvent();
    message.closed_file = bag_split_info_.closed_file;
    message.opened_file = bag_split_info_.opened_file;
    message.node_name = node->get_fully_qualified_name();
    try {
      split_event_pub_->publish(message);
    } catch (const std::exception & e) {
      RCLCPP_ERROR_STREAM(
        node->get_logger(),
        "Failed to publish message on '/events/write_split' topic. \nError: " << e.what());
    } catch (...) {
      RCLCPP_ERROR_STREAM(
        node->get_logger(),
        "Failed to publish message on '/events/write_split' topic.");
    }
  3. Return the file name of the newly saved snapshot via the service response to the snapshot service request.
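For option 2, a minimal client-side sketch of such a subscriber; only the topic name, message type, and its fields are taken from above, while the node name and logging are illustrative:

    #include "rclcpp/rclcpp.hpp"
    #include "rosbag2_interfaces/msg/write_split_event.hpp"

    int main(int argc, char ** argv)
    {
      rclcpp::init(argc, argv);
      auto node = rclcpp::Node::make_shared("snapshot_file_watcher");

      auto subscription = node->create_subscription<rosbag2_interfaces::msg::WriteSplitEvent>(
        "/events/write_split", 10,
        [logger = node->get_logger()](rosbag2_interfaces::msg::WriteSplitEvent::ConstSharedPtr event) {
          // closed_file is the bag file that was just finalized; when a snapshot is
          // written to a new file, this is the file that contains the snapshot data.
          RCLCPP_INFO(
            logger, "Recorder '%s' closed '%s' and opened '%s'",
            event->node_name.c_str(), event->closed_file.c_str(), event->opened_file.c_str());
        });

      rclcpp::spin(node);
      rclcpp::shutdown();
      return 0;
    }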

The request for a fixed snapshot duration is valid and desirable for almost everybody. However, it is not easy to implement with the current design. We are bound by the double-buffering, lock-free scheme with a fixed size on the rosbag2_cpp layer. Bounding by a time limit would imply intensive dynamic allocations during recording, which we would like to avoid. A compromise solution is likely possible; however, I have not found it yet.

As regards the proposal to save snapshots during normal recording: it is hard to implement and will likely require a similar amount of additional CPU resources as running one more instance of the rosbag recorder in parallel. BTW, it will likely be possible to reduce the CPU burden by disabling discovery in the recorder or increasing its timeout via a CLI option.

@bijoua29
Author

@bijoua29 We need to move the completion criteria

  • Add ability to specify filename for each snapshot or allow option for timestamped file

to a separate feature-request issue.

@MichaelOrlov That's fine to move this to a separate issue.

Regarding the options you propose for the filename, I'm fine with option 3, where the filename is returned in the service response. I assume the service response is returned after the file is written to disk, so I can safely rename the file if I so choose.
Currently, we do something hacky where we trigger the snapshot save, rename the file based on assumptions about when it has been written, and then restart another snapshot. This is prone to errors and possible race conditions.

@MichaelOrlov
Contributor

MichaelOrlov commented Oct 30, 2024

Hi @bijoua29, option 3, returning the filename in the service response, may not work for all use cases.
After #1842 is merged, the snapshot will become a blocking call.

However, at least one corner case exists: when we enable per-file compression and use the sequential_compression_writer. After closing a file and returning from the snapshot, we still run compression in the background in a separate thread, and as soon as the compression finishes, we rename the original file.
In theory, we could predict the new filename after compression and return it. However, that is not what you want if you are going to use it for renaming on your end as well, since when we return the filename from the snapshot, that new file will not exist yet. If you try to wait for it with a polling cycle, there is a risk that you will try to rename it while it has not yet been closed by the Rosbag2 recorder.
Also, to facilitate the other feature request, we will need to add an option to save snapshot data not only from before the snapshot service call but also for some time after it. We will need to reconsider the design of the snapshot service to allow writing incoming data to the swapped buffer while saving a snapshot, and the snapshot call will likely become non-synchronous again.

The most robust way to know when and which filename is available after triggering a snapshot is option 2:

  2. Subscribe to the Topic: /events/write_split | Type: rosbag2_interfaces/msg/WriteSplitEvent on the client side.

@MichaelOrlov
Contributor

Hi @reinzor. You were asking how to move forward with the implementation of writing snapshots to different files and the time-bounded snapshot feature, and you were also willing to help with it.

The first item, Make snapshot writing into a new file each time it is triggered, has been done and was recently merged to Rolling. It was a smooth backport to Jazzy, with some hiccups on Windows CI. However, the backport to Humble needs to be adjusted.
We moved providing a name for the saved snapshot file via the service request out of scope. It could be considered and implemented as a separate feature.

I've made good progress on the second item, the time-bounded snapshot feature, in #1844. I've made all the required changes for this feature. However, I didn't have time to test it and write unit tests.
Help there would be really appreciated. Feel free to cherry-pick commits from my #1844 PR or create a fork on top of it.
We need to review #1844, test it locally, write unit tests, and update the documentation.

@MichaelOrlov MichaelOrlov self-assigned this Nov 26, 2024