The Calculation Method of "latency_ms" in Node "multi_object_tracker" Seems Unreasonable #9428

Open
3 tasks done
cyn-liu opened this issue Nov 22, 2024 · 15 comments · May be fixed by #9533
Labels: component:perception (Advanced sensor data processing and environment understanding; auto-assigned)

Comments

@cyn-liu (Contributor) commented Nov 22, 2024

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I'm convinced that this is not my fault but a bug.

Description

pipeline_latency_ms represents the time it takes for the entire pipeline from point cloud publishing to the completion of execution at the current node.
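
For context, here is a minimal sketch (not the actual Autoware implementation; the names are illustrative) of how such a metric is typically computed: the wall-clock time at publication minus the header stamp of the originating point cloud.

#include <rclcpp/rclcpp.hpp>

// Illustrative helper: pipeline latency in milliseconds, measured from the
// header stamp of the original point cloud to the moment the current node
// publishes its result.
inline double pipeline_latency_ms(
  const rclcpp::Time & publish_time, const rclcpp::Time & pointcloud_stamp)
{
  return (publish_time - pointcloud_stamp).seconds() * 1e3;
}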

We found that node /perception/object_recognition/tracking/multi_object_tracker's debug/pipeline_latency_ms is large and fluctuates greatly.

pipeline_latency.mp4

We believe there is a problem with the calculation method of /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms: it does not actually reflect the pipeline latency from publishing the point cloud to completing the processing in the multi_object_tracker node.

Expected behavior

/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms is only slightly larger than /perception/object_recognition/detection/object_lanelet_filter/debug/pipeline_latency_ms.

Actual behavior

/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms is much larger than /perception/object_recognition/detection/object_lanelet_filter/debug/pipeline_latency_ms.

Steps to reproduce

  1. Launch Autoware logging_simulator
  2. Play ROS2 bag
  3. Use rqt_plot to view /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms

Versions

  • OS: Ubuntu 22.04
  • ROS 2: Humble
  • Autoware: latest version

Possible causes

  1. Perhaps the debug time should not be published in the onTimer callback.
  2. When calculating the latency, the timestamp of the oldest data in the input queue is used; since the amount of data in the queue changes frequently, the calculated latency fluctuates significantly (see the sketch below).
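
To illustrate cause 2, here is a minimal sketch (hypothetical, not the actual Autoware code) of how taking the oldest queued stamp couples the reported latency to the queue length:

#include <deque>
#include <rclcpp/rclcpp.hpp>

// Illustrative only: the measured input latency depends on which queued
// header stamp is chosen as the start time.
inline double queued_input_latency_ms(
  const std::deque<rclcpp::Time> & input_stamps, const rclcpp::Time & now,
  bool use_oldest)
{
  const rclcpp::Time & stamp = use_oldest ? input_stamps.front() : input_stamps.back();
  // With use_oldest == true the result grows with the number of queued
  // messages, so it fluctuates whenever the queue depth changes.
  return (now - stamp).seconds() * 1e3;
}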

Additional context

None

@technolojin (Contributor) commented Nov 25, 2024

@cyn-liu
What is your configuration of enable_delay_compensation?
If this value is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large.

By turning enable_delay_compensation to false, you can test whether the pipeline latency becomes stable.

@technolojin (Contributor) commented:

Related PRs:

#6710
#8657

@cyn-liu (Contributor, issue author) commented Nov 25, 2024

@cyn-liu What is your configuration of enable_delay_compensation? If this value is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large.

By turning enable_delay_compensation to false, you can test whether the pipeline latency becomes stable.

  1. when enable_delay_compensation=true
    new_track_pipeline_true

  2. when enable_delay_compensation=false
    new_track_pipeline_false

After testing, I find that enable_delay_compensation=true is more stable, while enable_delay_compensation=false has a smaller fluctuation range.

As stated in the PR you provided, the reason for the instability of pipeline_latency_ms when enable_delay_compensation=false is that the input frequency (merged objects from multiple detectors) is unstable. But as shown in the figures above, the range of variation of this value cannot be ignored.

@technolojin (Contributor) commented:

@cyn-liu
One function of enable_delay_compensation in the multi_object_tracker is to absorb the fluctuation of the detection latency.

When the detection has not arrived (after more than 100 ms plus a margin), the tracker publishes an estimated tracking result. Since the estimation is based on old measurements, the pipeline latency is enlarged by one additional cycle.
If this extrapolation is not done, map-based prediction and the planning module will not get any update.

Because of the extrapolation function and the fluctuating detection latency, the pipeline latency will fluctuate. In my understanding, this is by design for now.
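
A rough sketch of the timer-driven behaviour described above (hypothetical names and thresholds, not the actual implementation):

#include <rclcpp/rclcpp.hpp>

// Illustrative sketch: with delay compensation enabled, a periodic timer
// publishes an extrapolated (predicted) tracking result whenever the latest
// detection is older than the expected detection cycle plus a margin.
inline bool should_publish_extrapolated(
  const rclcpp::Time & now, const rclcpp::Time & latest_detection_stamp,
  double expected_cycle_ms = 100.0, double margin_ms = 20.0)
{
  const double elapsed_ms = (now - latest_detection_stamp).seconds() * 1e3;
  // Publishing from old measurements keeps map-based prediction and planning
  // updated, but adds roughly one extra cycle to the reported pipeline latency.
  return elapsed_ms > expected_cycle_ms + margin_ms;
}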

@technolojin (Contributor) commented:

@cyn-liu You could run an experiment that disables the extrapolation; the relevant line is:

should_publish = should_publish || elapsed_time > maximum_publish_interval;
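
In other words (my reading of the suggestion; treat it as an experiment, not a recommended patch), the timeout term can be commented out so the tracker only publishes when a new detection actually arrives:

// Experimental change (sketch): drop the timer-timeout trigger, so no
// extrapolated result is published when detections are late.
// should_publish = should_publish || elapsed_time > maximum_publish_interval;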

@cyn-liu (Contributor, issue author) commented Nov 27, 2024

You mentioned multiple times above that the multi_object_tracker node absorbs the fluctuation of the detection latency, but after my testing I found that the running time of the pipeline before the multi_object_tracker node is basically stable.

The following figure shows this latency: the time when multi_object_tracker receives the detected objects minus the time when the point cloud was published.

before_tracking_latency.mp4

@technolojin (Contributor) commented:

I found that the running time of the pipeline before the multi_object_tracker node is basically stable.

Then you have a good object detection pipeline. That is a good thing.
But when you turn enable_delay_compensation to false, the fluctuation is enlarged. In this case, the tracker processes the incoming data immediately and simply adds its processing time to the total pipeline latency.

/perception/object_recognition/tracking/track/latency/fluid_pressure

What is this topic? Is it your custom topic for system analysis?
To check the input latency (from the tracker's perspective), the debug/input_latency_ms topic is provided.

Perhaps the debug time should not be published in the onTimer callback.

The pipeline latency is determined when the tracked objects are published. If you set enable_delay_compensation to true, the objects are published by the onTimer callback, and the debug time is also published at that time.
But you got a more fluctuating result when enable_delay_compensation is false, so I do not think that is a solution.

When calculating the latency, the timestamp of the oldest data in the input queue is used; since the amount of data in the queue changes frequently, the calculated latency fluctuates significantly.

Why should the oldest data define the pipeline latency timing, even when the tracker has been updated with newer data?

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

What is this topic? Is it your custom topic for system analysis?
To check the input latency (from the tracker's perspective), the debug/input_latency_ms topic is provided.

The /perception/object_recognition/tracking/before/track/latency/fluid_pressure topic is my custom topic, which is different from debug/input_latency_ms.

My custom topic: it is computed at this position, as the current time when the data is received minus the timestamp of the data header.

debug/input_latency_ms: the time passed to startMeasurementTime minus the timestamp of the oldest data in objects_list.

When I have a good object detection pipeline, my custom topic has stable values, but the value of debug/input_latency_ms fluctuates greatly.

Why should the oldest data define the pipeline latency timing, even when the tracker has been updated with newer data?

This is also my question: why does the Autoware code calculate it this way?

@technolojin (Contributor) commented:

I asked myself why it is like this, and I could somehow recall the situation.

I think there are two perspectives on the latency measurement.
Assumption: more accurate detection takes a longer processing time.

  1. High-accuracy detection domain: if we focus on the high-accuracy, long-processing-time detector, we may want to know that latency.
  2. Minimum-latency domain: an update means an update; if a quick detector detects objects, the system latency is small.

The current implementation is on the case 1 side. If we think case 2 is right, it needs to be fixed.
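
A compact way to see the two options (hypothetical sketch, not the Autoware code): with several detector inputs contributing to one tracker update, case 1 reports the latency of the slowest (oldest-stamped) input, while case 2 reports the latency of the freshest one.

#include <algorithm>
#include <vector>
#include <rclcpp/rclcpp.hpp>

// Illustrative sketch of the two latency perspectives for a multi-input update.
inline double input_latency_ms(
  const std::vector<rclcpp::Time> & input_stamps, const rclcpp::Time & now,
  bool slowest_input)  // true: case 1 (oldest stamp), false: case 2 (newest stamp)
{
  const auto [oldest, newest] =
    std::minmax_element(input_stamps.begin(), input_stamps.end());
  return (now - (slowest_input ? *oldest : *newest)).seconds() * 1e3;
}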

@technolojin (Contributor) commented Nov 28, 2024

The overall naming of functions and variables is misleading now. This is because the trigger algorithm has changed over time.

onMessage: it used to be the callback for message arrival, but it is not anymore.

void TrackerDebugger::startMeasurementTime(
  const rclcpp::Time & now, const rclcpp::Time & measurement_header_stamp)
{
  last_input_stamp_ = measurement_header_stamp;

The incoming measurement_header_stamp is the "oldest/earliest" timestamp in the current implementation, but last_input_stamp_ suggests "latest".

@technolojin (Contributor) commented:

@cyn-liu Are you using multiple inputs to the multi_object_tracker?
If not, the explanation above is not meaningful.

Let me figure out how to solve this.

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

The overall naming of functions and variables is misleading now. This is because the trigger algorithm has changed over time.

onMessage: it used to be the callback for message arrival, but it is not anymore.

void TrackerDebugger::startMeasurementTime(
  const rclcpp::Time & now, const rclcpp::Time & measurement_header_stamp)
{
  last_input_stamp_ = measurement_header_stamp;

The incoming measurement_header_stamp is the "oldest/earliest" timestamp in the current implementation, but last_input_stamp_ suggests "latest".

The naming of functions and variables in the Autoware code is indeed somewhat misleading, but measurement_header_stamp does mean the oldest one. code link

  const rclcpp::Time oldest_time(objects_list.front().second.header.stamp);
  last_updated_time_ = current_time;

  // process start
  debugger_->startMeasurementTime(this->now(), oldest_time);

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

@cyn-liu Are you using multiple inputs to the multi_object_tracker? If not, the explanation above is not meaningful.

My perception module uses LiDAR-only mode, and even in LiDAR-only mode the multi_object_tracker node has multiple inputs.
This is the node relationship diagram of my perception module.
perception_node_graph

@technolojin (Contributor) commented Dec 2, 2024

Here is the fix PR for this issue.
#9533

Screenshot from 2024-12-02 13-43-06

@cyn-liu
As you expected, the pipeline latency is the input latency + processing time (+ waiting time).
Can you test the PR?
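
For clarity, the decomposition described above can be written out as follows (a paraphrase of the comment, not the exact PR code):

// Illustrative decomposition of the metric after the fix: the input latency of
// the data actually used, plus the tracker's own processing time, plus any time
// the input spent waiting in the queue before processing started.
inline double pipeline_latency_ms(
  double input_latency_ms, double processing_time_ms, double waiting_time_ms = 0.0)
{
  return input_latency_ms + processing_time_ms + waiting_time_ms;
}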

@cyn-liu (Contributor, issue author) commented Dec 3, 2024

Here is the fix PR for this issue: #9533
@cyn-liu As you expected, the pipeline latency is the input latency + processing time (+ waiting time). Can you test the PR?

I have tested your PR and the results look consistent with your previous explanation.

When enable_delay_compensation is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large. One function of enable_delay_compensation in the multi_object_tracker is to absorb the fluctuation of the detection latency.

When enable_delay_compensation is false, the fluctuation is enlarged. In this case, the tracker processes the incoming data immediately and simply adds its processing time to the total pipeline latency.

  1. when enable_delay_compensation=true
have_delay_compensation.mp4
  2. when enable_delay_compensation=false
no_delay_compensation.mp4
