The Calculation Method of "latency_ms" in Node "multi_object_tracker" Seems Unreasonable #9428

Open
3 tasks done
cyn-liu opened this issue Nov 22, 2024 · 15 comments · May be fixed by #9533
Labels: component:perception (Advanced sensor data processing and environment understanding; auto-assigned)

Comments

@cyn-liu (Contributor) commented Nov 22, 2024

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I'm convinced that this is not my fault but a bug.

Description

pipeline_latency_ms represents the time it takes for the entire pipeline from point cloud publishing to the completion of execution at the current node.
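
For context, here is a minimal sketch (not the actual Autoware implementation; the names are illustrative) of how such a metric is typically computed: the wall-clock time at publication minus the header stamp of the originating point cloud.

#include <rclcpp/rclcpp.hpp>

// Illustrative helper: pipeline latency in milliseconds, measured from the
// header stamp of the original point cloud to the moment the current node
// publishes its result.
inline double pipeline_latency_ms(
  const rclcpp::Time & publish_time, const rclcpp::Time & pointcloud_stamp)
{
  return (publish_time - pointcloud_stamp).seconds() * 1e3;
}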

We found that node /perception/object_recognition/tracking/multi_object_tracker's debug/pipeline_latency_ms is large and fluctuates greatly.

pipeline_latency.mp4

We believe there is a problem with the calculation method of /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms: it does not actually reflect the pipeline latency from publishing the point cloud to completing the processing in the multi_object_tracker node.

Expected behavior

/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms is only slightly larger than /perception/object_recognition/detection/object_lanelet_filter/debug/pipeline_latency_ms.

Actual behavior

/perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms is much larger than /perception/object_recognition/detection/object_lanelet_filter/debug/pipeline_latency_ms.

Steps to reproduce

  1. Launch Autoware logging_simulator
  2. Play ROS2 bag
  3. Use rqt_plot to view /perception/object_recognition/tracking/multi_object_tracker/debug/pipeline_latency_ms

Versions

  • OS: Ubuntu 22.04
  • ROS 2: Humble
  • Autoware: latest version

Possible causes

  1. Perhaps the debug time should not be published in the onTimer callback.
  2. When calculating the latency, the timestamp of the oldest data in the input queue is used; since the amount of data in the queue changes frequently, the calculated latency fluctuates significantly (see the sketch below).
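
To illustrate cause 2, here is a minimal sketch (hypothetical, not the actual Autoware code) of how taking the oldest queued stamp couples the reported latency to the queue length:

#include <deque>
#include <rclcpp/rclcpp.hpp>

// Illustrative only: the measured input latency depends on which queued
// header stamp is chosen as the start time.
inline double queued_input_latency_ms(
  const std::deque<rclcpp::Time> & input_stamps, const rclcpp::Time & now,
  bool use_oldest)
{
  const rclcpp::Time & stamp = use_oldest ? input_stamps.front() : input_stamps.back();
  // With use_oldest == true the result grows with the number of queued
  // messages, so it fluctuates whenever the queue depth changes.
  return (now - stamp).seconds() * 1e3;
}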

Additional context

None

@technolojin (Contributor) commented Nov 25, 2024

@cyn-liu
What is your configuration of enable_delay_compensation?
If this value is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large.

By turning enable_delay_compensation to false, you can test whether the pipeline latency becomes stable.

@technolojin (Contributor) commented:

Related PRs:

#6710
#8657

@cyn-liu (Contributor, issue author) commented Nov 25, 2024

@cyn-liu What is your configuration of enable_delay_compensation? If this value is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large.

By turning enable_delay_compensation to false, you can test whether the pipeline latency becomes stable.

  1. when enable_delay_compensation=true
    new_track_pipeline_true

  2. when enable_delay_compensation=false
    new_track_pipeline_false

After testing, I find that enable_delay_compensation=true is more stable, while enable_delay_compensation=false has a smaller fluctuation range.

As stated in the PR you provided, the reason for the instability of pipeline_latency_ms when enable_delay_compensation=false is that the input frequency (merged objects from multiple detectors) is unstable. But as shown in the figures above, the range of variation of this value cannot be ignored.

@technolojin (Contributor) commented:

@cyn-liu
One function of enable_delay_compensation in the multi_object_tracker is to absorb the fluctuation of the detection latency.

When the detection has not arrived (after more than 100 ms plus a margin), the tracker publishes an estimated tracking result. Since the estimation is based on old measurements, the pipeline latency is enlarged by one additional cycle.
If this extrapolation is not done, map-based prediction and the planning module will not get any update.

Because of the extrapolation function and the fluctuating detection latency, the pipeline latency will fluctuate. In my understanding, this is by design for now.
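
A rough sketch of the timer-driven behaviour described above (hypothetical names and thresholds, not the actual implementation):

#include <rclcpp/rclcpp.hpp>

// Illustrative sketch: with delay compensation enabled, a periodic timer
// publishes an extrapolated (predicted) tracking result whenever the latest
// detection is older than the expected detection cycle plus a margin.
inline bool should_publish_extrapolated(
  const rclcpp::Time & now, const rclcpp::Time & latest_detection_stamp,
  double expected_cycle_ms = 100.0, double margin_ms = 20.0)
{
  const double elapsed_ms = (now - latest_detection_stamp).seconds() * 1e3;
  // Publishing from old measurements keeps map-based prediction and planning
  // updated, but adds roughly one extra cycle to the reported pipeline latency.
  return elapsed_ms > expected_cycle_ms + margin_ms;
}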

@technolojin (Contributor) commented:

@cyn-liu You could run an experiment that disables the extrapolation; the relevant line is:

should_publish = should_publish || elapsed_time > maximum_publish_interval;
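
In other words (my reading of the suggestion; treat it as an experiment, not a recommended patch), the timeout term can be commented out so the tracker only publishes when a new detection actually arrives:

// Experimental change (sketch): drop the timer-timeout trigger, so no
// extrapolated result is published when detections are late.
// should_publish = should_publish || elapsed_time > maximum_publish_interval;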

@cyn-liu (Contributor, issue author) commented Nov 27, 2024

You mentioned multiple times above that the multi_object_tracker node absorbs the fluctuation of the detection latency, but after my testing I found that the running time of the pipeline before the multi_object_tracker node is basically stable.

The following figure shows this latency: the time when multi_object_tracker receives the detected objects minus the time when the point cloud was published.

before_tracking_latency.mp4

@technolojin (Contributor) commented:

I found that the running time of the pipeline before the multi_object_tracker node is basically stable.

Then you have a good object detection pipeline. That is a good thing.
But when you turn enable_delay_compensation to false, the fluctuation is enlarged. In this case, the tracker processes the incoming data immediately and simply adds its processing time to the total pipeline latency.

/perception/object_recognition/tracking/track/latency/fluid_pressure

What is this topic? Is it your custom topic for system analysis?
To check the input latency (from the tracker's perspective), the debug/input_latency_ms topic is provided.

Perhaps the debug time should not be published in the onTimer callback.

The pipeline latency is determined when the tracked objects are published. If you set enable_delay_compensation to true, the objects are published by the onTimer callback, and the debug time is also published at that time.
But you got a more fluctuating result when enable_delay_compensation is false, so I do not think that is a solution.

When calculating the latency, the timestamp of the oldest data in the input queue is used; since the amount of data in the queue changes frequently, the calculated latency fluctuates significantly.

Why should the oldest data define the pipeline latency timing, even when the tracker has been updated with newer data?

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

What is this topic? Is it your custom topic for system analysis?
To check the input latency (from the tracker's perspective), the debug/input_latency_ms topic is provided.

The /perception/object_recognition/tracking/before/track/latency/fluid_pressure topic is my custom topic, which is different from debug/input_latency_ms.

My custom topic: it is computed at this position, as the current time when the data is received minus the timestamp of the data header.

debug/input_latency_ms: the time passed to startMeasurementTime minus the timestamp of the oldest data in objects_list.

When I have a good object detection pipeline, my custom topic has stable values, but the value of debug/input_latency_ms fluctuates greatly.

Why should the oldest data define the pipeline latency timing, even when the tracker has been updated with newer data?

This is also my question: why does the Autoware code calculate it this way?

@technolojin (Contributor) commented:

I asked myself why it is like this, and I could somehow recall the situation.

I think there are two perspectives on the latency measurement.
Assumption: more accurate detection takes a longer processing time.

  1. High-accuracy detection domain: if we focus on the high-accuracy, long-processing-time detector, we may want to know that latency.
  2. Minimum-latency domain: an update means an update; if a quick detector detects objects, the system latency is small.

The current implementation is on the case 1 side. If we think case 2 is right, it needs to be fixed.
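
A compact way to see the two options (hypothetical sketch, not the Autoware code): with several detector inputs contributing to one tracker update, case 1 reports the latency of the slowest (oldest-stamped) input, while case 2 reports the latency of the freshest one.

#include <algorithm>
#include <vector>
#include <rclcpp/rclcpp.hpp>

// Illustrative sketch of the two latency perspectives for a multi-input update.
inline double input_latency_ms(
  const std::vector<rclcpp::Time> & input_stamps, const rclcpp::Time & now,
  bool slowest_input)  // true: case 1 (oldest stamp), false: case 2 (newest stamp)
{
  const auto [oldest, newest] =
    std::minmax_element(input_stamps.begin(), input_stamps.end());
  return (now - (slowest_input ? *oldest : *newest)).seconds() * 1e3;
}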

@technolojin (Contributor) commented Nov 28, 2024

The overall naming of functions and variables is misleading now. This is because the trigger algorithm has changed over time.

onMessage: it used to be the callback for message arrival, but it is not anymore.

void TrackerDebugger::startMeasurementTime(
  const rclcpp::Time & now, const rclcpp::Time & measurement_header_stamp)
{
  last_input_stamp_ = measurement_header_stamp;

The incoming measurement_header_stamp is the "oldest/earliest" timestamp in the current implementation, but last_input_stamp_ suggests "latest".

@technolojin (Contributor) commented:

@cyn-liu Are you using multiple inputs to the multi_object_tracker?
If not, the explanation above is not meaningful.

Let me figure out how to solve this.

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

The overall naming of functions and variables is misleading now. This is because the trigger algorithm has changed over time.

onMessage: it used to be the callback for message arrival, but it is not anymore.

void TrackerDebugger::startMeasurementTime(
  const rclcpp::Time & now, const rclcpp::Time & measurement_header_stamp)
{
  last_input_stamp_ = measurement_header_stamp;

The incoming measurement_header_stamp is the "oldest/earliest" timestamp in the current implementation, but last_input_stamp_ suggests "latest".

The naming of functions and variables in the Autoware code is indeed somewhat misleading, but measurement_header_stamp does mean the oldest one. code link

  const rclcpp::Time oldest_time(objects_list.front().second.header.stamp);
  last_updated_time_ = current_time;

  // process start
  debugger_->startMeasurementTime(this->now(), oldest_time);

@cyn-liu (Contributor, issue author) commented Nov 28, 2024

@cyn-liu Are you using multiple inputs to the multi_object_tracker? If not, the explanation above is not meaningful.

My perception module uses LiDAR-only mode, and even in LiDAR-only mode the multi_object_tracker node has multiple inputs.
This is the node relationship diagram of my perception module.
perception_node_graph

@technolojin (Contributor) commented Dec 2, 2024

Here is the fix PR for this issue.
#9533

Screenshot from 2024-12-02 13-43-06

@cyn-liu
As you expected, the pipeline latency is the input latency + processing time (+ waiting time).
Can you test the PR?
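
For clarity, the decomposition described above can be written out as follows (a paraphrase of the comment, not the exact PR code):

// Illustrative decomposition of the metric after the fix: the input latency of
// the data actually used, plus the tracker's own processing time, plus any time
// the input spent waiting in the queue before processing started.
inline double pipeline_latency_ms(
  double input_latency_ms, double processing_time_ms, double waiting_time_ms = 0.0)
{
  return input_latency_ms + processing_time_ms + waiting_time_ms;
}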

@cyn-liu (Contributor, issue author) commented Dec 3, 2024

Here is the fix PR for this issue: #9533
@cyn-liu As you expected, the pipeline latency is the input latency + processing time (+ waiting time). Can you test the PR?

I have tested your PR and the results look consistent with your previous explanation.

When enable_delay_compensation is true, the tracker can extrapolate the tracking result (depending on your system situation), and the pipeline latency becomes large. One function of enable_delay_compensation in the multi_object_tracker is to absorb the fluctuation of the detection latency.

When enable_delay_compensation is false, the fluctuation is enlarged. In this case, the tracker processes the incoming data immediately and simply adds its processing time to the total pipeline latency.

  1. when enable_delay_compensation=true
have_delay_compensation.mp4
  2. when enable_delay_compensation=false
no_delay_compensation.mp4
