Don't report span.timestamp or duration when you didn't create the span #91

codefromthecrypt · 2016-11-28T08:43:57Z

Zipkin's RPC span is interesting because it is a shared model. For example, a client creates a span and then the next hop (a server) adds annotations to it. The authoritative owner of the span is the one that's supposed to set timestamp and duration. In the case where it is originated by a client, the client, not the server, should set timestamp and duration.

See openzipkin/openzipkin.github.io#49

Why bother?

If we special-case timestamp and duration reporting, a couple things happen:

zipkin duration queries will be accurate because the critical path duration will be the only duration queried for the span
Those externally controlling ids will be able to use the duration query
Those converting zipkin RPC spans into single-host spans, like Google StackDriver, will have a heuristic to tell if a span is dual host or not.

Do we need to do anything?

I'm not sure if the current implementation is doing this fine or not, but it is probably useful to add tests to make sure it it. Usually, there's very little to do to special-case this, and it always is where B3 headers are read.

Here are tests that should smoke it out

client creates a span and propagates it to a server
- client propagates b3 headers for trace id, span id, and sampled=1
- server neither reports span.timestamp nor duration
- client reports span.timestamp and duration
server creates a new trace
- no b3 headers are detected
- server reports span.timestamp and duration
server creates a new trace with externally supplied ids
- caller propagates b3 headers for trace id, span id, but no sampled header
- server reports span.timestamp and duration

Does this make sense? Can anyone help translate this into tests for us?

See openzipkin/brave#277 (comment)

codefromthecrypt · 2016-11-28T10:37:05Z

ps mostly unrelated, but in some other implementations, when we start a span, we log a tick (or otherwise start a timer). When this is set (ex it wouldn't be set if it were a continued server span), we extract a measurement in microseconds as duration. Using this is more reliable than system time, as it never risks moving backwards during a span (ex from NTP adjustments). Not sure best mechanism in ruby, but I've found http://ruby-doc.org/stdlib-2.3.3/libdoc/benchmark/rdoc/Benchmark.html and https://github.com/copiousfreetime/hitimes

jcarres-mdsol · 2016-11-28T19:57:35Z

I think the third one would not pass and require several changes.
The first two it are easy, you got the ids , someone else is owning this, else you own this.
The third is weird, "the lack of this optional sampled header in fact means you own this", weird. Current code will not like this

codefromthecrypt · 2016-11-29T01:04:57Z

The third is weird, "tha lack of this optional sampled header in fact means you own this", weird. Current code will not like this

well, if you think about it, it makes sense. You'd never propagate sampled=1 unless you started the span. lack of a sampled header means "defer sampling decision downstream" which means it hasn't been sampled or reported yet. anyway, we can leave out part 3 for now, just it will eventually come up. when it does, this is how it would be handled.

codefromthecrypt · 2016-11-29T01:05:58Z

in other words the sampled header is indeed required for any span that the caller reports to zipkin. I can fix the b3 docs as it seems unclear.

codefromthecrypt · 2016-11-29T01:06:42Z

actually it is already there.. https://github.com/openzipkin/b3-propagation#details

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't report span.timestamp or duration when you didn't create the span #91

Don't report span.timestamp or duration when you didn't create the span #91

codefromthecrypt commented Nov 28, 2016

codefromthecrypt commented Nov 28, 2016

jcarres-mdsol commented Nov 28, 2016 •

edited

Loading

codefromthecrypt commented Nov 29, 2016 via email

codefromthecrypt commented Nov 29, 2016 via email

codefromthecrypt commented Nov 29, 2016 via email

Don't report span.timestamp or duration when you didn't create the span #91

Don't report span.timestamp or duration when you didn't create the span #91

Comments

codefromthecrypt commented Nov 28, 2016

Why bother?

Do we need to do anything?

codefromthecrypt commented Nov 28, 2016

jcarres-mdsol commented Nov 28, 2016 • edited Loading

codefromthecrypt commented Nov 29, 2016 via email

codefromthecrypt commented Nov 29, 2016 via email

codefromthecrypt commented Nov 29, 2016 via email

jcarres-mdsol commented Nov 28, 2016 •

edited

Loading