Generate process metrics with semconv yaml (#330)
Co-authored-by: Joao Grassi <[email protected]>
braydonk and joaopgrassi authored Jan 23, 2024
1 parent a41f0e8 commit d21135e
Showing 5 changed files with 368 additions and 20 deletions.
3 changes: 3 additions & 0 deletions .yamllint
@@ -1,5 +1,8 @@
extends: default

ignore-from-file:
  - .gitignore

rules:
  document-start: disable
  octal-values: enable
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -13,6 +13,20 @@ release.
  ([#484](https://github.com/open-telemetry/semantic-conventions/pull/484))
- Depluralize labels for pod (`k8s.pod.labels.*`) and container (`container.labels.*`) resources
  ([#625](https://github.com/open-telemetry/semantic-conventions/pull/625))
- BREAKING: Generate process metrics from YAML
  ([#330](https://github.com/open-telemetry/semantic-conventions/pull/330))
  - Rename `process.threads` to `process.thread.count`
  - Rename `process.open_file_descriptors` to `process.open_file_descriptor.count`
  - Rename attributes for `process.cpu.*`
    - `state` to `process.cpu.state`
  - Change attributes for `process.disk.io`
    - Instead of `direction` use `disk.io.direction` from global registry
  - Change attributes for `process.network.io`
    - Instead of `direction` use `network.io.direction` from global registry
  - Rename attributes for `process.context_switches`
    - `type` to `process.context_switch_type`
  - Rename attributes for `process.paging.faults`
    - `type` to `process.paging.fault_type`
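
Consumers migrating dashboards or instrumentation may find it convenient to capture the renames listed above as lookup tables. The following is a minimal sketch in Python; `METRIC_RENAMES`, `ATTRIBUTE_RENAMES`, and `migrate` are hypothetical names used for illustration and are not part of any OpenTelemetry library. It assumes `process.cpu.*` covers the two CPU metrics defined in this change.

```python
# Metric renames introduced by this change (old name -> new name).
METRIC_RENAMES = {
    "process.threads": "process.thread.count",
    "process.open_file_descriptors": "process.open_file_descriptor.count",
}

# Attribute renames, keyed by metric name and then by old attribute key.
ATTRIBUTE_RENAMES = {
    "process.cpu.time": {"state": "process.cpu.state"},
    "process.cpu.utilization": {"state": "process.cpu.state"},
    "process.disk.io": {"direction": "disk.io.direction"},
    "process.network.io": {"direction": "network.io.direction"},
    "process.context_switches": {"type": "process.context_switch_type"},
    "process.paging.faults": {"type": "process.paging.fault_type"},
}


def migrate(metric_name, attributes):
    """Return (new_metric_name, new_attributes) for one data point."""
    attr_map = ATTRIBUTE_RENAMES.get(metric_name, {})
    # Rename only the keys listed for this metric; keep all others as-is.
    new_attributes = {attr_map.get(key, key): value for key, value in attributes.items()}
    return METRIC_RENAMES.get(metric_name, metric_name), new_attributes
```

Metrics and attributes that are not listed pass through unchanged, so the helper can be applied to every data point without special-casing.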

### Features

220 changes: 200 additions & 20 deletions docs/system/process-metrics.md
@@ -19,9 +19,17 @@ metrics](/docs/runtime/README.md#metrics).

<!-- toc -->

- [Metric Instruments](#metric-instruments)
  * [Process](#process)
- [Attributes](#attributes)
- [Process Metrics](#process-metrics)
  * [Metric: `process.cpu.time`](#metric-processcputime)
  * [Metric: `process.cpu.utilization`](#metric-processcpuutilization)
  * [Metric: `process.memory.usage`](#metric-processmemoryusage)
  * [Metric: `process.memory.virtual`](#metric-processmemoryvirtual)
  * [Metric: `process.disk.io`](#metric-processdiskio)
  * [Metric: `process.network.io`](#metric-processnetworkio)
  * [Metric: `process.thread.count`](#metric-processthreadcount)
  * [Metric: `process.open_file_descriptor.count`](#metric-processopen_file_descriptorcount)
  * [Metric: `process.context_switches`](#metric-processcontext_switches)
  * [Metric: `process.paging.faults`](#metric-processpagingfaults)

<!-- tocstop -->

@@ -35,27 +43,199 @@ metrics](/docs/runtime/README.md#metrics).
> * SHOULD introduce a control mechanism to allow users to opt-in to the new
> conventions once the migration plan is finalized.
## Metric Instruments
## Process Metrics

### Process
### Metric: `process.cpu.time`

Below is a table of Process metric instruments.
This metric is [recommended][MetricRecommended].

| Name | Instrument Type ([\*](/docs/general/metrics.md#instrument-types)) | Unit | Description | Labels |
|---------------------------------|----------------------------------------------------|-----------|-------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `process.cpu.time` | Counter | s | Total CPU seconds broken down by different states. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.cpu.utilization` | Gauge | 1 | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. | `state`, if specified, SHOULD be one of: `system`, `user`, `wait`. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. |
| `process.memory.usage` | UpDownCounter | By | The amount of physical memory in use. | |
| `process.memory.virtual` | UpDownCounter | By | The amount of committed virtual memory. | |
| `process.disk.io` | Counter | By | Disk bytes transferred. | `direction` SHOULD be one of: `read`, `write` |
| `process.network.io` | Counter | By | Network bytes transferred. | `direction` SHOULD be one of: `receive`, `transmit` |
| `process.threads` | UpDownCounter | {thread} | Process threads count. | |
| `process.open_file_descriptors` | UpDownCounter | {count} | Number of file descriptors in use by the process. | |
| `process.context_switches` | Counter | {count} | Number of times the process has been context switched. | `type` SHOULD be one of: `involuntary`, `voluntary` |
| `process.paging.faults` | Counter | {fault} | Number of page faults the process has made. | `type`, if specified, SHOULD be one of: `major` (for major, or hard, page faults), `minor` (for minor, or soft, page faults). |
<!-- semconv metric.process.cpu.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.cpu.time` | Counter | `s` | Total CPU seconds broken down by different states. |
<!-- endsemconv -->

## Attributes
<!-- semconv metric.process.cpu.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.cpu.state` | string | The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `system` | Recommended |

Process metrics SHOULD be associated with a [`process`](/docs/resource/process.md#process) resource whose attributes provide additional context about the process.
`process.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `system` | system |
| `user` | user |
| `wait` | wait |
<!-- endsemconv -->
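
The either/or rule in the attribute description (a process is described by data points that all carry the state attribute, or by data points that all omit it) can be checked mechanically. This is a minimal sketch, assuming data points are represented as plain attribute dictionaries; `follows_state_rule` is a hypothetical helper, not an OpenTelemetry API.

```python
def follows_state_rule(data_points):
    """Check the either/or rule for `process.cpu.state` on one process.

    `data_points` is a list of attribute dicts, one per data point.
    The rule holds when every point carries the attribute or none does.
    """
    with_state = sum(1 for attrs in data_points if "process.cpu.state" in attrs)
    return with_state == 0 or with_state == len(data_points)
```

Mixing both kinds of data points for the same process would make a sum over all points count the same CPU time twice, which is presumably what the rule guards against.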

### Metric: `process.cpu.utilization`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.cpu.utilization(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.cpu.utilization` | Gauge | `1` | Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process. |
<!-- endsemconv -->

<!-- semconv metric.process.cpu.utilization(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.cpu.state` | string | The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `system` | Recommended |

`process.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `system` | system |
| `user` | user |
| `wait` | wait |
<!-- endsemconv -->
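
The description of `process.cpu.utilization` amounts to a small formula. The sketch below reads "divided by the elapsed time and number of CPUs" as division by the product of the two, which is an interpretation rather than something the convention spells out. The function and parameter names are hypothetical.

```python
def cpu_utilization(prev_cpu_seconds, curr_cpu_seconds, elapsed_seconds, cpu_count):
    """Compute utilization from two `process.cpu.time` readings.

    prev_cpu_seconds / curr_cpu_seconds: cumulative CPU time at the previous
    and current measurement. elapsed_seconds: wall-clock time between them.
    cpu_count: number of CPUs available to the process.
    """
    if elapsed_seconds <= 0 or cpu_count <= 0:
        raise ValueError("elapsed_seconds and cpu_count must be positive")
    return (curr_cpu_seconds - prev_cpu_seconds) / (elapsed_seconds * cpu_count)
```

For example, a process that consumed 2 CPU seconds over a 4 second interval on 2 CPUs has a utilization of 2 / (4 × 2) = 0.25.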

### Metric: `process.memory.usage`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.memory.usage(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.memory.usage` | UpDownCounter | `By` | The amount of physical memory in use. |
<!-- endsemconv -->

<!-- semconv metric.process.memory.usage(full) -->
<!-- endsemconv -->

### Metric: `process.memory.virtual`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.memory.virtual(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.memory.virtual` | UpDownCounter | `By` | The amount of committed virtual memory. |
<!-- endsemconv -->

<!-- semconv metric.process.memory.virtual(full) -->
<!-- endsemconv -->

### Metric: `process.disk.io`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.disk.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.disk.io` | Counter | `By` | Disk bytes transferred. |
<!-- endsemconv -->

<!-- semconv metric.process.disk.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`disk.io.direction`](../attributes-registry/disk.md) | string | The disk IO operation direction. | `read` | Recommended |

`disk.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `read` | read |
| `write` | write |
<!-- endsemconv -->
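
Unlike `process.cpu.state`, `disk.io.direction` is a closed set: the value MUST be one of those listed, so custom values are not allowed. A minimal sketch of enforcing that on the producer side follows; `DISK_IO_DIRECTIONS` and `validate_disk_io_direction` are hypothetical names.

```python
# The only values permitted for `disk.io.direction`.
DISK_IO_DIRECTIONS = frozenset({"read", "write"})


def validate_disk_io_direction(value):
    """Return `value` unchanged, or raise if it is not an allowed direction."""
    if value not in DISK_IO_DIRECTIONS:
        raise ValueError(f"invalid disk.io.direction: {value!r}")
    return value
```

The same pattern would apply to `network.io.direction` with the values `transmit` and `receive`.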

### Metric: `process.network.io`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.network.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.network.io` | Counter | `By` | Network bytes transferred. |
<!-- endsemconv -->

<!-- semconv metric.process.network.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`network.io.direction`](../attributes-registry/network.md) | string | The network IO operation direction. | `transmit` | Recommended |

`network.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `transmit` | transmit |
| `receive` | receive |
<!-- endsemconv -->

### Metric: `process.thread.count`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.thread.count(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.thread.count` | UpDownCounter | `{thread}` | Process threads count. |
<!-- endsemconv -->

<!-- semconv metric.process.thread.count(full) -->
<!-- endsemconv -->

### Metric: `process.open_file_descriptor.count`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.open_file_descriptor.count(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.open_file_descriptor.count` | UpDownCounter | `{count}` | Number of file descriptors in use by the process. |
<!-- endsemconv -->

<!-- semconv metric.process.open_file_descriptor.count(full) -->
<!-- endsemconv -->

### Metric: `process.context_switches`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.context_switches(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.context_switches` | Counter | `{count}` | Number of times the process has been context switched. |
<!-- endsemconv -->

<!-- semconv metric.process.context_switches(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.context_switch_type` | string | Specifies whether the context switches for this data point were voluntary or involuntary. | `voluntary` | Recommended |

`process.context_switch_type` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `voluntary` | voluntary |
| `involuntary` | involuntary |
<!-- endsemconv -->
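
Because the well-known values MUST be used whenever they apply, a collector has to map platform-specific names onto them. As an illustration, Linux reports per-process context switches in `/proc/<pid>/status` under the fields `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches`. The sketch below parses that text into data points; the mapping and the function name are assumptions made for this example and are not taken from the convention.

```python
# Linux /proc/<pid>/status field -> well-known `process.context_switch_type` value.
_FIELD_TO_TYPE = {
    "voluntary_ctxt_switches": "voluntary",
    "nonvoluntary_ctxt_switches": "involuntary",
}


def context_switch_points(status_text):
    """Parse /proc/<pid>/status text into (attributes, value) pairs."""
    points = []
    for line in status_text.splitlines():
        field, _, raw = line.partition(":")
        switch_type = _FIELD_TO_TYPE.get(field.strip())
        if switch_type is not None:
            attributes = {"process.context_switch_type": switch_type}
            points.append((attributes, int(raw.strip())))
    return points
```

Each returned pair would correspond to one `process.context_switches` data point.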

### Metric: `process.paging.faults`

This metric is [recommended][MetricRecommended].

<!-- semconv metric.process.paging.faults(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `process.paging.faults` | Counter | `{fault}` | Number of page faults the process has made. |
<!-- endsemconv -->

<!-- semconv metric.process.paging.faults(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `process.paging.fault_type` | string | The type of page fault for this data point. Type `major` is for major/hard page faults, and `minor` is for minor/soft page faults. | `major` | Recommended |

`process.paging.fault_type` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `major` | major |
| `minor` | minor |
<!-- endsemconv -->

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.26.0/specification/document-status.md
[MetricRecommended]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.26.0/specification/metrics/metric-requirement-level.md#recommended
120 changes: 120 additions & 0 deletions model/metrics/process-metrics.yaml
@@ -0,0 +1,120 @@
groups:
  - id: attributes.process.cpu
    prefix: process.cpu
    type: attribute_group
    brief: "Attributes for process CPU metrics."
    attributes:
      - id: state
        brief: "The CPU state for this data point. A process SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels."
        type:
          allow_custom_values: true
          members:
            - id: system
              value: 'system'
            - id: user
              value: 'user'
            - id: wait
              value: 'wait'

  - id: metric.process.cpu.time
    type: metric
    metric_name: process.cpu.time
    brief: "Total CPU seconds broken down by different states."
    instrument: counter
    unit: "s"
    attributes:
      - ref: process.cpu.state

  - id: metric.process.cpu.utilization
    type: metric
    metric_name: process.cpu.utilization
    brief: "Difference in process.cpu.time since the last measurement, divided by the elapsed time and number of CPUs available to the process."
    instrument: gauge
    unit: "1"
    attributes:
      - ref: process.cpu.state

  - id: metric.process.memory.usage
    type: metric
    metric_name: process.memory.usage
    brief: "The amount of physical memory in use."
    instrument: updowncounter
    unit: "By"
    attributes: []

  - id: metric.process.memory.virtual
    type: metric
    metric_name: process.memory.virtual
    brief: "The amount of committed virtual memory."
    instrument: updowncounter
    unit: "By"
    attributes: []

  - id: metric.process.disk.io
    type: metric
    metric_name: process.disk.io
    prefix: process.disk
    brief: "Disk bytes transferred."
    instrument: counter
    unit: "By"
    attributes:
      - ref: disk.io.direction

  - id: metric.process.network.io
    type: metric
    metric_name: process.network.io
    brief: "Network bytes transferred."
    instrument: counter
    unit: "By"
    attributes:
      - ref: network.io.direction

  - id: metric.process.thread.count
    type: metric
    metric_name: process.thread.count
    brief: "Process threads count."
    instrument: updowncounter
    unit: "{thread}"
    attributes: []

  - id: metric.process.open_file_descriptor.count
    type: metric
    metric_name: process.open_file_descriptor.count
    brief: "Number of file descriptors in use by the process."
    instrument: updowncounter
    unit: "{count}"
    attributes: []

  - id: metric.process.context_switches
    type: metric
    metric_name: process.context_switches
    brief: "Number of times the process has been context switched."
    instrument: counter
    unit: "{count}"
    attributes:
      - id: process.context_switch_type
        brief: "Specifies whether the context switches for this data point were voluntary or involuntary."
        type:
          allow_custom_values: true
          members:
            - id: voluntary
              value: 'voluntary'
            - id: involuntary
              value: 'involuntary'

  - id: metric.process.paging.faults
    type: metric
    metric_name: process.paging.faults
    brief: "Number of page faults the process has made."
    instrument: counter
    unit: "{fault}"
    attributes:
      - id: process.paging.fault_type
        brief: "The type of page fault for this data point. Type `major` is for major/hard page faults, and `minor` is for minor/soft page faults."
        type:
          allow_custom_values: true
          members:
            - id: major
              value: 'major'
            - id: minor
              value: 'minor'
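
The Markdown tables in `docs/system/process-metrics.md` are generated from model entries like the ones above. To show how the fields line up, here is a minimal sketch that renders one metric group (already loaded into a dictionary) as a `metric_table` block. This is not the actual semconv generator; `render_metric_table` and `_INSTRUMENT_NAMES` are hypothetical names, and the output format is copied from the tables in this commit.

```python
# Display names for the `instrument` values used in the model file.
_INSTRUMENT_NAMES = {
    "counter": "Counter",
    "gauge": "Gauge",
    "updowncounter": "UpDownCounter",
}


def render_metric_table(metric):
    """Render one metric group as a Markdown table with a single row."""
    header = [
        "| Name | Instrument Type | Unit (UCUM) | Description |",
        "| -------- | --------------- | ----------- | -------------- |",
    ]
    row = "| `{name}` | {instrument} | `{unit}` | {brief} |".format(
        name=metric["metric_name"],
        instrument=_INSTRUMENT_NAMES[metric["instrument"]],
        unit=metric["unit"],
        brief=metric["brief"],
    )
    return "\n".join(header + [row])
```

Passing the `metric.process.thread.count` entry as a dictionary yields the same row that appears under `### Metric: process.thread.count` in the generated documentation.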