Misleading pipeline information in 'node_stats' API #14805

Open
roaksoax opened this issue Dec 8, 2022 · 1 comment · May be fixed by #16839

Comments


roaksoax commented Dec 8, 2022

Logstash information:

Please include the following information:

  1. Logstash version: 8.5
  2. Logstash installation source: deb

OS version: ubuntu 20.04

Description of the problem including expected versus actual behavior:

When getting pipeline information from the node stats API, I was looking for the values of pipeline.workers and pipeline.batch_size for a pipeline called "performance". However, the API only shows the global values (the defaults, or whatever is set in logstash.yml), not the values configured for that pipeline.

While node_stats is arguably not the right place to show configuration, having it there would be helpful.

Example

curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'

Current Behavior
The API currently returns the output below. pipeline.workers and pipeline.batch_size are left at their defaults in logstash.yml but have been changed for the pipeline called performance, and the API does not show the changed values.

{
  "host" : "file-1",
 [...]
  "pipeline" : {
    "workers" : 1,
    "batch_size" : 125,
    "batch_delay" : 50
  },
 [...]
  "pipelines" : {
    "performance" : {
      "events" : {
        [...]
      },
      "flow" : {
        [...]
      }
 [...]

Potential Solution 1 - Add pipeline information in node_stats API
What I would expect if we are already showing "pipeline" information in "node_stats":

{
  "host" : "file-1",
 [...]
  "pipeline" : {
    "workers" : 1,
    "batch_size" : 125,
    "batch_delay" : 50
  },
 [...]
  "pipelines" : {
    "performance" : {
      "pipeline" : {
        "workers" : 2,
        "batch_size" : 500,
        "batch_delay" : 50
      },
      "events" : {
        [...]
      },
      "flow" : {
        [...]
      }
 [...]
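
As a rough illustration of how a client might consume the proposed shape, here is a minimal Python sketch. It assumes the per-pipeline "pipeline" block from Potential Solution 1, which the API does not return today; the helper name per_pipeline_settings and the trimmed sample payloads are hypothetical. The helper returns None when the block is absent, rather than falling back to the top-level values, so a client never reports the global settings as if they belonged to a pipeline.

```python
from typing import Any, Optional


def per_pipeline_settings(stats: dict[str, Any], name: str) -> Optional[dict[str, Any]]:
    """Return the per-pipeline "pipeline" block, or None when it is absent.

    The block is the one proposed in Potential Solution 1; it is an
    assumption here, not something the current API returns.
    """
    return stats.get("pipelines", {}).get(name, {}).get("pipeline")


# Hypothetical, trimmed payload in the PROPOSED shape.
proposed = {
    "pipeline": {"workers": 1, "batch_size": 125, "batch_delay": 50},
    "pipelines": {
        "performance": {
            "pipeline": {"workers": 2, "batch_size": 500, "batch_delay": 50},
        }
    },
}

# Hypothetical, trimmed payload in the CURRENT shape (no per-pipeline block).
current = {
    "pipeline": {"workers": 1, "batch_size": 125, "batch_delay": 50},
    "pipelines": {"performance": {"events": {}, "flow": {}}},
}

print(per_pipeline_settings(proposed, "performance"))
print(per_pipeline_settings(current, "performance"))
```

Returning None for the current shape makes the gap explicit: today a client cannot tell from this response what the performance pipeline is actually configured with.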

Potential Solution 2 - Remove 'pipelines' and 'monitoring' from node_stats API

The alternative is to remove the pipelines and monitoring information from the node_stats API altogether. This could break Stack Monitoring and would not be backwards compatible.

roaksoax changed the title from "Missing 'pipeline' config information per pipeline in node stats API" to "Misleading pipeline information in 'node_stats' API" on Dec 8, 2022

blightbow commented Oct 13, 2024

Ran into this when looking to do a calculation on flow.worker_utilization*pipeline.workers per pipeline on a monitoring dashboard. Considering that the documentation for the Node Stats API frequently mentions pipeline.workers as a multiplier for derived statistics, I think Potential Solution 1 is the way to go.

The only other way for monitoring systems to get this information requires that the monitoring developer hardcode the worker counts as constants, and update this information each time pipelines.yml is updated by an operations team. There are obvious downsides to hardcoding the number of workers in monitoring code. Ops can add new pipelines and change the number of workers, causing the monitoring to easily fall out of sync.
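
For illustration, the per-pipeline calculation described above might look like the following Python sketch if Potential Solution 1 were implemented. Everything about the payload is an assumption: the per-pipeline "pipeline" block is only proposed, the "current" key under flow.worker_utilization is assumed, and the numbers are made up. The function name utilization_times_workers is hypothetical.

```python
from typing import Any


def utilization_times_workers(stats: dict[str, Any], window: str = "current") -> dict[str, float]:
    """Compute flow.worker_utilization * pipeline.workers for each pipeline.

    Pipelines that lack either value are skipped instead of being
    computed from the node-wide settings.
    """
    result: dict[str, float] = {}
    for name, body in stats.get("pipelines", {}).items():
        settings = body.get("pipeline")  # proposed block, assumed present
        utilization = body.get("flow", {}).get("worker_utilization", {}).get(window)
        if settings is None or utilization is None:
            continue
        result[name] = utilization * settings["workers"]
    return result


# Hypothetical, trimmed payload; values are invented for the example.
sample = {
    "pipelines": {
        "performance": {
            "pipeline": {"workers": 2, "batch_size": 500, "batch_delay": 50},
            "flow": {"worker_utilization": {"current": 50.0}},
        },
        "no_settings": {
            "flow": {"worker_utilization": {"current": 10.0}},
        },
    }
}

print(utilization_times_workers(sample))
```

With the worker count read from the same response as the flow metrics, the dashboard no longer needs constants that have to track pipelines.yml.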

kaisecheng self-assigned this Dec 27, 2024
kaisecheng linked a pull request Dec 27, 2024 that will close this issue