Add IO stats and ext4 FS stats through new ext4 collector #3047

Open
wants to merge 11 commits into
base: master

@mshahzeb mshahzeb commented Jun 10, 2024

Fixes: #3005

Adds:

# HELP node_filesystem_errors Number of filesystem errors encountered.
# TYPE node_filesystem_errors counter
node_filesystem_errors{device="/dev/vda2",device_error="",fstype="ext4",mountpoint="/boot"} 0

# HELP node_filesystem_warnings Number of filesystem warnings encountered.
# TYPE node_filesystem_warnings counter
node_filesystem_warnings{device="/dev/vda2",device_error="",fstype="ext4",mountpoint="/boot"} 0

# HELP node_filesystem_messages Number of filesystem log messages.
# TYPE node_filesystem_messages counter
node_filesystem_messages{device="/dev/vda2",device_error="",fstype="ext4",mountpoint="/boot"} 0

From

  • /sys/fs/ext4/<partition>/errors_count: number of ext4 errors (commit)
  • /sys/fs/ext4/<partition>/warning_count: number of ext4 warning log messages (commit)
  • /sys/fs/ext4/<partition>/msg_count: number of other ext4 log messages
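For readers who want to see what reading these three files involves, here is a minimal standalone sketch in Go. It is not the code from this PR or from procfs; the names `ext4Stats`, `parseCount`, and `readExt4Stats` are hypothetical, and it assumes each file holds a single decimal integer followed by a newline.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// ext4Stats holds the three counters listed above (hypothetical type).
type ext4Stats struct {
	Errors, Warnings, Messages uint64
}

// parseCount converts the contents of a counter file to a uint64.
// Assumption: the file contains one decimal integer and a trailing newline.
func parseCount(raw string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(raw), 10, 64)
}

// readExt4Stats reads the counters from one partition directory,
// for example /sys/fs/ext4/vda2.
func readExt4Stats(dir string) (ext4Stats, error) {
	var s ext4Stats
	targets := map[string]*uint64{
		"errors_count":  &s.Errors,
		"warning_count": &s.Warnings,
		"msg_count":     &s.Messages,
	}
	for name, dst := range targets {
		raw, err := os.ReadFile(filepath.Join(dir, name))
		if err != nil {
			return s, fmt.Errorf("read %s: %w", name, err)
		}
		v, err := parseCount(string(raw))
		if err != nil {
			return s, fmt.Errorf("parse %s: %w", name, err)
		}
		*dst = v
	}
	return s, nil
}

func main() {
	// On hosts without /sys/fs/ext4 the glob is empty and nothing is printed.
	dirs, _ := filepath.Glob("/sys/fs/ext4/*")
	for _, dir := range dirs {
		s, err := readExt4Stats(dir)
		if err != nil {
			continue // skip entries that do not carry the counter files
		}
		fmt.Printf("%s errors=%d warnings=%d messages=%d\n",
			filepath.Base(dir), s.Errors, s.Warnings, s.Messages)
	}
}
```

Keeping `parseCount` separate from the file access makes the parsing step testable without a real sysfs tree.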

and

# HELP node_disk_ioerr_total Number of IO commands that completed with an error.
# TYPE node_disk_ioerr_total counter
node_disk_ioerr_total{device="sda"} 3
node_disk_ioerr_total{device="sr0"} 29

# HELP node_disk_iodone_total Number of completed or rejected IO commands.
# TYPE node_disk_iodone_total counter
node_disk_iodone_total{device="sda"} 307
node_disk_iodone_total{device="sr0"} 4483

From

  • /sys/block/<disk>/device/ioerr_cnt: number of SCSI commands that completed with an error
  • /sys/block/<disk>/device/iodone_cnt: number of completed or rejected SCSI commands
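A similar sketch for the two SCSI counters follows. Again, this is illustrative and not the PR's implementation; `parseSCSICounter` and `readDiskIOCounters` are hypothetical names. One assumption worth flagging: these files may print their values in hexadecimal with a `0x` prefix, so the sketch parses with base 0, which accepts both prefixed hex and plain decimal.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseSCSICounter parses a counter value. Base 0 lets ParseUint infer the
// base from the prefix, so "0x1d" and "29" both yield 29.
// Caveat: with base 0 a leading "0" without "x" is read as octal.
func parseSCSICounter(raw string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(raw), 0, 64)
}

// readDiskIOCounters reads ioerr_cnt and iodone_cnt for one disk under
// sysBlock (normally "/sys/block").
func readDiskIOCounters(sysBlock, disk string) (ioerr, iodone uint64, err error) {
	read := func(name string) (uint64, error) {
		raw, err := os.ReadFile(filepath.Join(sysBlock, disk, "device", name))
		if err != nil {
			return 0, err
		}
		return parseSCSICounter(string(raw))
	}
	if ioerr, err = read("ioerr_cnt"); err != nil {
		return 0, 0, fmt.Errorf("%s ioerr_cnt: %w", disk, err)
	}
	if iodone, err = read("iodone_cnt"); err != nil {
		return 0, 0, fmt.Errorf("%s iodone_cnt: %w", disk, err)
	}
	return ioerr, iodone, nil
}

func main() {
	// Only devices that expose ioerr_cnt match the glob; others are ignored.
	matches, _ := filepath.Glob("/sys/block/*/device/ioerr_cnt")
	for _, m := range matches {
		// m looks like /sys/block/<disk>/device/ioerr_cnt
		disk := filepath.Base(filepath.Dir(filepath.Dir(m)))
		ioerr, iodone, err := readDiskIOCounters("/sys/block", disk)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		fmt.Printf("%s ioerr=%d iodone=%d\n", disk, ioerr, iodone)
	}
}
```

Devices that are not SCSI-backed would simply not match the glob, so no metric would be reported for them in this sketch.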

Implements a new ext4 collector.

Corresponding procfs changes: prometheus/procfs#651

@mshahzeb
Author

Sample generated metrics file
node_metrics.txt

@BurritoWrapped

This would be wonderful to have as a feature

@gouthamve
Member

Hi @mshahzeb, thanks for looking into this! This is a great start; we now know which files to read.

node_exporter doesn't read these files directly in this codebase. Instead, we abstract the parsing in procfs: https://github.com/prometheus/procfs

/sys/block/<disk>/device/ioerr_cnt and /sys/block/<disk>/device/iodone_cnt should be added here: https://github.com/prometheus/procfs/blob/master/blockdevice/stats.go

/sys/fs/ext4/<partition> should be added to a new ext4 folder, as we did for xfs and btrfs.

@mshahzeb
Author

Thank you. I will move the code to procfs and open a PR there.

@mshahzeb
Author

PR in the works on procfs: prometheus/procfs#651

@mshahzeb mshahzeb changed the title from "Add IO stats and FS stats" to "Add IO stats and ext4 FS stats through new ext4 collector" on Jul 11, 2024
@mshahzeb
Author

Procfs PR merged: prometheus/procfs#651

@mshahzeb
Author

mshahzeb commented Nov 6, 2024

Waiting for new procfs release

@discordianfish
Member

You can also use the unreleased version for now

Development

Successfully merging this pull request may close these issues.

Disk and filesystem error metrics
4 participants