In this project we have primarily been working with the Backblaze dataset and, more recently, the ceph-telemetry dataset. Both consist mainly of SMART metrics collected from hard disks via the `smartctl` tool, although ceph-telemetry also contains a fair amount of metadata in addition to the SMART metrics.
However, some recent research suggests that incorporating disk performance and disk location data on top of SMART data can be valuable in analyzing disk health. Specifically, this paper claims to improve disk failure prediction models by using these additional metrics. If this holds for our use cases as well, then Ceph should also collect these metrics from its users as part of ceph-telemetry, so that we can build better models.
In this epic, we will explore this FAST dataset and evaluate the tradeoff between the gain in model performance and the overhead of collecting these metrics from users. This will help us determine which additional features Ceph should collect from users to get the most benefit, in terms of better disk health prediction models.