Feature #1421

Search to alert users of bad disks

Added by Luke Murphey over 8 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 08/17/2016
Due date:
% Done: 100%


Description

Need to look for:

  • FAILING_NOW or In_the_past in the events
  • SMART_overall_health_self_assessment_test_result that has a value other than PASSED

See example reports here: https://www.smartmontools.org/wiki/Help#Howtoreadsmartctlreports
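As a rough illustration of the two checks listed above, here is a minimal Python sketch that scans the raw text of a smartctl report. It assumes the report contains the usual "SMART overall-health self-assessment test result:" line and that the WHEN_FAILED column holds FAILING_NOW or In_the_past. The function name find_problems is hypothetical and not part of the Splunk app.

```python
import re

# Matches the overall-health line of a smartctl report, e.g.
# "SMART overall-health self-assessment test result: PASSED"
HEALTH_RE = re.compile(
    r"SMART overall-health self-assessment test result:\s*(\S+)"
)

# WHEN_FAILED values that mark an attribute as failing now or in the past.
FAILURE_MARKERS = ("FAILING_NOW", "In_the_past")


def find_problems(report):
    """Return a list of human-readable problems found in a report (hypothetical helper)."""
    problems = []

    # Check 1: the overall health result is anything other than PASSED.
    match = HEALTH_RE.search(report)
    if match and match.group(1) != "PASSED":
        problems.append("overall health is " + match.group(1))

    # Check 2: any attribute row carries a failure marker.
    for marker in FAILURE_MARKERS:
        if marker in report:
            problems.append("attribute marked " + marker)

    return problems
```

A report with a PASSED result and no failure markers yields an empty list; anything else yields one entry per problem found.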

History

#1 Updated by Luke Murphey over 8 years ago

sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline(count(Reallocated_Sector_Ct)) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number

#2 Updated by Luke Murphey over 8 years ago

sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline(max(Reallocated_Sector_Ct)) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number

#3 Updated by Luke Murphey over 8 years ago

sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline(max(Reallocated_Sector_Ct)) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number, host

#5 Updated by Luke Murphey about 8 years ago

  • SMART 5 – Reallocated_Sector_Count. (1-4 warning, > 4 replace)
    • Reallocated_Sector_Ct > 4
  • SMART 187 – Reported_Uncorrectable_Errors. (> 1 replace)
    • Reported_Uncorrect > 1
  • SMART 188 – Command_Timeout. (1-13 warning, > 13 replace)
    • Command_Timeout > 13
  • SMART 197 – Current_Pending_Sector_Count. (> 1 replace)
    • Current_Pending_Sector > 1
  • SMART 198 – Offline_Uncorrectable. (> 1 replace)
    • Offline_Uncorrectable > 1
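
The thresholds above can be sketched as a small lookup table. This is a Python illustration only, not the search used in the app. It uses the smartctl attribute names (187 is Reported_Uncorrect, 198 is Offline_Uncorrectable). The names THRESHOLDS and classify are hypothetical, as is the choice to treat a missing attribute as healthy.

```python
# Per attribute: (lowest value that triggers a warning, value above which
# the disk should be replaced). None means the list defines no warning tier.
THRESHOLDS = {
    "Reallocated_Sector_Ct": (1, 4),      # SMART 5: 1-4 warning, > 4 replace
    "Reported_Uncorrect": (None, 1),      # SMART 187: > 1 replace
    "Command_Timeout": (1, 13),           # SMART 188: 1-13 warning, > 13 replace
    "Current_Pending_Sector": (None, 1),  # SMART 197: > 1 replace
    "Offline_Uncorrectable": (None, 1),   # SMART 198: > 1 replace
}


def classify(attrs):
    """Return "replace", "warning", or "ok" for a dict of raw attribute values."""
    worst = "ok"
    for name, (warn_from, replace_above) in THRESHOLDS.items():
        value = attrs.get(name)
        if value is None:
            # Assumption: an attribute the drive does not report is ignored.
            continue
        if value > replace_above:
            return "replace"
        if warn_from is not None and value >= warn_from:
            worst = "warning"
    return worst
```

For example, classify({"Reallocated_Sector_Ct": 3}) falls in the warning range, while classify({"Command_Timeout": 14}) crosses the replace threshold.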

#6 Updated by Luke Murphey about 8 years ago

sourcetype=smartmon Serial_Number=* | stats max(Reallocated_Sector_Ct) as Reallocated_Sector_Ct max(Offline_Uncorrectable) as Offline_Uncorrectable max(Command_Timeout) as Command_Timeout max(Current_Pending_Sector) as Current_Pending_Sector max(Reported_Uncorrect) as Reported_Uncorrect by Device_Model, Serial_Number | eval error_message=case(Reallocated_Sector_Ct > 4, "Reallocated sector count is above 4", Offline_Uncorrectable > 1, "Offline uncorrectable error count is above 1", Command_Timeout > 13, "Command timeouts are above 13", Current_Pending_Sector > 1, "Current pending sector count is above 1", Reported_Uncorrect > 1, "Reported uncorrectable errors are above 1") | search error_message=*
sourcetype=smartmon Serial_Number=* (SMART_overall_health_self_assessment_test_result!="PASSED" OR ("FAILING_NOW"))

#7 Updated by Luke Murphey about 8 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100
