Feature #1421
Search to alert users of bad disks
Start date:
08/17/2016
Due date:
% Done:
100%
Description
Need to look for:
- FAILING_NOW or In_the_past in the events
- SMART_overall_health_self_assessment_test_result that has a value other than PASSED
See example reports here: https://www.smartmontools.org/wiki/Help#Howtoreadsmartctlreports
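As a rough sketch of the two checks listed above, the following Python predicate decides whether a single parsed event should raise an alert. The event shape (a dict of extracted fields plus a `_raw` key holding the event text) and the name `needs_alert` are assumptions for illustration, not part of the app.

```python
# Strings that smartctl prints in the WHEN_FAILED column of the attribute table.
FAILURE_MARKERS = ("FAILING_NOW", "In_the_past")


def needs_alert(event):
    """Return True if the event indicates a failed or failing disk.

    `event` is a hypothetical dict of extracted fields; `_raw` is assumed
    to hold the original event text.
    """
    health = event.get("SMART_overall_health_self_assessment_test_result")
    # Only flag the health result when the field is actually present.
    if health is not None and health != "PASSED":
        return True
    raw = event.get("_raw", "")
    return any(marker in raw for marker in FAILURE_MARKERS)
```

An event whose health result is PASSED but whose text contains In_the_past would still be flagged, which matches the first bullet.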
History
#1 Updated by Luke Murphey over 8 years ago
sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline count(Reallocated_Sector_Ct) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number
#2 Updated by Luke Murphey over 8 years ago
sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline(max(Reallocated_Sector_Ct)) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number
#3 Updated by Luke Murphey over 8 years ago
sourcetype = smartmon Serial_Number=* | stats values(SMART_overall_health_self_assessment_test_result) as SMART_overall_health_self_assessment_test_result sparkline(max(Reallocated_Sector_Ct)) as Reallocated_Sector_Ct latest(Offline_Uncorrectable) as Offline_Uncorrectable latest(Current_Pending_Sector) as Current_Pending_Sector by Device_Model, Serial_Number, host
#4 Updated by Luke Murphey about 8 years ago
#5 Updated by Luke Murphey about 8 years ago
- SMART 5 – Reallocated_Sector_Count. (1-4 warning, > 4 replace)
  - Reallocated_Sector_Ct > 4
- SMART 187 – Reported_Uncorrectable_Errors. (> 1 replace)
  - Reported_Uncorrect > 1
- SMART 188 – Command_Timeout. (1-13 warning, > 13 replace)
  - Command_Timeout > 13
- SMART 197 – Current_Pending_Sector_Count. (> 1 replace)
  - Current_Pending_Sector > 1
- SMART 198 – Offline_Uncorrectable. (> 1 replace)
  - Offline_Uncorrectable > 1
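The thresholds above, including the warning bands that the search conditions do not cover, can be sketched in Python as follows. The field names are the ones used in the searches in this ticket. The `THRESHOLDS` table, the `classify` helper and the dict-of-raw-values input are assumptions made for illustration.

```python
# (warning_above, replace_above) per field, copied from the list above.
# A warning bound of None means the list gives no warning band for that field.
THRESHOLDS = {
    "Reallocated_Sector_Ct": (0, 4),
    "Reported_Uncorrect": (None, 1),
    "Command_Timeout": (0, 13),
    "Current_Pending_Sector": (None, 1),
    "Offline_Uncorrectable": (None, 1),
}


def classify(attrs):
    """Return {field: "warning" | "replace"} for each field over a threshold."""
    findings = {}
    for field, (warn_above, replace_above) in THRESHOLDS.items():
        value = attrs.get(field)
        if value is None:
            continue  # this drive does not report the attribute
        value = int(value)
        if value > replace_above:
            findings[field] = "replace"
        elif warn_above is not None and value > warn_above:
            findings[field] = "warning"
    return findings
```

For example, `classify({"Reallocated_Sector_Ct": 3})` yields a warning for that field, while a value of 5 yields "replace".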
#6 Updated by Luke Murphey about 8 years ago
sourcetype=smartmon Serial_Number=* | stats max(Reallocated_Sector_Ct) as Reallocated_Sector_Ct max(Offline_Uncorrectable) as Offline_Uncorrectable max(Command_Timeout) as Command_Timeout max(Current_Pending_Sector) as Current_Pending_Sector max(Reported_Uncorrect) as Reported_Uncorrect by Device_Model, Serial_Number | eval error_message=case(Reallocated_Sector_Ct > 4, "Reallocated sector count is above 4", Offline_Uncorrectable > 1, "Offline uncorrectable error count is above 1", Command_Timeout > 13, "Command timeouts are above 13", Current_Pending_Sector > 1, "Current pending sector count is above 1", Reported_Uncorrect > 1, "Reported uncorrectable errors are above 1") | search error_message=*
sourcetype=smartmon Serial_Number=* (SMART_overall_health_self_assessment_test_result!="PASSED" OR "FAILING_NOW" OR "In_the_past")
#7 Updated by Luke Murphey about 8 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100