Version 2 - History - Wiki - Disk Monitoring - LukeMurphey.net

Luke Murphey, 08/27/2016 05:56 AM

-Luke Murphey
+h1. Wiki
 Luke Murphey
 Luke Murphey
-Luke Murphey
+h2. Getting the Source Code
 Luke Murphey
-Luke Murphey
+This project is open source. You can get the code on "GitHub":https://github.com/LukeMurphey/splunk-smartmon.
 Luke Murphey
-Luke Murphey
+h2. How failures are detected
 Luke Murphey
-Luke Murphey
+Failures are identified based on two things:
 Luke Murphey
-Luke Murphey
+# SMART health status
-Luke Murphey
+# Known bad attributes
 Luke Murphey
-Luke Murphey
+Monitoring SMART health status is straightforward: SMART flags a failing disk and reports a failed test.
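
As a rough illustration of the health status check, the sketch below reads the overall status out of text in the format printed by @smartctl -H@ (smartmontools). This page does not say how the app actually obtains the status, so the use of @smartctl@ output and the function name @parse_health@ are assumptions made for the example.

```python
import re

# Matches the status line smartctl prints for ATA disks
# ("SMART overall-health self-assessment test result: PASSED")
# and for SCSI disks ("SMART Health Status: OK").
_STATUS_RE = re.compile(
    r"^SMART (?:overall-health self-assessment test result|Health Status):\s*(\S+)",
    re.MULTILINE,
)


def parse_health(output):
    """Return True if the disk reports healthy, False if it reports a
    failure, and None if no status line is present (hypothetical helper)."""
    match = _STATUS_RE.search(output)
    if match is None:
        return None
    return match.group(1) in ("PASSED", "OK")
```

A caller would pass in the captured command output and treat @False@ as a failed disk; @None@ means the text contained no recognizable status line.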
 Luke Murphey
-Luke Murphey
+However, some other attributes are not necessarily flagged by SMART as a failure but are known to be associated with failing disks. Backblaze (see https://www.backblaze.com/blog/hard-drive-smart-stats/) has determined that the following can be used to flag drives as likely to fail soon:
 Luke Murphey
-Luke Murphey
+* SMART 5 – Reallocated_Sector_Count. (1-4 warning, > 4 replace)
-Luke Murphey
+* SMART 187 – Reported_Uncorrectable_Errors. (> 1 replace)
-Luke Murphey
+* SMART 188 – Command_Timeout. (1-13 warning, > 13 replace)
-Luke Murphey
+* SMART 197 – Current_Pending_Sector_Count. (> 1 replace)
-Luke Murphey
+* SMART 198 – Offline_Uncorrectable. (> 1 replace)
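
The thresholds above can be sketched as a small lookup table. This is a minimal illustration, not the app's actual code: the names @THRESHOLDS@, @classify_attribute@ and @classify_disk@ are hypothetical, and the input is assumed to be the raw attribute values keyed by SMART attribute ID. The list does not say how to treat a value of exactly 1 for the "> 1 replace" attributes, so the sketch reads it literally and leaves that value as "ok".

```python
# Attribute ID -> (name, lowest warning value or None, replace when above).
# Numbers are copied from the list above; the structure is illustrative.
THRESHOLDS = {
    5: ("Reallocated_Sector_Count", 1, 4),
    187: ("Reported_Uncorrectable_Errors", None, 1),
    188: ("Command_Timeout", 1, 13),
    197: ("Current_Pending_Sector_Count", None, 1),
    198: ("Offline_Uncorrectable", None, 1),
}

SEVERITY = ["ok", "warning", "replace"]


def classify_attribute(attr_id, raw_value):
    """Return 'ok', 'warning' or 'replace' for one raw SMART value."""
    if attr_id not in THRESHOLDS:
        return "ok"  # attributes outside the list are not judged here
    _name, warn_min, replace_above = THRESHOLDS[attr_id]
    if raw_value > replace_above:
        return "replace"
    if warn_min is not None and raw_value >= warn_min:
        return "warning"
    return "ok"


def classify_disk(raw_values):
    """Return the worst classification across a dict of {attr_id: raw_value}."""
    worst = "ok"
    for attr_id, raw_value in raw_values.items():
        result = classify_attribute(attr_id, raw_value)
        if SEVERITY.index(result) > SEVERITY.index(worst):
            worst = result
    return worst
```

For example, a disk with two reallocated sectors and no other nonzero attributes would come back as "warning", while any attribute past its replace threshold makes the whole disk "replace".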
 Luke Murphey
-Luke Murphey
+h2. Contents
 Luke Murphey
-Luke Murphey
+{{child_pages}}