Scan Engine¶
Data Normalization¶
The NSIA scan engine must correctly identify the character set encoding and the content-type of each resource for the analysis to be complete. To detect the character set, NSIA uses an algorithm based on the one used by Mozilla Firefox. To detect the content-type, NSIA inspects the contents of the file.
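As an illustration of content-based detection, the following is a minimal sketch and not NSIA's actual code. It guesses a content-type from a few well-known leading byte signatures and reads the character set from a byte-order mark only, which is far simpler than the statistical detection Firefox performs. The class and method names (ContentSniffer, sniffContentType, sniffCharsetFromBom) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; not part of the NSIA code base.
final class ContentSniffer {

    /** Returns true if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Guesses a content-type from the leading bytes of the content. */
    static String sniffContentType(byte[] data) {
        if (startsWith(data, 0x89, 'P', 'N', 'G')) return "image/png";
        if (startsWith(data, 'G', 'I', 'F', '8')) return "image/gif";
        if (startsWith(data, '%', 'P', 'D', 'F')) return "application/pdf";

        // Fall back to a textual check on the first few hundred bytes.
        int len = Math.min(data.length, 256);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1)
                .trim().toLowerCase(Locale.ROOT);
        if (head.startsWith("<!doctype html") || head.startsWith("<html")) {
            return "text/html";
        }
        return "application/octet-stream";
    }

    /** Returns the charset implied by a byte-order mark, or null if none. */
    static String sniffCharsetFromBom(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null; // a real detector would fall back to statistical analysis
    }
}
```

A detector of this kind ignores the Content-Type header sent by the server, so a mislabeled resource is still classified by what it actually contains.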
HTTP Auto-Discovery Rules¶
The HTTP Auto-Discovery rules automatically browse website content to identify all of the resources associated with the website. The scan engine can spawn multiple threads and may not follow the same path on each scan. Therefore, if the scan settings are too restrictive to allow the entire website to be scanned, the scan engine may not examine the same files each time it scans.
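The effect of a restrictive limit can be seen in a small simulation. This sketch is not the NSIA crawler: it walks an in-memory link map breadth-first and stops after a maximum number of resources. The class name CrawlSketch, the crawl method, and the maxResources parameter are assumptions made for illustration, and the order of the link lists stands in for the nondeterministic order in which threads discover links.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation; not part of the NSIA code base.
final class CrawlSketch {

    /**
     * Visits resources breadth-first starting at start, following the
     * links recorded in the map, and stops once maxResources have been
     * visited.
     */
    static Set<String> crawl(String start,
                             Map<String, List<String>> links,
                             int maxResources) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);

        while (!queue.isEmpty() && visited.size() < maxResources) {
            String url = queue.poll();
            if (!visited.add(url)) {
                continue; // already scanned via another path
            }
            List<String> out = links.getOrDefault(url, Collections.<String>emptyList());
            for (String next : out) {
                if (!visited.contains(next)) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }
}
```

When the limit is at least as large as the site, every discovery order yields the same set of resources. When the limit is smaller, the resources that happen to be discovered first are the only ones scanned, so two scans of the same site can cover different files.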
ThreatScript Processing¶
ThreatScripts can be complex programs in their own right, so the scan engine handles them specially to ensure that a definition cannot cause problems in the engine itself. See source:trunk/src/net/lukemurphey/nsia/scan/ScriptDefinition.java to view the code for the ScriptDefinition class, which provides the functionality for ThreatScripts.
Classloading¶
The thread that executes ThreatScript definitions uses a custom classloader to control which libraries the definitions can access and to provide simplified names for the packages.
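The access-control half of this idea can be sketched as a classloader that only resolves classes from an allow-list of package prefixes. This is an assumption about the general technique, not a copy of NSIA's loader. The class name RestrictedClassLoader and the allowedPrefixes parameter are hypothetical, and the simplified package names are not covered here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the classloader used by NSIA.
final class RestrictedClassLoader extends ClassLoader {

    private final List<String> allowedPrefixes;

    RestrictedClassLoader(ClassLoader parent, List<String> allowedPrefixes) {
        super(parent);
        this.allowedPrefixes = new ArrayList<>(allowedPrefixes);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // Delegate to the parent only for classes in an allowed package.
        for (String prefix : allowedPrefixes) {
            if (name.startsWith(prefix)) {
                return super.loadClass(name, resolve);
            }
        }
        // Everything else is hidden from code that resolves classes here.
        throw new ClassNotFoundException(
                "Class not available to definitions: " + name);
    }
}
```

A loader built with the prefixes "java.lang." and "java.util." resolves java.util.ArrayList but throws ClassNotFoundException for java.io.File. The script runtime would have to resolve the classes requested by a definition through this loader for the restriction to take effect.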
Maximum Runtime¶
The scan engine ensures that ThreatScript definitions cannot cause a crash in the engine itself by entering an infinite loop. The engine monitors the execution of the script at runtime and terminates the script if it runs too long. Any detected error is reported to the event logging sub-system.
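One common way to enforce such a limit in Java is to run the script on a separate thread and wait for the result with a timeout. The sketch below assumes that approach; the source does not say how NSIA terminates a script. RuntimeLimiter and runWithLimit are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; not the monitoring code used by NSIA.
final class RuntimeLimiter {

    /**
     * Runs the script on its own thread and waits at most limitMillis
     * for a result. On timeout the worker thread is interrupted and a
     * TimeoutException is thrown to the caller.
     */
    static <T> T runWithLimit(Callable<T> script, long limitMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        // A daemon thread cannot keep the JVM alive if the script ignores interrupts.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "script-runner");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(script);
        try {
            return future.get(limitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // requests interruption of the worker
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Interruption is cooperative in Java, so a script stuck in a tight loop that never checks its interrupt status keeps running in the background even after the caller has moved on. In this sketch the caller would catch the TimeoutException or ExecutionException and pass it to whatever logging facility is in use.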