(This entry refers to the checks performed by OmniCenter as part of its monitoring duties.)
At the core of OmniCenter’s operation is a series of monitoring checks that are performed against the devices on the networks it monitors.
- Service Checks
A service check queries a service on its assigned host device for a response, and the response code determines the status of the check. It is the most versatile check type, with the broadest range of options. Host availability is monitored using a ping service check. Failure of a service check triggers host checking to aid in root cause analysis.
- Threshold Checks
OmniCenter automatically collects and stores a wide range of statistical data from your devices for reporting purposes. A threshold check actively monitors an individual instance of a statistic for unacceptably high and/or low values by comparing the collected values to configured static threshold values (expressed as either percentages or absolute values).
- Anomaly Checks
An anomaly check monitors an individual instance of a statistic over time for unusual changes in its regular behavior. For example, suppose CPU usage for a particular device normally averages 80-90% during business hours but is now averaging 40-50% during those same hours. The new values are perfectly reasonable from a performance standpoint and thus would not be detected as a problem by a static threshold check. However, the behavior is anomalous and something you would want to be made aware of, as it could indicate connection problems or some other indirect issue.
- Configuration Check
The configuration check is a global check performed on all eligible managed devices by the config manager. It monitors host devices for changes to their device configuration. For these devices, OmniCenter automatically downloads and archives the device configuration every night. The most recently downloaded configuration is compared to the last archived version and checked for changes.
- Web Application Response Time (WebART) Checks
A WebART check monitors the availability and response-time performance of web-based applications.
- Email Application Check
An email application check monitors the availability and response-time performance of your organization’s email application.
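The difference between a threshold check and an anomaly check can be illustrated with the CPU usage example above. The following is a minimal sketch only: the function names, the sample baseline values, and the use of a simple standard-deviation test for the anomaly check are assumptions made for illustration, and are not a description of how OmniCenter actually evaluates either check type.

```python
from statistics import mean, stdev


def threshold_status(value, low=None, high=None):
    """Static threshold check: CRITICAL when value falls outside the configured limits."""
    if high is not None and value > high:
        return "CRITICAL"
    if low is not None and value < low:
        return "CRITICAL"
    return "OK"


def anomaly_status(value, baseline, max_sigma=3.0):
    """Hypothetical anomaly test: CRITICAL when value lies more than
    max_sigma standard deviations from the mean of past readings."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as unusual.
        return "OK" if value == mu else "CRITICAL"
    return "CRITICAL" if abs(value - mu) / sigma > max_sigma else "OK"


# Assumed sample data: CPU usage (%) during past business hours.
baseline = [82, 85, 88, 84, 87, 86, 83, 89]
current = 45  # today's average for the same hours

print(threshold_status(current, high=95))  # well under the high limit
print(anomaly_status(current, baseline))   # far from the usual 80-90% range
```

In this sketch the static threshold check stays in an OK state because 45% is below the configured high limit, while the anomaly test flags the same value because it is far from the device's usual range.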
Service, threshold, and anomaly checks are assigned to individual host devices, and their queries are run against those devices only. The “scheduled configuration check” is a single global check run by the OmniCenter config manager against all managed devices. The WebART and email check types, by contrast, do not require a device at all in order to work. Some checks are assigned to devices automatically by OmniCenter when the devices are first discovered. Additional checks may be added to devices by an administrator.
OmniCenter indicates the health of its monitored networks through the “status” of its various checks. Under perfect conditions, all checks will be in an “OK” state. When the result of a check query indicates a serious problem that should be addressed immediately, that check enters a “CRITICAL” state and generates an “alarm.” Newly generated alarms always attempt to open a new “incident,” which is OmniCenter’s way of recording events. (See the entries for alarm and incident for more information.) All checks, except host and config, have customizable settings to determine when they will generate an alarm. See the individual entries for each check type for specific information about their settings.
A check in a failed state will continue to execute its query according to its schedule even after it has generated an alarm and caused an incident to be opened. If a failed check then returns to an OK state, it will clear its alarm and close its incident. (See the incident entry for how cleared alarms affect incident status.)
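The status lifecycle described above can be sketched as a small state machine. This is an illustrative model only: the `Check` class, its fields, and the log messages are hypothetical, and the sketch simplifies the relationship between cleared alarms and incident status, which is covered in the incident entry.

```python
from dataclasses import dataclass, field


@dataclass
class Check:
    """Hypothetical model of one check and the events it produces."""
    name: str
    status: str = "OK"
    incident_open: bool = False
    log: list = field(default_factory=list)

    def record_result(self, result: str) -> None:
        """Apply the result ("OK" or "CRITICAL") of one scheduled query."""
        if result == "CRITICAL" and self.status == "OK":
            # A new failure generates an alarm, which opens an incident.
            self.log.append("alarm generated")
            self.incident_open = True
            self.log.append("incident opened")
        elif result == "OK" and self.status == "CRITICAL":
            # Recovery clears the alarm and closes the incident.
            self.log.append("alarm cleared")
            self.incident_open = False
            self.log.append("incident closed")
        # The check keeps running on schedule in either state.
        self.status = result


check = Check("ping")
for result in ["OK", "CRITICAL", "CRITICAL", "OK"]:
    check.record_result(result)
print(check.log)
```

Note that the second consecutive CRITICAL result produces no new events in this model: the check is still queried, but its alarm and incident already exist.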
For all OmniCenter checks except config, if the host device a check is assigned to is in a maintenance window when the check fails, any generated alarms will not open new incidents. This rule does not affect WebART and email checks, since they are not assigned to specific host devices.
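The maintenance window rule can be expressed as a simple decision function. The sketch below is an assumption-laden illustration: the function name, the check type strings, and the representation of maintenance windows as (start, end) pairs are all hypothetical and do not reflect OmniCenter's internal data model.

```python
from datetime import datetime

# Check types that the maintenance window rule does not apply to:
# config is excluded from the rule, and WebART and email checks
# are not assigned to a host device.
UNAFFECTED_TYPES = {"config", "webart", "email"}


def may_open_incident(check_type, alarm_time, host_windows):
    """Return True if an alarm raised at alarm_time may open a new incident.

    host_windows is a list of (start, end) datetime pairs describing the
    maintenance windows of the host device the check is assigned to.
    """
    if check_type in UNAFFECTED_TYPES:
        return True
    in_window = any(start <= alarm_time <= end for start, end in host_windows)
    return not in_window


# Assumed example: a two-hour maintenance window on the host.
windows = [(datetime(2024, 1, 1, 1, 0), datetime(2024, 1, 1, 3, 0))]

print(may_open_incident("service", datetime(2024, 1, 1, 2, 0), windows))
print(may_open_incident("service", datetime(2024, 1, 1, 4, 0), windows))
print(may_open_incident("webart", datetime(2024, 1, 1, 2, 0), []))
```

Here a service check alarm raised inside the window does not open an incident, the same alarm raised after the window does, and a WebART alarm is unaffected either way.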
All check types can have one or more action groups assigned to them in their alarm settings to determine the behavior of an incident created by one of their alarms (send notifications, execute commands against the device, etc.).