Realtime monitoring

The Monitoring menu lets you follow the monitoring of your information system in real time.

Object status

Statuses are indicators for hosts and services. Each status has a precise meaning for the object. Each status corresponds to an exit code returned by the monitoring plugin according to thresholds defined by the user.

Host status

The table below summarizes all the possible statuses for a host.

Status        Exit code   Description
UP            0           The host is available and reachable
DOWN          1           The host is unavailable
UNREACHABLE   2           The host is unreachable

Service status

The table below summarizes all the possible statuses for a service.

Status     Exit code   Description
OK         0           The service presents no problem
WARNING    1           The service has reached the warning threshold
CRITICAL   2           The service has reached the critical threshold
UNKNOWN    3           The status of the service cannot be checked (e.g.: SNMP agent down, etc.)
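As an illustration of how a plugin could turn a measured value into one of the exit codes above, here is a minimal sketch. The function name check_value, the sample value and the thresholds are hypothetical, and the sketch assumes a metric for which higher values are worse.

```python
# Exit codes from the service status table.
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def check_value(value, warning, critical):
    """Return (status, exit_code) for a metric where higher values are worse."""
    if value is None:
        # The measurement could not be obtained (e.g. the agent is down).
        status = "UNKNOWN"
    elif value >= critical:
        status = "CRITICAL"
    elif value >= warning:
        status = "WARNING"
    else:
        status = "OK"
    return status, EXIT_CODES[status]


# Hypothetical measurement: 87% usage with thresholds at 80% and 90%.
status, code = check_value(87.0, warning=80.0, critical=90.0)
print(f"{status} - usage is 87.0% (exit code {code})")
# A real plugin would end with sys.exit(code) so that the engine receives the code.
```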

Advanced statuses

In addition to the standard statuses, the following statuses provide additional information:

  • The PENDING status is displayed for a service or a host that has just been configured but has not yet been checked by the scheduler.

  • The UNREACHABLE status indicates that the host is located downstream (parent relationship) of a host with a DOWN status.

  • The FLAPPING status indicates that the status change percentage of the object is very high. This percentage is obtained from calculations performed by the network monitoring engine.

  • The ACKNOWLEDGED status indicates that the incident on the service or host has been taken into account by a user.

  • The DOWNTIME status indicates that the incident on the service or host occurred during a downtime period.
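To give an idea of what a status change percentage means for the FLAPPING status, the sketch below counts how often the status differs between consecutive checks. This is a simplified, unweighted formula chosen for illustration, not the calculation actually performed by the monitoring engine, and the 50% threshold is a hypothetical value.

```python
def percent_state_change(history):
    """Percentage of consecutive check results whose status differs."""
    if len(history) < 2:
        return 0.0
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return 100.0 * changes / (len(history) - 1)


def is_flapping(history, high_threshold=50.0):
    """Assumption: an object is flapping at or above a hypothetical threshold."""
    return percent_state_change(history) >= high_threshold


stable = ["OK", "OK", "OK", "OK", "OK"]
unstable = ["OK", "CRITICAL", "OK", "CRITICAL", "OK"]
print(percent_state_change(stable), is_flapping(stable))
print(percent_state_change(unstable), is_flapping(unstable))
```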

SOFT and HARD states

A host or a service can have two states:

  • SOFT: Signifies that an incident has just been detected and that it has to be confirmed.

  • HARD: Signifies that the status of the incident is confirmed. Once the status is confirmed, the notification process is engaged (sending of a mail, SMS, etc.).

Status confirmation

An incident (Not-OK status) is confirmed once the configured number of check attempts has been reached. The configuration of an object (host or service) requires a normal check interval, a maximum number of attempts to confirm a Not-OK status, and a retry check interval used between those attempts. As soon as the first incident is detected, the state is “SOFT” until it is confirmed as “HARD”, which triggers the notification process.

Example:

A service has the following check settings:

  • Max check attempts: 3

  • Normal check interval: 5 minutes

  • Retry check interval: 1 minute

Let us imagine the following scenario:

  • Instant t + 0: The service is checked, it has the OK status.

  • Instant t + 5: The second check shows that the service has the CRITICAL status. The service goes into the SOFT state (attempt 1/3).

  • Instant t + 6: The third check is performed, the service still has the CRITICAL status in SOFT (attempt 2/3).

  • Instant t + 7: The fourth check shows that the service still has the CRITICAL status (attempt 3/3). The maximum number of attempts has been reached; the state is confirmed (HARD). The notification process is triggered.

  • Instant t + 8: The service recovers OK status. It goes directly into the HARD state. The notification process is triggered.

  • Instant t + 13: The service has the WARNING status. It goes into the SOFT state (attempt 1/3).

  • Instant t + 14: The service still has the WARNING status (attempt 2/3).

  • Instant t + 15: The service has the CRITICAL status. It remains in the SOFT state because it has changed status.
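The confirmation logic of this scenario can be sketched as a small state machine. This is a simplified model written for illustration, not the engine's implementation: the class and method names are hypothetical, and it assumes that a recovery only triggers a notification when the problem had already been confirmed. It replays the checks from t + 0 to t + 14; the change from WARNING to CRITICAL at t + 15 is deliberately left out because this sketch does not model a change between two Not-OK statuses.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    max_attempts: int = 3
    status: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1

    def process(self, new_status):
        """Apply one check result; return True if a notification is triggered."""
        was_ok = self.status == "OK"
        was_hard = self.state_type == "HARD"
        if new_status == "OK":
            # A recovery goes directly into the HARD state.
            # Assumption: it notifies only if the problem had been confirmed.
            notify = (not was_ok) and was_hard
            self.status, self.state_type, self.attempt = "OK", "HARD", 1
            return notify
        if was_ok:
            # First Not-OK result: the incident still has to be confirmed.
            self.attempt, self.state_type = 1, "SOFT"
        elif not was_hard:
            self.attempt += 1
        else:
            # Problem already confirmed: nothing more to confirm in this sketch.
            self.status = new_status
            return False
        self.status = new_status
        if self.attempt >= self.max_attempts:
            self.state_type = "HARD"
            return True
        return False


service = ServiceState(max_attempts=3)
checks = [(0, "OK"), (5, "CRITICAL"), (6, "CRITICAL"), (7, "CRITICAL"),
          (8, "OK"), (13, "WARNING"), (14, "WARNING")]
history = []
for minute, result in checks:
    notified = service.process(result)
    history.append((minute, service.status, service.state_type,
                    service.attempt, notified))
    print(f"t + {minute}: {service.status} {service.state_type} "
          f"(attempt {service.attempt}/{service.max_attempts}) notify={notified}")
```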

Generic actions

When viewing the statuses of hosts or services, the monitoring data is refreshed automatically (every 15 seconds by default). Several icons can be used to control the refreshing of the data. The table below summarizes the functions of these icons:

Icon

Description

refresh

Refreshes the results manually.

pause

Pauses the automatic data refresh.

resume

Resumes the automatic data refresh.

Hosts

Viewing

To view the status of hosts, go to the Monitoring > Status Details > Hosts menu.

../_images/04unhandledproblems.png

You can use filters to adapt the view.

  • To view the hosts with a problem that has not been acknowledged, choose Unhandled Problems in the Host Status filter.

  • To view all the hosts with a problem, choose Host Problems in the Host Status filter.

  • To view all the hosts, choose All in the Host Status filter.

  • To view the hosts classified by host groups, click on the Hostgroups Summary menu.

../_images/04hostgroup.png
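The three Host Status filter values described in the list above can be pictured as simple predicates on a list of hosts. The sketch below is only an illustration: the filter_hosts function, the dictionary keys and the sample hosts are hypothetical, and it assumes that "a problem" means a DOWN or UNREACHABLE status.

```python
# Assumption: a host "with a problem" is DOWN or UNREACHABLE.
PROBLEM_STATUSES = {"DOWN", "UNREACHABLE"}


def filter_hosts(hosts, host_status="All"):
    """Return the hosts matching one of the Host Status filter values."""
    if host_status == "All":
        return list(hosts)
    problems = [h for h in hosts if h["status"] in PROBLEM_STATUSES]
    if host_status == "Host Problems":
        return problems
    if host_status == "Unhandled Problems":
        # Problems that no user has acknowledged yet.
        return [h for h in problems if not h["acknowledged"]]
    raise ValueError(f"unknown filter: {host_status}")


# Hypothetical sample data.
hosts = [
    {"name": "srv-web-01", "status": "UP", "acknowledged": False},
    {"name": "srv-db-01", "status": "DOWN", "acknowledged": True},
    {"name": "router-01", "status": "UNREACHABLE", "acknowledged": False},
]
for value in ("Unhandled Problems", "Host Problems", "All"):
    print(value, [h["name"] for h in filter_hosts(hosts, value)])
```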

Host tables

The table below gives a description of all the columns of the table displayed when viewing hosts:

Column name

Description

S

Displays the severity level of the host.

Hosts

Displays the name of the host. The icon nonotifications indicates that the notifications for this host are disabled. The icon graphperformances can be used to view all the performance graphs for this host.

Status

Serves to view the status of the host.

IP Address

Indicates the IP address of the host.

Last Check

Displays the date and the time of the last check.

Duration

Displays how long the host has kept its current status.

Hard State Duration

Displays how long the host has kept its current hard state (does not appear when viewing all the hosts).

Tries

Displays the number of tries before confirming the state.

Status information

Displays the message explaining the status of the host.

Note

The severity column and the associated filter appear if at least one object displayed has a severity level.

Available filters

You can filter the result presented via the following filters:

  • Host: can be used to filter by name of host via an SQL LIKE type search.

  • Status: can be used to filter by the status of hosts.

  • Severity: can be used to filter by severity.

  • Poller: can be used to filter the hosts by poller. Only the hosts of the poller selected will be displayed.

  • Hostgroup: can be used to filter by host group. Only the hosts of the host group selected will be displayed.

Note

The search on text fields only starts after entry of at least 3 characters.
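The behaviour of the Host text filter can be approximated as follows. This sketch uses an in-memory SQLite database purely for illustration; the hosts table, the search_hosts function and the sample names are hypothetical, and wrapping the text in % wildcards and returning every host when fewer than 3 characters are entered are both assumptions.

```python
import sqlite3

MIN_CHARS = 3  # the search only starts after at least 3 characters


def search_hosts(conn, text):
    """Return host names matching the text with an SQL LIKE search."""
    if len(text) < MIN_CHARS:
        # Assumption: too short an entry behaves as if no filter were applied.
        rows = conn.execute("SELECT name FROM hosts ORDER BY name")
    else:
        # Assumption: the text may appear anywhere in the host name.
        rows = conn.execute(
            "SELECT name FROM hosts WHERE name LIKE ? ORDER BY name",
            (f"%{text}%",),
        )
    return [name for (name,) in rows]


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (name TEXT)")
conn.executemany(
    "INSERT INTO hosts VALUES (?)",
    [("srv-web-01",), ("srv-db-01",), ("router-01",)],
)
print(search_hosts(conn, "sr"))
print(search_hosts(conn, "srv"))
```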

Host groups table

The table below gives a description of all the columns of the table displayed when viewing host groups:

Column name

Description

Host Group

List of available groups of hosts.

Hosts Status

Displays the number of hosts in UP, DOWN, UNREACHABLE or PENDING status.

Services Status

Displays the number of services in OK, WARNING, CRITICAL or PENDING status.

Available filters

You can filter the results by selecting a poller in the select box. Only the hosts monitored by the selected poller will be displayed.

Host details

When you click on a host, the following screen appears:

../_images/04hostdetail.png

Services

Here is the listing of all services linked to the host.

Status details

The table below summarizes the attributes of this part:

Attributes

Description

Host Status

Displays the status of the host.

Status information

Displays the information of the status of the host.

Performance Data

Displays performance data associated to the check.

Current Attempt

Displays the number of attempts before status validation.

State Type

Displays the type of state (‘SOFT’ or ‘HARD’).

Last Check

Displays the last check of the host.

Next Check

Displays the next scheduled check of the host.

Latency

Displays the latency in seconds between the scheduled check and the real check execution.

Execution Time

Displays the execution time of the check.

Last State Change

Displays when the status of the host changed.

Current State Duration

Displays the date and time from which the host is in the present state.

Last Notification

Displays the date and time at which the last notification was sent.

Next Notification

Displays the date and time at which the next notification will be sent.

Current Notification Number

Displays the number of sent notifications.

Is This Host Flapping?

Indicates if the host is in flapping state.

Percent State Change

Displays the percentage change of state.

In Scheduled Downtime?

Indicates if the host is in scheduled downtime.

Last Update

Displays the date and time of the last information update.

Options and controls available

Options and controls allow you to perform a number of actions on the host. Options are described in the chapter Exploitation guide.

Notifications

The Notifications container displays the contacts and contact groups linked to the host which will receive notifications.

Services

Viewing

To view the status of services, go to the Monitoring > Status Details > Services menu.

../_images/04servicelist.png

The grey search bar can be used to filter the results displayed. You can use filters to adapt the view.

  • To view the service problems that have not been acknowledged, choose Unhandled Problems in the Service Status filter.

  • To view all the services in a non-OK status, choose Service Problems in the Service Status filter.

  • To view all the services, choose All in the Service Status filter.

  • To view all the services (sorted by host) in any status, click on the Services Grid menu and choose Display Details.

../_images/04servicelistbyhostdetail.png

  • To view the number of services (sorted by host and by status), click on the Services Grid menu and choose Display Summary.

../_images/04servicelistbyhost.png

  • To view all the services (sorted by host group) in any status, click on the Services by Hostgroup menu and choose Display Details.

../_images/04servicelistbyhostgroupdetail.png

  • To view the number of services (sorted by host group), click on the Services by Hostgroup menu and choose Display Summary.

../_images/04servicelistbyhostgroup.png

  • To view all the services (sorted by service group), click on the Services by Servicegroup menu and choose Display Details.

../_images/04servicelistbyservicegroupdetail.png

  • To view the number of services (sorted by service group), click on the Services by Servicegroup menu and choose Display Summary.

../_images/04servicelistbyservicegroup.png

  • To view the meta services, go to the Monitoring > Status Details > Services menu and filter on the meta host.

../_images/04metaservices.png

Services table

The table below gives a description of all the columns of the table displayed when viewing services:

Column name

Description

S

Displays the severity level of the service.

Host

Displays the name of the host. The url_link icon gives access to the host's details page.

Services

Displays the name of the service. The nonotifications icon indicates that notifications are disabled for this service. The graphperformances icon displays the performance graphs of the service. The url_link icon gives access to the service's details page.

Duration

Displays how long the service has kept its current status.

Last Check

Displays the date and time of the last check.

Tries

Displays the number of attempts before status validation.

Status information

Displays the message explaining the status of the service.

Note

The severity column and the associated filter appear if at least one object displayed has a severity level.

Note

The Hard State Duration column does not appear when viewing all the services.

Tables of objects groups

The table below gives a description of all the columns of the table of services sorted by groups:

Column name

Description

Host or Host Groups Hosts or Service Group Hosts

Displays the hosts, the hosts linked to host groups, or the hosts linked to service groups. The showservices icon displays all the services linked to the host. The graphperformances icon displays all the performance graphs of the services linked to the host.

Status

Displays the status of the host.

Services information

Displays the status of services (details mode) or the number of services classified by status (summary mode).

Service details

When you click on a service, the following screen appears:

../_images/04servicedetail.png

Status details

The table below summarizes the attributes of this part:

Attributes

Description

Service Status

Displays the status of the service.

Status information

Displays the information of the status of the service.

Extended status information

Displays long output of the service.

Performance Data

Displays performance data associated to the check.

Current Attempt

Displays the number of attempts before status validation.

State Type

Displays the type of state (‘SOFT’ or ‘HARD’).

Last Check Type

Indicates if the last type of check is ‘active’ or ‘passive’.

Last Check

Displays the last check of the service.

Next Scheduled Active Check

Displays the next scheduled check of the service.

Latency

Displays the latency in seconds between the scheduled check and the real check execution.

Check Duration

Displays the execution time of the check.

Last State Change

Displays when the status of the service changed.

Current State Duration

Displays the date and time from which the service is in the present state.

Last Service Notification

Displays the date and time at which the last notification was sent.

Current Notification Number

Displays the number of sent notifications.

Is This Service Flapping?

Indicates if the service is in flapping state.

Percent State Change

Displays the percentage change of state.

In Scheduled Downtime?

Indicates if the service is in scheduled downtime.

Last Update

Displays the date and time of the last information update.

Options and controls available

Options and controls allow you to perform a number of actions on the service. Options are described in the chapter Exploitation guide.

Detailed graph and status graph

The Detailed Graph and Status Graph display the performance graphs and the status history chart for this service.

Host and Service Shortcuts

Options are described in the chapter Exploitation guide.

Links

The Links container displays:

  • The host groups to which this service is linked.

  • The service groups to which this service is linked.

  • The service categories to which this service is linked.

Notifications

The Notifications container displays the contacts and contact groups linked to the service which will receive notifications.

Downtime

To visualize downtimes:

  1. Go to the menu Monitoring > Downtimes > Downtimes

../_images/04downtimelist.png

The table below gives a description of all the columns:

Column name

Description

Host Name

Indicates the name of the host.

Services

Indicates the name of the service.

Start Time and End Time

Displays the start and end date and time.

Duration

Displays the duration of the downtime.

Author

Displays the name of the user who set the downtime.

Comments

Displays the comments linked to the downtime.

Started

Indicates whether the downtime has started.

Fixed

Indicates whether the start and end date and time are fixed.

Available filters

You can filter the result presented via the following filters:

  • Host: can be used to filter by name of host via an SQL LIKE type search.

  • Service: can be used to filter by name of service via an SQL LIKE type search.

  • Output: can be used to filter by output of services.

  • Author: can be used to filter by author.

  • Display Finished Downtimes: displays finished downtimes.

  • Display Recurring Downtimes: displays recurring downtimes.

Note

The search on text fields only starts after entry of at least 3 characters.

Comments

To visualize comments:

  1. Go to the menu Monitoring > Downtimes > Comments

../_images/04comments.png

The table below gives a description of all the columns:

Column name

Description

Host Name

Indicates the name of the host.

Services (only available on the services page)

Indicates the name of the service.

Entry Time

Displays the date and time when the comment was written.

Author

Displays the name of the user who wrote the comment.

Comments

Displays the content of the comment.

Persistent

Indicates if the comment is persistent when the monitoring engine restarts.

Available filters

You can filter the result presented via the following filters:

  • Host: can be used to filter by name of host via an SQL LIKE type search.

  • Service: can be used to filter by name of service via an SQL LIKE type search.

  • Output: can be used to filter by output of services.

Note

The search on text fields only starts after entry of at least 3 characters.