ETL

Execution modes

The Centreon MBI reporting database, or “data warehouse”, is updated every day with aggregated data calculated by the ETL. Depending on the size of your platform, this can represent millions of rows processed every day. This is why the ETL and the data warehouse are two critical components that you need to understand.

The ETL has two modes:

  • Daily mode: this is the regular mode when your Centreon MBI reporting platform is up and running. Centreon data are imported every day, transformed and inserted into the reporting database. This mode is differential: it imports and calculates only the data generated on Centreon during the previous day. More precisely:
    • data_bin is imported differentially and aggregations are calculated only for the previous day
    • the hoststateevents and servicestateevents tables are imported in full, but events are calculated differentially

The daily run can take from a few seconds to several minutes, depending on the size of your monitored environment (1K, 10K, 100K services…).

This daily mode is configured as a cron job that you can find in /etc/cron.d/centreon-bi-engine:

#30 4 * * * root /usr/share/centreon-bi/etl/centreonBIETL --daily >> /var/log/centreon-bi/centreonBIETL.log 2>&1

Warning

Do not execute this script more than once on the same day, or data will be duplicated.

  • Rebuild mode: this is the mode used after installing the Centreon MBI platform or in case of data corruption. In this mode you import and calculate data over a defined period, based on the retention parameters, using the same script as the daily mode but with the “--rebuild” option.

Example

/usr/share/centreon-bi/bin/centreonBIETL -r
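
The warning about the daily mode (never run it twice on the same day) can be enforced with a small wrapper. The sketch below is only an illustration: the function name and the stamp file are hypothetical and are not part of Centreon MBI. It records the date of the last successful run and refuses to start the command again on that date.

```shell
#!/bin/sh
# Hypothetical guard: run a command at most once per calendar day.
# The stamp file location is an assumption, not a Centreon MBI file.

run_once_per_day() {
    stamp_file=$1
    shift
    today=$(date +%Y-%m-%d)

    # Refuse to run if the stamp already records today's date.
    if [ -f "$stamp_file" ] && [ "$(cat "$stamp_file")" = "$today" ]; then
        echo "already executed on $today, skipping" >&2
        return 1
    fi

    # Run the command; record the date only if it succeeded.
    "$@" || return $?
    echo "$today" > "$stamp_file"
}

# Possible usage for a manual daily run (commented out on purpose):
# run_once_per_day /var/tmp/centreonBIETL.stamp /usr/share/centreon-bi/bin/centreonBIETL -d
```

The stamp is written only after a successful run, so a failed execution can be retried on the same day.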

To keep execution times acceptable and to handle all the data generated by your Centreon platform, pay close attention to three points when installing Centreon MBI: hardware configuration, storage sizing and MySQL optimization. All recommendations can be found in our online documentation, in the Architecture and Pre-requisites chapters.

Note

It is highly recommended to monitor the reporting database using the procedure available in the “Advanced configuration” chapter. If the ETL has not run for a few days, or if the raw data were not up to date for a few days, a specific rebuild of the missing days has to be started: the ETL does not automatically re-import and recalculate missing days. Do not hesitate to contact Support for help.

Performance

If the ETL calculation seems too long, in daily or rebuild mode, you should consider optimizing your reporting server by:

  • optimizing the MySQL configuration
  • storing the database on high-performance disks (to avoid I/O wait)
  • adding more physical memory (and optimizing the configuration accordingly)
  • not sharing the storage or the database with other applications

Execution Options

Different options can be passed to the script to execute specific actions:

-c  Create the reporting database model
-d  Daily execution to calculate statistics on yesterday
-r  Rebuild mode to calculate statistics on a historical period. Can be used with:
    Extra arguments for options -d and -r (if none of the following is specified, these are selected by default: -IDEP):
-I  Extract data from the monitoring server
    Extra arguments for option -I:
    -C  Extract the Centreon configuration database only. Works with option -I.
    -i  Ignore perfdata extraction from monitoring server
    -o  Extract only perfdata from monitoring server

-D  Calculate dimensions
-E  Calculate event and availability statistics
-P  Calculate perfdata statistics
    Common options for -rIDEP:
    -s  Start date in format YYYY-MM-DD.
        By default, the program uses the data retention period from Centreon MBI configuration
    -e  End date in format YYYY-MM-DD.
        By default, the program uses the data retention period from Centreon MBI configuration
    -p  Do not empty statistic tables, delete only entries for the processed period.
        Does not work on raw data tables, only on Centreon MBI statistics tables.

If no start or end date is given to the ETL script, both dates are calculated automatically from the retention parameters configured in the interface under General Options > Data retention parameters.
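
To pass an explicit window instead, the start date can be derived from a number of retention days. The sketch below only illustrates the date arithmetic; how the ETL itself derives its default window is not described here. It assumes GNU date, and the function name and the 365-day value are examples. The command is printed, not executed.

```shell
#!/bin/sh
# Print the date (YYYY-MM-DD) that lies RETENTION_DAYS before END_DATE.
# Requires GNU date (-d option).
retention_start() {
    date -u -d "$1 -$2 days" +%Y-%m-%d
}

end_date=2018-01-01
retention_days=365   # example value: read yours from the data retention options

start_date=$(retention_start "$end_date" "$retention_days")

# Only print the rebuild command with explicit -s/-e dates.
echo "/usr/share/centreon-bi/bin/centreonBIETL -r -s $start_date -e $end_date"
```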

Change history

Centreon MBI logs every change that concerns the relations between hosts, services, groups and categories.

For example:

  • A host “H1” belongs to the host group “G1” in January 2012
  • The host “H1” no longer belongs to the group “G1” from February 1st, 2012
  • After this change, if a report is generated for the group “G1” on the reporting period of January 2012, the statistics of the host “H1” are taken into account in the statistics of the group “G1”
  • The statistics of the host “H1” are not taken into account for the group “G1” if the selected reporting period is February 2012
  • If the reporting period starts on January 15th and ends on February 15th, the statistics of the host “H1” are counted in the statistics of the group “G1” only from January 15th to January 31st
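
The rule illustrated by this example amounts to intersecting the reporting period with the period during which the host belonged to the group. The sketch below only illustrates that reasoning and is not how Centreon MBI implements it. The function names are hypothetical, and the membership start date (January 1st) is an assumption, since the example only says “in January 2012”.

```shell
#!/bin/sh
# Convert YYYY-MM-DD to the integer YYYYMMDD so dates compare numerically.
to_int() { echo "$1" | tr -d '-'; }

# overlap START1 END1 START2 END2
# Print the inclusive period covered by both intervals,
# or nothing when they do not overlap.
overlap() {
    start=$1; end=$2
    [ "$(to_int "$3")" -gt "$(to_int "$start")" ] && start=$3
    [ "$(to_int "$4")" -lt "$(to_int "$end")" ] && end=$4
    if [ "$(to_int "$start")" -le "$(to_int "$end")" ]; then
        echo "$start $end"
    fi
}

# Membership of H1 in G1 (assumed 2012-01-01 to 2012-01-31),
# intersected with a report from January 15th to February 15th.
overlap 2012-01-01 2012-01-31 2012-01-15 2012-02-15
```

With a reporting period of February 2012 the function prints nothing, which matches the case where H1 is not counted for G1.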

The initial Centreon configuration and the relations between objects must be clearly defined before installing Centreon MBI on a production platform. Each modification of a host, group or category configuration is considered a normal evolution in the life cycle of the component.

Purge

The data purge can be activated in the General Options of Centreon MBI. It guarantees that the database respects the retention parameters configured in the general options. It has to be activated in the interface AND in the following cron file on the reporting server: /etc/cron.d/centreon-bi-purge

The reporting dimensions (combinations of groups/categories/hosts/services/metrics) that no longer have any data related to them in the reporting database are automatically deleted.

Centile statistics

To be able to use the new standard report “Monthly Network Percentile”, you have to activate the centile calculation. To do so, go to Reporting > Business Intelligence > General Options | ETL Tab, and configure the subsection “Centile parameters” as described below to create relevant centile/timeperiod combination(s). If you are not interested in that report, just leave the default values for the parameters.

Parameter                                                                               Value
Calculating centile aggregation by                                                      Month (at least)
Select service categories to aggregate centile on                                       Select at least one of YOUR traffic service categories
First day of the week                                                                   Monday (default)
Create centile-timeperiod combination(s) that fit your needs (Centile format: 00.0000)  Create at least one combination. Ex: 99.0000 - 24x7

See the example in the screenshot below:

../_images/centileParameters.png

Only service categories selected in “Reporting perimeter selection” will appear in the list of service categories available for centile statistics.

You can create as many centile-timeperiod combinations as you want. Be aware that the more you create, the longer the calculation takes. Start with a small number of combinations to see the impact on build time.

Note

If you want to generate the centile traffic report on historical data, execute the following steps. If you only need this report for the future (data will be calculated from now on), skip to the next chapter.

On the reporting server, execute the following command to import the new configuration:

/usr/share/centreon-bi/bin/centreonBIETL -rIC

Then, execute the following command to update the centile configuration in the data warehouse:

/usr/share/centreon-bi/etl/dimensionsBuilder.pl -d

Finally, execute the following command to calculate only centile statistics:

/usr/share/centreon-bi/etl/perfdataStatisticsBuilder.pl -r --centile-only
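
Since each of these three commands depends on the previous one, they can be chained so that the sequence stops at the first failure. The following is a minimal sketch under stated assumptions: the function names and the DRY_RUN variable are hypothetical helpers, and by default the commands are only printed so that the sequence can be reviewed first.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

# Print a command, then run it unless dry-run mode is active.
run_step() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] && return 0
    "$@"
}

# Chain the three steps; "&&" stops the sequence at the first failure.
rebuild_centile_history() {
    run_step /usr/share/centreon-bi/bin/centreonBIETL -rIC &&
    run_step /usr/share/centreon-bi/etl/dimensionsBuilder.pl -d &&
    run_step /usr/share/centreon-bi/etl/perfdataStatisticsBuilder.pl -r --centile-only
}

# Possible usage (commented out on purpose):
# rebuild_centile_history
```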

How to apply a new configuration to historical data?

Warning

This procedure deletes all the data already calculated and recalculates them. This also applies to the dimensions already calculated: they are all deleted and recalculated based on the latest Centreon configuration.

When working on Centreon reporting, you might make multiple modifications to host groups, categories and service categories to best describe your business and organization. Because of the changing dimension management, in “daily” mode the modifications you make are available in the reporting the day after.

When your work on the groups and categories is finished, you can apply the new configuration to the past using the procedure below.

This procedure does not include the import of raw data, so make sure all the data imported from Centreon are up to date on your reporting server. To check that, execute the following command:

#/usr/share/centreon-bi/etl/centreonbiMonitoring.pl --db-content

Make sure that you see “ETL OK - Database is up to date”, OR that the following tables are not listed:

  • data_bin
  • hoststateevents
  • servicestateevents
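
This check can be scripted as a simple yes/no gate. The sketch below is only an illustration: the function name is hypothetical, and it assumes that late tables are listed as “[Table name, last entry: …]”, as in the plugin output shown elsewhere in this chapter.

```shell
#!/bin/sh
# Return 0 when the plugin output shows that raw Centreon data are up to date:
# either the OK message is present, or none of the three raw tables is listed.
raw_data_up_to_date() {
    output=$1
    case $output in
        *"ETL OK - Database is up to date"*) return 0 ;;
    esac
    # "Table <name>," is the listing format assumed for late tables.
    if echo "$output" | grep -Eq 'Table (data_bin|hoststateevents|servicestateevents),'; then
        return 1
    fi
    return 0
}

# Possible usage (commented out on purpose):
# raw_data_up_to_date "$(/usr/share/centreon-bi/etl/centreonbiMonitoring.pl --db-content)" \
#     && echo "raw data are up to date"
```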

The ETL commands provided below are executed without start and end parameters, which means the calculations are based on the retention parameters defined in Centreon MBI > General Options > Data retention tab. If you are currently installing or starting with the product, consider setting the retention to 3 months so that the rebuild does not take too long. After checking that your configuration is OK, execute the same procedure again with the retention set to the value you want (365 days, for instance).

../_images/etl_dataRetention.png

Import the latest Centreon configuration

#/usr/share/centreon-bi/etl/importData.pl -r --centreon-only

Calculate reporting dimensions

The following command erases all the changes tracked by the reporting mechanism. If you do not want that, replace -r with -d:

#/usr/share/centreon-bi/etl/dimensionsBuilder.pl -r

Aggregate events and availability

#nohup /usr/share/centreon-bi/etl/eventStatisticsBuilder.pl -r > /var/log/centreon-bi/rebuildAllEvents.log &

Aggregate performance data

#nohup /usr/share/centreon-bi/etl/perfdataStatisticsBuilder.pl -r > /var/log/centreon-bi/rebuildAllPerf.log &

How to rebuild missing reporting data?

You need this procedure when the monitoring plugin of your reporting server tells you that data are not up to date on your reporting server. This may happen when the Centreon MBI ETL fails and/or when errors appear in Centreon data.

Example of plugin output indicating that your database is not up to date:

# /usr/share/centreon-bi/etl/centreonbiMonitoring.pl --db-content
[Table mod_bam_reporting, last entry: 2017-07-01 00:00:00] [Table mod_bi_ba_incidents, last entry: 2017-07-01 00:00:00] [Table hoststateevents, last entry: 2017-07-01 00:00:00]
[Table servicestateevents, last entry: 2017-07-01 00:00:00] [Table mod_bi_hoststateevents, last entry: 2017-07-01 00:00:00]
[Table mod_bi_servicestateevents, last entry: 2017-07-01 00:00:00] [Table mod_bi_hostavailability, last entry: 2017-07-01 00:00:00]
[Table mod_bi_serviceavailability, last entry: 2017-07-01 00:00:00] [Table data_bin, last entry: 2017-08-01 00:00:00] [Table mod_bi_metricdailyvalue, last entry: 2017-08-01 00:00:00]
[Table mod_bi_metrichourlyvalue, last entry: 2017-08-01 23:00:00]

  • If you only see mod_bi_* tables, the problem only concerns aggregated data, not Centreon data.

    In that case, ignore the “Import missing data” section below.

  • If you see any of the following tables, there is a problem with the raw data imported from Centreon:

    • hoststateevents
    • servicestateevents
    • all the mod_bam_reporting* tables
    • data_bin

In this case, you have to make sure the problems are resolved on the Centreon side and then execute the procedure below.
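
To read a long plugin output more easily, the listed tables can be extracted with their last entry date and classified according to the rules above. This is a minimal sketch, assuming the “[Table name, last entry: date time]” format shown in the example. The function name is hypothetical, and it relies on grep -o and sed -E, which GNU and BSD tools provide.

```shell
#!/bin/sh
# Read plugin output on stdin and print one "kind table date" line per
# listed table, where kind is raw, aggregated or unknown.
classify_late_tables() {
    grep -oE 'Table [A-Za-z0-9_]+, last entry: [0-9]{4}-[0-9]{2}-[0-9]{2}' |
    sed -E 's/^Table ([A-Za-z0-9_]+), last entry: (.*)$/\1 \2/' |
    while read -r table day; do
        case $table in
            hoststateevents|servicestateevents|data_bin|mod_bam_reporting*)
                kind=raw ;;          # raw data imported from Centreon
            mod_bi_*)
                kind=aggregated ;;   # statistics calculated by the ETL
            *)
                kind=unknown ;;
        esac
        echo "$kind $table $day"
    done
}

# Possible usage (commented out on purpose):
# /usr/share/centreon-bi/etl/centreonbiMonitoring.pl --db-content | classify_late_tables
```

The dates printed for each table are the ones to check when choosing the start date of the rebuild commands.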

Pre-requisites

Before executing the commands of the procedures, check that:

  • The Centreon platform is up and running and its data are up to date
  • The daily centreonBIETL cron job is disabled (that is, commented out) on the reporting server, in the file /etc/cron.d/centreon-bi-engine. It has to be re-enabled at the end of the procedure
  • The dataRetentionManager.pl script is disabled (that is, commented out) on the reporting server, in the file /etc/cron.d/centreon-bi-purge. It has to be re-enabled at the end of the procedure
  • The retention is set in the Centreon MBI interface and is shorter than 1024 days
  • The retention is enabled in the interface
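
The two cron pre-requisites can be verified from the command line before starting. The sketch below is one possible way to do it: the function name is hypothetical, and it simply checks that no uncommented line of a cron file mentions a given script.

```shell
#!/bin/sh
# cron_entry_disabled CRON_FILE SCRIPT_NAME
# Return 0 when no uncommented line of CRON_FILE mentions SCRIPT_NAME.
cron_entry_disabled() {
    cron_file=$1
    script_name=$2
    # Drop comment lines, then look for the script name in what remains.
    if grep -v '^[[:space:]]*#' "$cron_file" | grep -qF -- "$script_name"; then
        return 1
    fi
    return 0
}

# Possible usage (commented out on purpose):
# cron_entry_disabled /etc/cron.d/centreon-bi-engine centreonBIETL \
#     || echo "the daily ETL cron is still active"
# cron_entry_disabled /etc/cron.d/centreon-bi-purge dataRetentionManager.pl \
#     || echo "the purge cron is still active"
```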

Note

For the following commands, which may take a while, we advise you to use “screen” or “nohup” to protect against terminal disconnection.

Import missing data

  • Import data, without the performance data (data_bin table), starting from a specific date. This date depends on the retention defined in the “Availability” part of the Centreon MBI data retention parameters, in General Options.

    #nohup /usr/share/centreon-bi/etl/importData.pl -r -s 2017-06-01 -e 2018-01-01 --ignore-databin > /var/log/centreon-bi/rebuild_importDataEvents.log &
    
    Execution time: fast (a few minutes)

  • Import the data of data_bin, starting from the day of the last data present. Check the last data_bin entry in the plugin output to know the last imported data:

    #nohup /usr/share/centreon-bi/etl/importData.pl -r --no-purge --databin-only -s 2017-08-01 -e 2018-01-01 > /var/log/centreon-bi/rebuild_importDataBin.log &
    
    Execution time: fast (a few minutes), proportional to the number of days to re-import

Update reporting dimensions

  • Build the dimensions. The “-d” option is used so as not to lose the historical changes made in the configuration, but it implies a longer rebuild time. Do not use the “-r” option or you will have to rebuild all statistics:

    #nohup /usr/share/centreon-bi/etl/dimensionsBuilder.pl -d > /var/log/centreon-bi/rebuild_dimensions.log &
    

    Execution time: fast (a few seconds to a few minutes)

Rebuild missing events and availability data

  • Rebuild events from a specific date (depending on the retention set):

    #nohup /usr/share/centreon-bi/etl/eventStatisticsBuilder.pl -r --events-only > /var/log/centreon-bi/rebuild_events.log &
    

    Execution time: depends on the number of events. It can take several hours but should not exceed 24 hours. If the script does not finish within 24 hours, please contact Centreon support and/or consider optimizing your reporting server (physical memory / disk speed).

  • Rebuild the availability tables, starting from the day of the last data present. Check the mod_bi_hostavailability and mod_bi_serviceavailability dates in the plugin output to know the last built data:

    #nohup /usr/share/centreon-bi/etl/eventStatisticsBuilder.pl -r --no-purge --availability-only -s 2017-07-01 -e 2018-01-01 > /var/log/centreon-bi/rebuild_availability.log &
    

    Execution time: from a few minutes to several hours, proportional to the number of days to rebuild

Rebuild the missing performance data

  • Rebuild the missing performance statistics. Check the lowest date of mod_bi_metrichourlyvalue and mod_bi_metricdailyvalue in the plugin output to know the last calculated data:

    #nohup /usr/share/centreon-bi/etl/perfdataStatisticsBuilder.pl -r --no-purge -s 2017-08-01 -e 2018-01-01 > /var/log/centreon-bi/rebuild_perfData.log &
    

    Execution time: from a few minutes to several hours, proportional to the number of days to calculate. If the number of days to rebuild is greater than the hourly retention, you may ask the Helpdesk for help
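
To compare the number of days to rebuild with the hourly retention, the length of the rebuild window can be computed from its start and end dates. This is a minimal sketch assuming GNU date. The function names are hypothetical, and the retention value must be the one configured in your Centreon MBI data retention options.

```shell
#!/bin/sh
# Number of whole days between two YYYY-MM-DD dates (requires GNU date).
days_between() {
    start_s=$(date -u -d "$1" +%s)
    end_s=$(date -u -d "$2" +%s)
    echo $(( (end_s - start_s) / 86400 ))
}

# exceeds_hourly_retention START END HOURLY_RETENTION_DAYS
# Return 0 when the rebuild window is longer than the hourly retention.
exceeds_hourly_retention() {
    [ "$(days_between "$1" "$2")" -gt "$3" ]
}

# Possible usage, with the dates of the command above and a placeholder retention:
# exceeds_hourly_retention 2017-08-01 2018-01-01 "$HOURLY_RETENTION_DAYS" \
#     && echo "window longer than the hourly retention"
```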

What to do after the scripts have run?

  • Case 1: the rebuild is done on the same day

    Uncomment the lines in /etc/cron.d/centreon-bi-engine and /etc/cron.d/centreon-bi-purge and restart the cron service:

    #systemctl restart crond

  • Case 2: the rebuild finishes the next day

    • Uncomment the lines in /etc/cron.d/centreon-bi-engine and /etc/cron.d/centreon-bi-purge and restart the cron service:

      #systemctl restart crond

    • Manually execute the daily script:

      #/usr/share/centreon-bi/bin/centreonBIETL -d

  • Case 3: in other cases

    Follow the partial rebuild procedure for the missing days.

    Example: the reconstruction took 4 days, from 01/01 to 04/01. You need to follow the procedure from the beginning, using 01/01 as the start date and 04/01 as the end date.

Once the procedure is over, the output of the BI monitoring plugin should be “ETL execution OK, database is up-to-date”.

Centreon BAM statistics

If you recently updated your Centreon BAM module to 3.0 or you just rebuilt Centreon BAM statistics, you have to re-import data on the reporting server. To achieve that, execute the following command:

/usr/share/centreon-bi/etl/importData.pl -r --bam-only

It will import all Centreon BAM reporting tables.

Note

To rebuild the BAM statistics, execute the following command:
/usr/share/centreon/www/modules/centreon-bam-server/engine/centreon-bam-rebuild-events --all