Monitor Linux/Windows VM and applications¶

The configuration of the monitoring for Linux/Windows virtual machines and applications is implemented using fully automated GitOps workflows. You can access those workflows in your administration environment.

Note

In this tutorial, please replace the following values:

ZONE_NAME with the name of the administrative zone (it starts with ocb-).
CUSTOMER_ENVIRONMENT with the name of the customer environment where virtual machine is located

Principle¶

To monitor a virtual machine (VM) located in one of your customer environments, you have to:

install system or application(s) specific Prometheus exporter(s) on this VM
declare Prometheus exporters informations in corresponding Git repository in your administration environment

Internal Caascad CI/CD pipelines will validate syntactically exporters informations, create and deploy automatically all necessary Prometheus scrapping configurations.

Important

Ensure that network connectivity exists between Kubernetes® cluster and VM to monitor and it is possible to establish a connection between Prometheus that is running inside the cluster and VM exporter network port.

Install Prometheus exporters¶

According to your monitoring requirements, you can deploy any Prometheus exporter.

There is a wide variety of official and third-party exporters available, so check this list to see if one of them meets your needs.

Here are some installation examples of operating system specific Prometheus exporter:

GitOps Workflow¶

Follow GitOps workflow documentation with feature_name=monitoring-vm

Several actions are possible:

add/modify informations about the VM(s) which must be monitored
add/modify informations about exporter(s) used to monitor VMs applications
remove monitored VM informations
remove monitored exporter information

Values¶

This section describes all possible configuration values to be specified in Git repository.

Values format¶

Exporters¶

exporters section contains informations about exporters deployed on the VMs from which Prometheus will scrape the metrics.

Field	Description	Scheme	Required
name	Name of exporter	string: must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character	true
port	Port the exporter metrics is exposed on	int : between 1024 and 65535	true
interval	Interval at which metrics should be scraped	string : must consist of numbers and must end with `s`	true
path	HTTP path to scrape for metrics	string : must start with `/`	true
relabelings	Add, modify metrics labels	[]*RelabelConfig	false
addresses	Addresses specifies the VMs for which Prometheus will scrap the metrics	addresses	true

Addresses¶

Addresses section contains informations about the VM itself.

Field	Description	Scheme	Required
hostname	VM's name. Useful to differentiate from which VM the metrics come.	string : must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character	true
ip	VM's private IP. IPs must be private.	string	true

Example¶

exporters:
  - name: node-exporter # OS metrics exporter for Linux
    port: 9100
    interval: 15s
    path: /metrics
    relabelings:
      - replacement: demo_vm_node_exporter
        targetLabel: demo_label
    addresses:
      - hostname: vm-linux-1
        ip: 10.0.59.230
      - hostname: vm-linux-2
        ip: 10.0.21.114
  - name: windows-exporter # OS metrics exporter for Windows
    port: 9182
    interval: 30s
    path: /metrics
    addresses:
      - hostname: vm-windows-3
        ip: 10.0.59.235
      - hostname: vm-windows-4
        ip: 10.0.59.236

In this example, four VMs are monitored:

VM 1: ip 10.0.59.230
VM 2 : ip 10.0.21.114
VM 3 : ip 10.0.59.235
VM 4 : ip 10.0.59.236

For the first two VMs, the installed exporter is node_exporter.

For the last two VMs, it's windows_exporter.

Example of metrics retrieved for the last two VMs:

windows_cpu_processor_performance{cc_client="ocb-demo", cc_prom="cloud-app", cc_prom_source="janeway", cc_vm_source="vm-windows-3", core="0,0", endpoint="metrics", instance="10.0.59.235:9182", job="caascad-windows-exporter", namespace="caascad-monitoring-vm", prometheus="monitoring-app/app-prometheus", service="caascad-windows-exporter"} 2168316033206
windows_cpu_processor_performance{cc_client="ocb-demo", cc_prom="cloud-app", cc_prom_source="janeway", cc_vm_source="vm-windows-4", core="0,0", endpoint="metrics", instance="10.0.59.236:9182", job="caascad-windows-exporter", namespace="caascad-monitoring-vm", prometheus="monitoring-app/app-prometheus", service="caascad-windows-exporter"} 2060335251087

The important labels are:

cc_prom_source: identifies customer environment.
cc_vm_source: identifies monitored VM. The value of this label is equal to the value of hostname in the configuration file.
instance: ip address of the VM with the port of exporter.
job: the value of this label is equal to caascad-name where name is the value of name in the configuration file.

You can add your own labels in relabelings section.

Troubleshooting¶

If the deployment went well, but metrics are not visible in Grafana, you can check Prometheus target status:

connect to customer environment

do a port-forward on Prometheus

kubectl port-forward -n caascad-monitoring svc/caascad-prometheus 9090:9090 &

go to this URL
```
http://localhost:9090/targets
```
check the error message

Connectivity issues:

Bad path:

If there are connectivity problems, check the private IP of the VM and the security group ...

Alerting¶

It is possible to create PrometheusRules to alert when metrics are no longer scrapped by Prometheus.

Go to: https://git.ZONE_NAME.caascad.com/MonitoringApp/alerts

And follow the README.md to add an alert:

 - alert: MonitoringVMDown
    annotations:
      description: monitoring vm is down
      message: '{{ $value }}% of the {{ $labels.cc_vm_source }} targets are down.'
    expr: 100 * (count(up{cc_vm_source!=""} == 0) BY (cc_prom_source, cc_vm_source, job, namespace, service) /
          count(up{cc_vm_source!=""}) BY (cc_prom_source, cc_vm_source, job, namespace, service)) > 10
    for: 10m
    labels:
      severity: warning

When there is a problem with the VM, an alert will be raised:

Grafana Optionals dashboards¶

To visualise system metrics for Linux or Windows hosts, some predefined Grafana dashboards can be enabled in your customer administration environment.

This is the list of the optional dashboards maintained by Caascad team:

Dashboard Name	Description	Optional Feature Required
`linux-node-exporter`	A dashboard to monitor VMs running on Linux	`monitoring-vm` `grafana-dashboards`
`windows-node-exporter`	A dashboard to monitoring VM running on Windows	`monitoring-vm` `grafana-dashboards`

GitOps Workflow¶

Follow GitOps workflow documentation with feature_name=grafana-dashboards

Several actions are possible:

add optional dashboard to Grafana
remove optional dashboard from Grafana

Values¶

This section describes all possible configuration values to be specified in files in your Caascad Git repositories.

Values format¶

Grafana Dashboards¶

grafana-dashboards section contains informations about the Grafana dashboards that will be deployed on the Grafana application of your zone.

Field	Description	Scheme	Required
`<dashboard_name>`	Name of the dashboard	string	true
`<dashboard_name>.enabled`	Enable/disable dashboard	boolean	true

Example¶

---
grafana_dashboards:
  linux-node-exporter:
    enabled: true
  windows-node-exporter:
    enabled: true

In this example, the dashboards linux-node-exporter and windows-node-exporter will be installed.