
Site monitoring with Zabbix

Production Site Monitoring with Zabbix

  1. Monitoring requirements

  • Collect performance data for servers and devices
  • Troubleshoot efficiently
  • Predict site issues
  • Raise triggers/events

 

  2. Metrics and events architecture

[Figure: metrics and events architecture]

 

  3. Template development & KPI metrics

  4. Core switch KPI

  5. OS KPI

 

Requirement: CPU utilization
Metric: vmware.hv.cpu.usage
Threshold: 95%
Trigger defined: {Template Virt VMware Hypervisor:vmware.hv.cpu.usage[{$URL},{HOST.HOST}].min(5m)}>95
Note: ALL ESXi hosts

Requirement: Memory utilization
Metric: vmware.memory.usage.percent
Threshold: 95%
Trigger defined: {Template Virt VMware Hypervisor:vmware.memory.usage.percent[{$URL},{HOST.HOST}].min(5m)}>95

Requirement: CPU utilization
Metric: system.cpu.util
Threshold: 95%
Trigger defined:
  Linux: {Template OS Linux:system.cpu.util[,idle].max(5m)}<20
  Windows: {Template OS Windows:system.cpu.util[,,avg5].last()}>95

Requirement: Memory utilization
Metric: vm.memory.size
Threshold: 95%
Trigger defined:
  Windows: {Template OS Windows:vm.memory.size[pused].min(5m)}>90
  Linux: {Template OS Linux:vm.memory.size[pused].min(5m)}>90

Requirement: Disk utilization
Metric: vfs.fs.size
Trigger defined:
  Linux: {Template OS Linux:vfs.fs.size[{#FSNAME},pfree].last(0)}<10
  Windows: {Template OS Windows:vfs.fs.size[{#FSNAME},pfree].last(0)}<10
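Most of the triggers above wrap the item in a windowed function such as min(5m) or max(5m), so they react to sustained load rather than to a single spike. The following Python sketch illustrates that evaluation logic only. The function names and sample values are hypothetical and are not part of Zabbix; the sketch assumes the samples collected during the five-minute window are already available as a list.

```python
from typing import Sequence


def fires_min_above(samples: Sequence[float], threshold: float) -> bool:
    """Mimic `{...min(5m)}>threshold`: true only when the lowest sample
    in the window is above the threshold, i.e. every sample is."""
    return len(samples) > 0 and min(samples) > threshold


def fires_max_below(samples: Sequence[float], threshold: float) -> bool:
    """Mimic `{...max(5m)}<threshold`: true only when the highest sample
    in the window is below the threshold, i.e. every sample is."""
    return len(samples) > 0 and max(samples) < threshold


# Hypothetical five-minute windows of CPU usage (percent).
sustained_usage = [96.0, 97.5, 99.0, 96.2, 98.1]
single_spike = [40.0, 99.0, 45.0, 42.0, 41.0]

# Hypothetical five-minute window of CPU idle time (percent).
low_idle = [12.0, 15.0, 18.0, 9.5, 11.0]

print(fires_min_above(sustained_usage, 95))  # True: usage never dropped to 95 or below
print(fires_min_above(single_spike, 95))     # False: one spike is not enough
print(fires_max_below(low_idle, 20))         # True: idle never reached 20
```

Seen this way, the Linux CPU trigger (idle max(5m) below 20) corresponds to CPU usage staying above 80% for the whole window.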

 

     

 

 

6. Zabbix configurations

6.1 Display only unacknowledged triggers

In /usr/share/zabbix/dashboard.php, change

$dashconf['extAck'] = 0;

to

$dashconf['extAck'] = 1;
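If the same change has to be applied on several front ends, the edit can be scripted. The following is a minimal Python sketch, assuming the assignment appears in the file in the form shown above. The helper name set_ext_ack is hypothetical, and the function only transforms text; reading and writing the actual file (and keeping a backup) is left to the caller.

```python
import re


def set_ext_ack(php_source: str, value: int) -> str:
    """Return php_source with the numeric value assigned to
    $dashconf['extAck'] replaced by `value`."""
    pattern = r"(\$dashconf\['extAck'\]\s*=\s*)\d+(\s*;)"
    # A function replacement avoids ambiguity between the group
    # reference and the digit that follows it.
    return re.sub(
        pattern,
        lambda m: f"{m.group(1)}{value}{m.group(2)}",
        php_source,
    )


original = "$dashconf['extAck'] = 0;"
print(set_ext_ack(original, 1))  # $dashconf['extAck'] = 1;
```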

 

6.2 Fire the trigger only before 1:00 AM or after 6:00 AM, because of the backup

{Template OS Linux:system.cpu.util[,idle].max(5m)}<5&
({Template OS Linux:system.cpu.util[,idle].time(0)}<010000|
{Template OS Linux:system.cpu.util[,idle].time(0)}>060000)
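The time() function returns the current time as a number in HHMMSS form, so the expression above combines a CPU condition with a time-of-day condition. The Python sketch below restates that logic for illustration only. The function names are hypothetical, and the boundaries (010000 and 060000) are taken directly from the expression.

```python
def as_hhmmss(hour: int, minute: int, second: int) -> int:
    """Encode a time of day as the HHMMSS number that time() yields."""
    return hour * 10000 + minute * 100 + second


def should_alert(idle_max_5m: float, now_hhmmss: int) -> bool:
    """True when CPU idle stayed below 5% for the window AND the
    current time is outside the 01:00:00 to 06:00:00 backup window."""
    outside_backup = now_hhmmss < 10000 or now_hhmmss > 60000
    return idle_max_5m < 5 and outside_backup


print(should_alert(3.0, as_hhmmss(3, 30, 0)))   # False: inside the backup window
print(should_alert(3.0, as_hhmmss(0, 30, 0)))   # True: before 01:00
print(should_alert(3.0, as_hhmmss(7, 0, 0)))    # True: after 06:00
print(should_alert(10.0, as_hhmmss(7, 0, 0)))   # False: CPU condition not met
```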

The post Site monitoring with Zabbix appeared first on Robert Chen.

