Production Site Monitoring with Zabbix
-
Monitoring requirements
-
Collect performance data for servers and devices
-
Efficient troubleshooting
-
Predict site issues
-
Triggers/events
-
Metrics and events architecture
-
Template development & KPI metrics
-
Core switch KPI
-
OS KPI
| Requirement | Metric | Trigger defined | Note |
|---|---|---|---|
| CPU utilization | vmware.hv.cpu.usage | 95%: {Template Virt VMware Hypervisor:vmware.hv.cpu.usage[{$URL},{HOST.HOST}].min(5m)}>95 | All ESXi hosts |
| Memory utilization | vmware.memory.usage.percent | 95%: {Template Virt VMware Hypervisor:vmware.memory.usage.percent[{$URL},{HOST.HOST}].min(5m)}>95 | |
| CPU utilization | system.cpu.util | 95% (Linux, Windows) | |
| Memory utilization | vm.memory.size | 95%; Windows: {Template OS Windows:vm.memory.size[pused].min(5m)}>90; Linux: {Template OS Linux:vm.memory.size[pused].min(5m)}>90 | |
| Disk utilization | vfs.fs.size | Linux: {Template OS Linux:vfs.fs.size[{#FSNAME},pfree].last(0)}<10; Windows: {Template OS Windows:vfs.fs.size[{#FSNAME},pfree].last(0)}<10 | |
| … | … | | |
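Most of the triggers in the table compare min(5m) against a threshold, which means the alert fires only when every sample in the last five minutes is above that threshold, so a single short spike does not raise it. The Python sketch below only illustrates that rule; it is not how Zabbix evaluates triggers internally, and the names `Sample`, `min_over_window` and `sustained_above` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    clock: int    # Unix timestamp of the sample, in seconds
    value: float  # metric value, e.g. CPU usage in percent


def min_over_window(samples, now, window_s=300):
    """Return the lowest value seen in the last window_s seconds, or None."""
    recent = [s.value for s in samples if now - window_s < s.clock <= now]
    return min(recent) if recent else None


def sustained_above(samples, now, threshold=95.0, window_s=300):
    """Mimic `min(5m) > threshold`: true only if all recent samples exceed it."""
    lowest = min_over_window(samples, now, window_s)
    return lowest is not None and lowest > threshold
```

With samples of 97, 99 and 96 inside the window the check is true; if one of them drops to 90, the minimum is 90 and the check is false.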
6. Zabbix configurations
6.1 Display only unacknowledged triggers
Change /usr/share/zabbix/dashboard.php from
$dashconf['extAck'] = 0;
to
$dashconf['extAck'] = 1;
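If the same change has to be made on several front-end hosts, it can be scripted. The sketch below is one possible way to do it in Python and works on the file contents as a string; it assumes the setting is written with plain ASCII single quotes, and the function name `show_unacked_only` is hypothetical. Reading and writing dashboard.php, and keeping a backup copy, are left to the caller.

```python
import re

# Matches the extAck assignment whatever its current numeric value is.
_EXT_ACK = re.compile(r"(\$dashconf\['extAck'\]\s*=\s*)\d+(\s*;)")


def show_unacked_only(php_source: str) -> str:
    """Return php_source with $dashconf['extAck'] set to 1."""
    return _EXT_ACK.sub(r"\g<1>1\g<2>", php_source)
```

Lines that do not contain the setting are returned unchanged, so running the function twice gives the same result as running it once.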
6.2 Fire the trigger only before 1:00 AM or after 6:00 AM, because backups run in between
{Template OS Linux:system.cpu.util[,idle].max(5m)}<5&
({Template OS Linux:system.cpu.util[,idle].time(0)}<010000|
{Template OS Linux:system.cpu.util[,idle].time(0)}>060000)
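The expression combines two conditions: CPU idle has stayed below 5% for five minutes, and the current time, which time(0) returns as an HHMMSS number, falls outside the 01:00 to 06:00 backup window. The Python sketch below restates that logic so it can be checked against sample times; the names `hhmmss` and `should_alert` are hypothetical, and the code is an illustration rather than Zabbix's own evaluation.

```python
from datetime import time


def hhmmss(t: time) -> int:
    """Encode a time of day as an HHMMSS integer, e.g. 01:00:00 -> 10000."""
    return t.hour * 10000 + t.minute * 100 + t.second


def should_alert(idle_samples, now: time) -> bool:
    """True if max idle over the window is below 5% and now is outside 01:00-06:00."""
    if not idle_samples:
        return False  # no data in the window, nothing to evaluate
    cpu_busy = max(idle_samples) < 5
    outside_backup = hhmmss(now) < 10000 or hhmmss(now) > 60000
    return cpu_busy and outside_backup
```

Because both comparisons are strict, exactly 01:00:00 and 06:00:00 count as inside the backup window and do not alert.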
The post Site monitoring with Zabbix appeared first on Robert Chen.