When running any form of hosted web service it is important that you know when your server is not working as it should. For that reason, I've written a simple script which helps me monitor the current status of several servers and notifies me when there are problems.
Server Setup
For this to work in a sensible way you need at least two separate servers, ideally in separate geographical locations. The servers do not have to be dedicated or virtual machines; our monitoring script, for example, runs on a shared hosting server from BWF. In fact, I would recommend running the monitoring script from a managed host because (unless you intend to set up two copies of this code) you will not be notified if the server with the script on it stops working.
This diagram should illustrate how the system works:
Server Load
For simplicity, I'm going to presume that you know a little about the setup of the servers you wish to monitor. I'm an Ubuntu user, so the code below was written specifically for Ubuntu. I have also written a version for CentOS; it's just a matter of changing some file paths or variable names which can be OS dependent.
Anyway, the idea of this portion is to install a small PHP script on the server you intend to monitor. Its purpose is to provide the status monitor with a few details about the server's load and which services are running. The version I have provided is commented, but obviously you can remove the comments. I tend to call my copies of this status.php, but you can call yours whatever you wish.
<?php
// If this file loads, Apache must be up
echo "apache=1;";

// Check if the MySQL service is running by connecting to a local database.
// mysqli is used because the older mysql_connect() was removed in PHP 7.
mysqli_report(MYSQLI_REPORT_OFF); // return false on failure instead of throwing
if (@mysqli_connect('localhost', 'Your Username', 'Your Password')) {
    echo "mysql=1;";
} else {
    echo "mysql=0;";
}

// Look at the average CPU load from a file located at /proc/loadavg
$cpu_power = explode(" ", file_get_contents("/proc/loadavg"));
echo "cpu1=" . $cpu_power[0] . ";"; // Load over 1 minute
echo "cpu2=" . $cpu_power[1] . ";"; // Load over 5 minutes
echo "cpu3=" . $cpu_power[2] . ";"; // Load over 15 minutes

// Calculate memory usage (as a percentage) from a file called /proc/meminfo.
// Fields are looked up by name because their line positions differ between
// kernel versions (newer kernels insert MemAvailable after MemFree).
$meminfo = array();
foreach (file("/proc/meminfo") as $line) {
    if (preg_match('/^(\w+):\s+(\d+)/', $line, $m)) {
        $meminfo[$m[1]] = (int) $m[2];
    }
}
$total_memory = $meminfo['MemTotal'];  // Total memory
$free_memory  = $meminfo['MemFree'];   // Free memory
$free_memory += $meminfo['Buffers'];   // Used by buffers
$free_memory += $meminfo['Cached'];    // Used by cache
$used_memory  = $total_memory - $free_memory;               // Calculate the amount used
$memory_usage = ceil(($used_memory / $total_memory) * 100); // Convert to a %
echo "memory=$memory_usage;";
?>
When this file is accessed, its output should look similar to this:
apache=1;mysql=1;cpu1=0.04;cpu2=1.05;cpu3=1.54;memory=27;
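To show how the monitor side might consume this output, here is a minimal sketch that splits the `key=value;` pairs into an array. The function name `parse_status` is hypothetical and is not taken from read-status.php; it only assumes the output format shown above.

```php
<?php
// Hypothetical helper: turn "key=value;" pairs into an associative array.
function parse_status(string $raw): array
{
    $status = [];
    foreach (explode(';', trim($raw)) as $pair) {
        // Skip the empty piece after the trailing ";" and anything malformed
        if ($pair === '' || strpos($pair, '=') === false) {
            continue;
        }
        list($key, $value) = explode('=', $pair, 2);
        // Numeric values become floats so they can be compared against thresholds
        $status[trim($key)] = is_numeric($value) ? (float) $value : $value;
    }
    return $status;
}

$status = parse_status('apache=1;mysql=1;cpu1=0.04;cpu2=1.05;cpu3=1.54;memory=27;');
```

After this call, `$status['cpu2']` holds the 5 minute load and `$status['memory']` the memory percentage, ready to be compared against whatever limits you choose.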
Reading the load and sending notifications
Probably the most important stage is to have this status.php file read and processed, and to have notification messages sent when errors are detected. Unfortunately the script I use to accomplish this is slightly too long and complex to embed here but you can download it here: read-status.phps
I'm hoping that most of it (at least the parts you need to edit) is self-explanatory. For those who are interested, here is a brief overview of what it does:
- It loops through an array you define of each server
- For each server it attempts to load the contents of the status.php file
- If it fails to load the file then it knows the server is offline
- If it manages to load the file it will then check the time it took to load, the memory usage and the average cpu load; if these are higher than desired it sets the status as "busy"
- It checks if MySQL is running; if not, it classes the server as being offline
- All the information it has collected about the server is stored in a cache file. These files get placed into a servers directory so make sure you create this directory and give the script permission to write to it.
- If the server is offline it sends notifications. On a first occurrence these go only to "level0" contacts, but on a second consecutive occurrence it sends notifications to everyone. This guards against brief blips that make a server look offline when it isn't.
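The classification and escalation rules in the list above can be sketched roughly as follows. This is not the code from read-status.php: the function names, the threshold values, the contact array layout, the `online` status name and the choice of `cpu2` as the load figure are all assumptions made for illustration.

```php
<?php
// Hypothetical thresholds: the article only says "higher than desired".
const MAX_LOAD_TIME = 5.0; // seconds taken to fetch status.php
const MAX_MEMORY    = 70;  // memory usage in percent
const MAX_CPU_LOAD  = 4.0; // 5 minute load average (cpu2)

// Decide a server's state from its parsed status
// (null means status.php could not be loaded at all).
function classify_server(?array $status, float $load_time): string
{
    if ($status === null || empty($status['mysql'])) {
        return 'offline';
    }
    if ($load_time > MAX_LOAD_TIME
        || ($status['memory'] ?? 0) > MAX_MEMORY
        || ($status['cpu2'] ?? 0) > MAX_CPU_LOAD) {
        return 'busy';
    }
    return 'online'; // assumed name for the healthy state
}

// Pick who to notify: only level 0 contacts on the first failure,
// everyone from the second consecutive failure onwards.
function contacts_to_notify(int $consecutive_failures, array $contacts): array
{
    if ($consecutive_failures < 1) {
        return [];
    }
    if ($consecutive_failures === 1) {
        return array_values(array_filter($contacts, function ($contact) {
            return $contact['level'] === 0;
        }));
    }
    return $contacts;
}

$contacts = [
    ['email' => 'oncall@example.com', 'level' => 0],
    ['email' => 'team@example.com',   'level' => 1],
];
```

Keeping the failure count in the cache file between runs is one way to tell a first failure from a repeated one, since each cron run starts from scratch.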
This read-status.php script needs to be run at regular intervals to be useful, which is done through cron. How you edit cron varies depending on which control panel (if any) your server uses, but the Unix format is always the same:
*/3 * * * * curl http://url-of-status-monitor/read-status.php > /dev/null 2>&1
My version uses curl to execute the file but there are other methods which would work just as well. The */3 basically means to run the cron job every 3 minutes.
Displaying load information
The load information is displayed using the cache files which are created for each server being monitored. The format I chose for displaying it can be seen on the Virtual Forums server status page.
<style type="text/css">
.forum-status {
font-size: 12px;
background-color: #f1f1f1;
border-bottom: 1px solid #5EABE7;
width: 95%;
min-width: 660px;
margin: auto;
}
.forum-status th {
border-right: 0px solid #5EABE7;
cursor: default;
}
.forum-status td {
text-align: center;
cursor: default;
border-left: 1px solid #5EABE7;
}
.forum-status .rhs {
border-right: 1px solid #5EABE7;
}
.usage {
border: 1px solid #3575A5;
width: 150px;
height: 15px;
}
.usage div {
float: left;
height: 15px;
}
</style>
<table class="forum-status" cellpadding="3" cellspacing="0" align="center">
<thead>
<tr class="inner_header">
<th>
Server
</th>
<th>
Status
</th>
<th>
CPU Usage
</th>
<th>
Memory Usage
</th>
<th>
Last Checked
</th>
</tr>
</thead>
<tbody>
<?php
include_once('servers.php');
// Loop through each server
foreach($servers as $server) {
// Check if it has an alert level -
// my method of deciding if it's important enough to be displayed
if($server['alert_level'] > 0) {
// Load its cache file
$status = @file_get_contents('servers/server-' . $server['id'] .'.txt');
$status = unserialize($status);
if(!empty($status)) {
$cpu_percent = $status['cpu2'];
if($cpu_percent == 0)
$cpu_percent = 0.1;
?>
<tr>
<td title="<?=$server['description'];?>">
<?=$server['name'];?>
</td>
<td>
<img src="/forum-hosting-images/<?=$status['status'];?>.png"
alt="Server is <?=$status['status'];?>" />
</td>
<td>
<?php if($status['status'] == 'offline') { ?>
Information Unavailable
<?php } else {
if($cpu_percent > 45) {
$background_color = "ad0000";
} else if($cpu_percent > 15) {
$background_color = "adad00";
} else {
$background_color = "00de2f";
}
?>
<div class="usage">
<div style="width: <?=$cpu_percent;?>%;
background-color: #<?=$background_color;?>;">
</div>
</div>
<?php } ?>
</td>
<td>
<?php if($status['status'] == 'offline') { ?>
Information Unavailable
<?php } else { ?>
<div class="usage">
<?php
if($status['memory'] > 70) {
$background_color = "ad0000";
} else if($status['memory'] > 55) {
$background_color = "adad00";
} else {
$background_color = "00de2f";
}
?>
<div style="width: <?=ceil($status['memory']);?>%;
background-color: #<?=$background_color;?>;">
</div>
</div>
<?php } ?>
</td>
<td class="rhs">
<?=date('H:i T', $status['updated']);?>
</td>
</tr>
<?php
}
}
}
?>
</tbody>
</table>
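The display code above relies on two things that are not shown: the `$servers` array from servers.php and the serialized cache file for each server. A minimal sketch of both follows, assuming only the keys the display code actually reads (`id`, `name`, `description`, `alert_level`, and `status`, `cpu2`, `memory`, `updated`). The helper names and sample values are hypothetical, and the real files may contain more.

```php
<?php
// Assumed shape of servers.php: only the keys read by the display code are shown.
$servers = [
    ['id' => 1, 'name' => 'Web 1', 'description' => 'Example web server', 'alert_level' => 1],
];

// Hypothetical helper: write one server's details to its cache file.
function write_cache(string $dir, int $id, array $status): bool
{
    $status['updated'] = time(); // shown in the "Last Checked" column
    $file = $dir . '/server-' . $id . '.txt';
    return file_put_contents($file, serialize($status)) !== false;
}

// Read it back the same way the display code does;
// returns false if the file is missing or unreadable.
function read_cache(string $dir, int $id)
{
    $raw = @file_get_contents($dir . '/server-' . $id . '.txt');
    return $raw === false ? false : unserialize($raw);
}

$dir = sys_get_temp_dir(); // the article uses a "servers" directory instead
write_cache($dir, $servers[0]['id'], ['status' => 'online', 'cpu2' => 1.05, 'memory' => 27]);
$cached = read_cache($dir, $servers[0]['id']);
```

Because the `status` value is also used as an image file name in the table, each status you store needs a matching .png in the images directory.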