Sunday, July 12, 2015

Manually Monitoring a Client with Shinken

Now that we've set up our Shinken server, we're going to want to monitor a client.

Prerequisites: Read this introductory guide to Shinken and Nagios, and set up your Shinken server first. You'll also need a second VM to act as the client, so make sure you're familiar with Vagrant and VirtualBox. This guide was written for Mac OS X users.

More Configuration for the Shinken Server

The first thing we'll want to do is make some changes to our Shinken server to prepare it for the client, so let's ssh back into that machine.

# ssh into the server vm
vagrant ssh hostname1

# log in as root user
sudo su

# switch to the shinken user
su shinken

# create a host file config for the client
vi /etc/shinken/hosts/client.cfg

Add the following content:
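As a rough sketch only, a host definition for the client might look something like the one below. The address 192.168.33.11 is a placeholder, and the template names and the _SNMPCOMMUNITY macro assume the ftp and linux-snmp packs that get installed in the next step, so adjust everything to match your own setup. The snippet writes the sample to a scratch file so you can review it before copying it into client.cfg.

```shell
# Sketch only: write a sample host definition to a scratch file for review.
# The address is a placeholder; the template names assume the linux-snmp
# and ftp packs are installed.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
define host {
    use             linux-snmp,ftp
    host_name       hostname2
    address         192.168.33.11
    _SNMPCOMMUNITY  snmpP@ss
}
EOF

# show the result
cat "$cfg"
```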

Now let's add some modules.

# search for the ftp module
shinken search ftp
ftp (naparuba) [pack,ftp,ftps] : FTP(s) checks

# install the ftp module
shinken install ftp

# search for the snmp module
shinken search snmp
hpux (claneys) [pack,hp,hpux,os,server,snmp] : Standard HPUX checks, like CPU, RAM and disk space. Checks are done by SNMP.
linux-snmp (naparuba) [pack,linux,snmp] : Linux checks based on SNMP

# install the snmp module
shinken install linux-snmp

# restart the server
service shinken restart

Configuring the Shinken Client

Open up a new terminal session on your Mac OS X machine and return to the shinken directory we created in the previous article.

# ssh into the client vm
vagrant ssh hostname2

# install snmp and snmpd
sudo apt-get install snmp
sudo apt-get install snmpd

# edit the snmpd config
sudo vi /etc/snmp/snmpd.conf

Make sure this line is commented out:

# agentAddress  udp:127.0.0.1:161

And this line is uncommented:

agentAddress udp:161,udp6:[::1]:161
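If you want to double-check the file after editing, a small helper like the one below lists only the agentAddress lines that are still active. The function name and the SNMPD_CONF override are just for illustration.

```shell
# Hypothetical helper: print the agentAddress lines that are NOT commented out.
active_agent_addresses() {
    grep -E '^[[:space:]]*agentAddress' "$1"
}

# default to the path used in this guide; override with SNMPD_CONF if needed
conf="${SNMPD_CONF:-/etc/snmp/snmpd.conf}"
if [ -r "$conf" ]; then
    active_agent_addresses "$conf" || echo "no active agentAddress line found"
fi
```

You should see only the udp:161 line, and not the 127.0.0.1 one.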

Now update the SNMP community string, which acts as the password.

# change this
rocommunity public  default    -V systemonly

# to this
rocommunity snmpP@ss

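Warning: If you get the error "ERROR: Description table : The requested table is empty or does not exist.", make sure the line reads rocommunity snmpP@ss and not rocommunity snmpP@ss default -V systemonly. The -V systemonly option limits the community to the systemonly view, which most likely hides the tables the checks are trying to read.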

Restart the SNMP service.

sudo service snmpd restart
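Before going back to the dashboard, it can be worth confirming that snmpd actually answers with the new community string. Here is a minimal sketch using snmpwalk from the snmp package installed above; the check_snmp function name is made up for this example. It walks the system subtree by its numeric OID, so it doesn't depend on any MIB files being installed.

```shell
# Hypothetical helper: walk the system subtree (OID 1.3.6.1.2.1.1) over SNMP v2c.
check_snmp() {
    # $1 = host, $2 = community string
    # -t 2 = two second timeout, -r 1 = one retry
    snmpwalk -v 2c -c "$2" -t 2 -r 1 "$1" 1.3.6.1.2.1.1
}

if command -v snmpwalk >/dev/null 2>&1; then
    check_snmp localhost 'snmpP@ss' || echo "SNMP query failed"
else
    echo "snmpwalk is not installed"
fi
```

To test from the Shinken server instead, replace localhost with the client's address.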

Debugging

At this stage, your dashboard is going to show a bunch of errors.

Get ready for a lot of warning blocks ahead!

SNMP Support Issues

Warning: If you get the following error: Can't locate Net/SNMP.pm in @INC (@INC contains: /usr/lib/perl5/5.8.5/i386-linux-thread-multi /usr/lib/perl5/5.8.5 /usr/lib/perl5/site_perl/..., you need to install SNMP support for Perl on the server (hostname1).

# install snmp support for perl
sudo apt-get install libnet-snmp-perl

# restart
sudo service shinken restart

Missing Utils Module Issue

Warning: If you get the following error: Can't locate utils.pm in @INC (you may need to install the utils module)..., you'll need to create a symlink.

# create symlink
ln -s /usr/lib/nagios/plugins/utils.pm /var/lib/shinken/libexec/utils.pm

# restart
sudo service shinken restart

References:
https://github.com/shinken-monitoring/pack-linux-snmp/issues/10
http://ubuntuforums.org/showthread.php?t=2177741
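If you end up repeating this step, a slightly more defensive version of the symlink command might look like the sketch below. The link_plugin_file name is made up for this example. It refuses to create a dangling link when the source file is missing (which could mean the Nagios plugins package isn't installed), and it can be re-run safely.

```shell
# Hypothetical helper: link a plugin support file only if the source exists.
link_plugin_file() {
    src="$1"
    dest="$2"
    if [ ! -e "$src" ]; then
        echo "source not found: $src" >&2
        return 1
    fi
    # -f replaces an existing link, -n avoids following an existing link to a directory
    ln -sfn "$src" "$dest"
}

# usage on the Shinken server (paths taken from the step above):
# link_plugin_file /usr/lib/nagios/plugins/utils.pm /var/lib/shinken/libexec/utils.pm
```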

Issues Connecting to the FTP Service

Warning: If you get the error Warning: connect to address localhost and port 21: Connection refused, you'll need to install vsftpd on the client machine (hostname2).

# install vsftpd on the client
sudo apt-get install vsftpd

# test that ftp is working
telnet localhost 21
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 (vsFTPd 3.0.2)

Missing Network Plugin

Warning: If you get the error /bin/sh: 1: /var/lib/shinken/libexec/check_netint.pl: not found, I believe this is because check_netint.pl simply isn't shipped with the pack and linux-snmp is outdated. As the listing below shows, the script is missing from libexec. I'll follow up with a solution in the future.

# list contents of libexec
ls -l /var/lib/shinken/libexec/

-rwxr-xr-x 1 shinken shinken 197343 Mar 19  2014 check_logfiles
-rwxr-xr-x 1 shinken shinken  22920 Mar 19  2014 check_snmp_load.pl
-rwxr-xr-x 1 shinken shinken   3502 Jun 27  2014 check_snmp_mem.pl
-rwxr-xr-x 1 shinken shinken  23930 Mar 19  2014 check_snmp_storage.pl
drwxr-xr-x 2 shinken shinken   4096 Jul 12 17:38 discovery
-rwxr-xr-x 1 shinken shinken   7051 Jul 12 17:38 dump_vmware_hosts.py
-rwxr-xr-x 1 shinken shinken   3498 Jul 12 17:38 external_mapping.py
-rwxr-xr-x 1 shinken shinken   3860 Jul 12 17:38 link_libvirt_host_vm.py
-rwxr-xr-x 1 shinken shinken   6265 Jul 12 17:38 link_vmware_host_vm.py
-rwxr-xr-x 1 shinken shinken   5464 Jul 12 17:38 link_xen_host_vm.py
-rwxr-xr-x 1 shinken shinken    506 Jun 27  2014 logFiles_linux.conf
-rwxr-xr-x 1 shinken shinken     85 Jul 12 17:38 notify_by_xmpp.ini
-rwxr-xr-x 1 shinken shinken   2686 Jul 12 17:38 notify_by_xmpp.py
-rwxr-xr-x 1 shinken shinken   2610 Jul 12 17:38 send_nsca.py
-rwxr-xr-x 1 shinken shinken  10337 Jul 12 17:38 service_dependency_mapping.py
lrwxrwxrwx 1 shinken shinken     32 Jul 12 20:48 utils.pm -> /usr/lib/nagios/plugins/utils.pm

Missing Log File

Warning: If you get the error "Log_File_Health - UNKNOWN - (1 unknown in logFiles_linux.protocol-2015-07-12-21-14-35) - could not find logfile /var/log/rhosts/remote-hosts.log", you need to create the missing log file.

sudo mkdir -p /var/log/rhosts/
sudo touch /var/log/rhosts/remote-hosts.log

Issues Connecting to the NTP Server

Warning: If you get the error TimeSync - CRITICAL ( CRITICAL NTP: No response from the NTP server), you need to install ntp on the client.

sudo apt-get install ntp

You will then get the following error: NTP CRITICAL: Offset unknown. This seems to be an ongoing issue. I'm still looking into it and don't have a resolution at this time.
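This is not a confirmed fix, but one thing that may help narrow it down is checking whether ntpd on the client has picked a time source yet. In the output of ntpq -p, the peer that ntpd is currently synchronized to is marked with a leading asterisk. The has_sync_peer helper below is a made-up name for this sketch.

```shell
# Hypothetical helper: succeed if ntpq peer output (read from stdin) contains
# a selected system peer, which ntpq marks with a leading '*'.
has_sync_peer() {
    grep -q '^\*'
}

if command -v ntpq >/dev/null 2>&1; then
    if ntpq -pn 2>/dev/null | has_sync_peer; then
        echo "ntpd has selected a sync peer"
    else
        echo "ntpd has not selected a sync peer yet"
    fi
else
    echo "ntpq is not installed"
fi
```

If no peer has been selected, it may be worth waiting a few minutes after installing ntp and checking again before digging any further.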

Other Possible Issues

If you see the following error: Warning: ERROR: Description table : No response from remote host "[hostname or ip]", manually check that you can actually connect to the host.

curl -v 127.0.0.1

* Rebuilt URL to: 127.0.0.1/
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
* connect to 127.0.0.1 port 80 failed: Connection refused
* Failed to connect to 127.0.0.1 port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 127.0.0.1 port 80: Connection refused

Make sure there aren't any firewall rules preventing the packets from going through. Find out how to configure your firewall by reading this article. You should then get to a point where you can reach either localhost or your remote server.

telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

At this stage, your dashboard won't be perfect, but you should be in a much better place than when you started.

That's it for now. I'll continue to look into some of these issues and expand upon my Shinken article series, so check back, once in a while.
