I monitor my infrastructure using Nagios running on multiple Linux servers. I'm switching over to using Pulseway.
There are a lot of things that Nagios does that Pulseway does not, so I'm looking to implement some of the important things using plugins.
For starters, I'm planning on checking HTTP (response time, response size, and whether the response contains key information). I want to expand to other network service checks (DNS, SMTP, IMAP) to ensure my infrastructure is always available.
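To make the HTTP check concrete, here is a rough sketch of the three criteria (speed, size, content). It is written in Python purely for illustration and is not Pulseway plugin code; the names check_http, evaluate_response and HttpResult, and the default thresholds, are all made up for this example.

```python
import time
import urllib.request
from dataclasses import dataclass, field


@dataclass
class HttpResult:
    ok: bool
    problems: list = field(default_factory=list)


def evaluate_response(elapsed_s, body, max_seconds, min_bytes, must_contain):
    """Apply the three criteria to a response that was already fetched."""
    problems = []
    if elapsed_s > max_seconds:
        problems.append(f"slow: {elapsed_s:.2f}s > {max_seconds}s")
    if len(body) < min_bytes:
        problems.append(f"short: {len(body)} bytes < {min_bytes}")
    if must_contain not in body:
        problems.append("missing expected content")
    return HttpResult(ok=not problems, problems=problems)


def check_http(url, max_seconds=2.0, min_bytes=100, must_contain=b"", timeout=30):
    """Fetch url and evaluate it; thresholds here are illustrative defaults."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
    except OSError as exc:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return HttpResult(ok=False, problems=[f"request failed: {exc}"])
    elapsed = time.monotonic() - start
    return evaluate_response(elapsed, body, max_seconds, min_bytes, must_contain)
```

Keeping the evaluation separate from the fetch means the pass/fail rules can be exercised without touching the network, and the same rules could be reused for other protocols later.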
I am going to have multiple host and service checks run through the PluginDataCheck method of the ClientPlugin.
I'm concerned that I might need to run these checks concurrently so that they don't block the PluginDataCheck method and make it take a long time. They are network service checks, so if there are problems, some could take as long as their timeout (around 30 seconds). Is this a valid concern? Is there a time limit within which PluginDataCheck must return before it times out?
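The concurrent approach I have in mind looks roughly like the sketch below: start every check on a worker thread, wait up to one overall deadline, and report anything still running as timed out, so the total wait is bounded by the deadline rather than the sum of the individual timeouts. Again this is Python for illustration only, not Pulseway plugin code; run_checks, deadline_s and max_workers are hypothetical names, and the 30-second default is just the worst-case timeout mentioned above, not a documented PluginDataCheck limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_checks(checks, deadline_s=30.0, max_workers=8):
    """Run named zero-argument check callables in parallel.

    Returns {name: result}. A check that raises is reported as an
    "error: ..." string; one still running at the deadline is "timed out".
    """
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = {pool.submit(fn): name for name, fn in checks.items()}
    done, not_done = wait(list(futures), timeout=deadline_s)

    results = {}
    for fut in done:
        name = futures[fut]
        try:
            results[name] = fut.result()
        except Exception as exc:
            results[name] = f"error: {exc}"
    for fut in not_done:
        results[futures[fut]] = "timed out"

    # Do not block on stragglers; queued-but-unstarted work is cancelled.
    # cancel_futures requires Python 3.9 or newer.
    pool.shutdown(wait=False, cancel_futures=True)
    return results


if __name__ == "__main__":
    # Stand-in checks; real ones would be HTTP, DNS, SMTP or IMAP probes.
    demo = {
        "fast": lambda: "ok",
        "slow": lambda: time.sleep(2),
    }
    print(run_checks(demo, deadline_s=0.5))
```

One caveat with this design: a thread that has already started cannot be interrupted, so each check should still carry its own socket timeout so that stragglers eventually exit.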
My next concern is organization and notifications. If multiple checks fail, I'd like to get a separate notification for each one. Notifications are currently limited to one per plugin.
I was thinking that I could use the Cloud API to create a service for every check that I run. That way each one could send its own notifications, and I could also organize the checks into groups instead of having them tied to the system that's doing the monitoring. Would that be an abuse of the Cloud API?