How good monitoring saved our ass … again

You know how it goes – suddenly people complain your app does not work, you are getting plenty of timeouts or other errors in your error tracking tool, you find the backend app that is misbehaving and finally “fix” the problem by restarting it. Phew! But why? What caused the downtime? A glitch an an…


AWS CloudWatch Alarms Too Noisy Due To Ignoring Missing Data in Averages

I want to know when our app starts getting slower, so I set up an alarm on the Latency metric of our ELB. According to the AWS Console, “This alarm will trigger when the blue line [average latency over the period of 15 min] goes above the red line [2 sec] for a duration of…
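To illustrate the problem named in the title, here is a minimal sketch. It is not CloudWatch's API or its actual evaluation logic; the function names and latency numbers are hypothetical. It assumes only that the average is computed over the datapoints that exist, so a single slow request in an otherwise quiet period can push the average above the 2 sec threshold.

```python
from statistics import mean

THRESHOLD_SEC = 2.0  # the "red line" from the excerpt above


def average_ignoring_missing(samples):
    """Average over present datapoints only; None marks a period with no data."""
    present = [s for s in samples if s is not None]
    return mean(present) if present else None


def breaches(samples, threshold=THRESHOLD_SEC):
    """True if the average of the present datapoints exceeds the threshold."""
    avg = average_ignoring_missing(samples)
    return avg is not None and avg > threshold


# Hypothetical latencies in seconds.
busy = [0.2, 0.3, 5.0, 0.25, 0.2]       # one slow request among many fast ones
quiet = [None, None, 5.0, None, None]   # the same slow request, no other traffic

print(breaches(busy))   # False: the fast requests pull the average down to 1.19
print(breaches(quiet))  # True: the lone slow request is the whole average
```

In the quiet case the alarm condition is met even though the app is no slower than in the busy case, which is one way such an alarm can become noisy.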

Graphite Shows Metrics But No Data – Troubleshooting

My Graphite has all the metrics I expect but shows no data for them. Communication between my app and Graphite clearly works, otherwise the metrics would not have appeared in the list, but why is there no data? Update: Graphite data gotchas that got me (These gotchas explain why I did not see any data.)

Most interesting links of December ’13

Recommended Readings: Society. HBR: Want to Build Resilience? Kill the Complexity – a highly interesting, thought-provoking article relevant both to technology in particular and to society in general; e.g. more security features are bad, for they make us behave less safely (risk compensation) and are more fragile w.r.t. unexpected events. “Complexity is a clear…

Enabling JMX Monitoring for Hadoop And Hive

Hadoop’s NameNode and JobTracker expose interesting metrics and statistics over JMX. Hive seems not to expose anything interesting, but it still might be useful to monitor its JVM or do simpler profiling/sampling on it. Let’s see how to enable JMX and how to access it securely, over SSH.

VisualVM: Monitoring Remote JVM Over SSH (JMX Or Not)

(Disclaimer: Based on personal experience and a little research, the information might be incomplete.) VisualVM is a great tool for monitoring a JVM (5.0+) regarding memory usage, threads, GC, MBeans etc. Let’s see how to use it over SSH to monitor (or even profile, using its sampler) a remote JVM either with JMX or without it. This…

Testing Zabbix Trigger Expressions

When defining a Zabbix (1.8.2) trigger, e.g. to inform you that there are errors in a log file, how do you verify that it is correct? As somebody recommended in a forum, you can use a Calculated Item with a similar expression (the syntax is a little different from triggers). Contrary to triggers, the value of…