The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘monitoring’

Graphite Shows Metrics But No Data – Troubleshooting

Posted by Jakub Holý on May 5, 2014

My Graphite has all the metrics I expect but shows no data for them. Communication between my app and Graphite clearly works otherwise the metrics would not have appeared in the list but why is there no data?

Update: Graphite data gotchas that got me

(These gotchas explain why I did not see any data.)

  1. Graphite shows aggregated, not raw data if the selected query period (24h by default) is greater than the retention period of the highest precision. F.ex. with the schema “1s:30m,1m:1d,5m:2y” you will see data at the 1s precision only if you select period less than or equal to the past 30 minutes. With the default one, you will see the 1-minute aggregates. This applies both to the UI and whisper-fetch.py.
  2. Aggregation drops data unless by default at least 50% of available data slots have values (xFilesFactor=0.5). I.e. if your app sends data at a rate more than twice slower than Graphite expects them, they will never show up in aggregates. F.ex. with the schema “1s:30m,1m:1d,5m:2y”  you must sends data at least 30 times within a minute for them to show in an aggregate.

I suppose that whisper-dump.py would show the raw data.

Lesson learned: Always send data to Graphite in *exactly* same rate as its highest resolution

As described above, if you send data less frequently than twice the highest precision (if 1s => send at least every 2s), aggregation will ignore the data, with the default xFilesFactor=0.5 (a.k.a. min 50% of values reqired factor). On the other hand, if you send data more frequently than the highest precision, only the last data point received in each of the highest precision periods is recorded, others ignored – that’s why f.ex. statsD flush period must = Graphite period.

Read the rest of this entry »

Posted in Tools | Tagged: , | Comments Off

Most interesting links of December ’13

Posted by Jakub Holý on December 31, 2013

Recommended Readings

Society

  • HBR: Want to Build Resilience? Kill the Complexity – a highly interesting, thought provoking article relevant both to technology in particular and the society in general; f.ex.: more security features are bad for they make us behave less safely (risk compensation) and are more fragile w.r.t. unexpected events. “Complexity is a clear and present danger to both firms and the global financial system: it makes both much harder to manage, govern, audit, regulate and support effectively in times of crisis. [..] Combine complex, Robust-Yet-Fragile systems, risk-compensating human psyches, and risk-homeostatic organizational cultures, and you inevitably get catastrophes of all kinds: meltdowns, economic crises, and the like.” The solution to future financial crisis is primarily not more regulation but simplification of the system – to make it easier to police, tougher to game. We also need to decrease interconnectednes (of banks etc.), one of the primary sources of complexity. Also a great example of US Army combatting complex, high-risk situations by employing “devil’s advocates / professional skeptics” trained to help “avoid the perils of overconfidence, strategic brittleness, and groupthink. The goal is to respectfully help leaders in complex situations unearth untested assumptions, consider alternative interpretations and “think like the other”“.
  • The Dark Side of Technology – technologies provide great opportunities – but also risks we should be aware of – they create a world of mounting performance pressure for all of us (individuals, companies, states), accelerate the rate of change, increasing uncertanity (=> risk of Taleb’s black swans). “All of this mounting pressure has an understandable but very dangerous consequence. It draws out and intensifies certain cognitive biases [..]” – magnify our perception of risk, shrink our time horizons, foster a more and more reactive approach to the world, the “if you win, I will lose” view, erode our ability to trust anyone – and “combined effect of these cognitive biases increases the temptation to use these new digital infrastructures in a dysfunctional way: surveillance and control in all aspects of our economic, social and political life.” => “significantly increase[d] the likelihood of an economic, social and political backlash, driven by an unholy alliance between those who have power today and those who have achieved some modest degree of income and success.
    Complexity theory: the more connected a system is, the more vulnerable it becomes to cascades of disruptive information/action.
  • What Do Government Agencies Have Against 23andMe, Uber, and Airbnb? – innovative startups do not fit into established rules and thus bureaucrats do not know how to handle them and resort to their favourite weapon: saying no, i.e. enforcing rules that harm them (f.ex. France recently passed a law that requires Uber etc. drivers to wait 15 min before picking up a customer so that established taxi services have it easier; wot?!)
  • Nonviolent communication in action – wonderful stories about NVC being applied in difficult situations with a great success

Tech

  • D. Nolen: The Future of JavaScript MVC Frameworks – highly recommended thought food – about React.js, disadvantages of event-based UI, benefits of immutability, performance, the ClojureScript React wrapper Om  – “I simply don’t believe in event oriented MVC systems – the flame graph above says it all. [...] Hopefully this gives fans of the current crop of JS MVCs and even people who believe in just using plain JavaScript and jQuery some food for thought. I’ve shown that a compile to JavaScript language that uses slower data structures ends up faster than a reasonably fast competitor for rich user interfaces. To top it off Om TodoMVC with the same bells and whistles as everyone else weighs in at ~260 lines of code
  • Quora: Michael Wolfe’s answer to Engineering Management: Why are software development task estimations regularly off by a factor of 2-3? – a wonderful story explaining to a layman why estimation is hard, on the example of a hike from SF to LA
  • Style Guide for JavaScript/Node.js by Felix Geisendörfer, recommended by a respectable web dev; nothing groudn breaking I suppose but great start for a team’s standards
  • Johannes Brodwall: Why I stopped using Spring [IoC] – worth to read criticism of Spring by a respected and experienced architect and developer; summary – dependency injection is good bug “magical” frameworks decrease understandability and encourage unnecessarily complex code => smaller code, , easier to navigate and understand and easier to test
  • Misunderstanding technical debt – a brief discussion of the various forms of tech. debt (crappy code x misaligned design and problem domain x competence debt)
  • Tension and Flaws Before Health Website Crash – surprising lack of understanding and tensions between the government and contractors on HealthCare.gov – “a huge gap between the administration’s grand hopes and the practicalities of building a website that could function on opening day” – also terribly decision making, shifting requirements (what news!), management’s lack of decision power, CGI’s blame-shifting. A nice horror story. The former head knew that they should “greatly simplify the site’s functions” – but the current head wasn’t able to “talk them out of it”.
  • The Log: What every software engineer should know about real-time data’s unifying abstraction – logs are everywhere and especially important in distributed apps – DB logs, append-only logs, transaction logs – “You can’t fully understand databases, NoSQL stores, key value stores, replication, paxos, hadoop, version control, or almost any software system without understanding logs” – I have only read a small part but it looks useful
  • What I Wish I Knew When Learning Haskell tl;dr
  • Better Than Unit Tests – a good overview of testing approaches beyond unit tests – including “Automated Contract Testing” (ability to define a contract for a web service, use it to test it and to simulate it; see Internet of Strings for more info), Property-based Testing (test generic properties using random data/calls as with Quickcheck), Fault Injection (run on multiple VMs, simulate network failures), Simulation Testing as with Simulant.
  • Use #NoEstimates to create options and deliver value reliably – a brief post with an example of an estimation-based vs. no-estimates project (i.e. more focus on delivering early, discovery)
  • How Google Sold Its Engineers on Management – managers may be useful after all :-); a report about Google’s research into management and subsequent (sometimes radical) improvements in management style/skills and people satisfaction; I love that Google hasn’t HR but “people ops”
  • Roy Osherove: Technical Disobedience – take nothing for granted, don’t let the system/process stop you, be creative about finding ways to improve your team’s productivity; there always is a way. Nice examples.
  • Uncle Bob: Extreme Programming, a Reflection – a reflection on changes in the past ~ 14 years since XP that have seen many of the “extreme” practices becoming mainstream
  • The Anti-Meeting Manifesto – essentially a checklist and tips for limitting meetings to minimum

Other

Talks

  • Pete Hunt: React: Rethinking best practices (JSConf 2013, 30 min) – one of the most interesting talks about frontend development, design, and performance I have heard this year, highly recommended. Facebook’s React JavaScript framework  is a fresh and innovative challenger in the MVC field. It is worthwile to learn why they parted ways with the popular approach of templates (spoiler: concern separation, cohesion x coupling, performance). Their approach with virtual DOM enables some cool things (run in Node, provide HTML5-like events in any browser with consistent behavior, …). Key: templates are actually tightly coupled to display logic (controllers) via the model view tailored for them (i.e. Controller must know what specific data & in what form View needs) => follow cohesion and keep them together componets, separate from other components and back-end code. Also, state changing over time at many places is hard => re-render the whole app rather than in-place updates. Also, the ClojureScript Om wrapper enables even more performance optimizations thanks to immutable data structures etc.
  • David Pollak: Some musings on Scala and Clojure by a long time Scala dude (46 min) – a subjective but balanced comparison of Scala and Clojure and their strengths/weaknesses by the author of the Scala Lift framework (doing Scala since 2006, Clojure since 2013)

Clojure Corner

Tools/Libs

  • Apache Sirona – a new monitoring tool in the Apache incubator – “a simple but extensible monitoring solution for Java applications” with support for HTTP, JDBC, JAX-RS, CDI, ehcache, with data published e.g. to Graphite or Square Cube. It is still very new.
  • GenieJS – Ctrl-Space to popup a command-prompt for your web page, inspired by Alfred (type ‘ to see all possible commands)

Favourite Quotes

A good #agile team considers their backlog inaccurate. It is merely a list of assumptions that must be tested & refined by shipping product
- @mick maguire 12/10

Ada Lovelace (1st program), Grace Hopper (1st compiler), Adele Goldberg (1st OO language), why would anyone think women aren’t in computing?
- @Dan North 12/11

There will always be a shortage of talented, self-motivated creative professionals who will unquestioningly follow orders.
- @Thomas K Nilsson 12/7

Estimation paradox = If something unpredictable happens, predict how long it will take to fix it
- me 12/7

IT systems can be inspired by AK-47 a.k.a. Kalashnikov. The rifle was purposefully designed to be simple and to be tolerant to imperfections in most parts; as a result, it required essentially no maintenance and was extremely reliable.
- summarized from Roman Pichlík’s Odkaz Michaila Kalašnikova softwarovému vývoji

Posted in General, Languages, Testing, Top links of month | Tagged: , , , , , , , , , , , , , | Comments Off

Enabling JMX Monitoring for Hadoop And Hive

Posted by Jakub Holý on September 21, 2012

Hadoop’s NameNode and JobTracker expose interesting metrics and statistics over the JMX. Hive seems not to expose anything intersting but it still might be useful to monitor its JVM or do simpler profiling/sampling on it. Let’s see how to enable JMX and how to access it securely, over SSH.

Read the rest of this entry »

Posted in Tools | Tagged: , , , , | 2 Comments »

VisualVM: Monitoring Remote JVM Over SSH (JMX Or Not)

Posted by Jakub Holý on September 21, 2012

(Disclaimer: Based on personal experience and little research, the information might be incomplete.)

VisualVM is a great tool for monitoring JVM (5.0+) regarding memory usage, threads, GC, MBeans etc. Let’s see how to use it over SSH to monitor (or even profile, using its sampler) a remote JVM either with JMX or without it.

This post is based on Sun JVM 1.6 running on Ubuntu 10 and VisualVM 1.3.3.

Read the rest of this entry »

Posted in General, Languages, Tools | Tagged: , , , | 1 Comment »

Zabbix: Fixing Active Checks to Work With Zabbix Proxy

Posted by Jakub Holý on August 8, 2012

We’ve recently changed our Zabbix 1.8.1 setup to include Zabbix Proxy, which broke all our active checks (f.ex. monitoring of log files). The solution seems to be having the proxy first, before the Zabbix Server, in the Zabbix Agent’s config parameter Server, i.e. “Server=<proxy ip>,<server ip>”.

Read the rest of this entry »

Posted in General, Tools | Tagged: , , | Comments Off

Notify on Errors in a Log File with Zabbix 1.8

Posted by Jakub Holý on July 3, 2012

Situation: You want to get notified when a log entry marked ERROR appears in a log file. You want the corresponding trigger to reset back to the OK state if there are no more errors for 10 minutes. (This post assumes certain familiarity with Zabbix UI.)

Read the rest of this entry »

Posted in Tools | Tagged: , , | Comments Off

Testing Zabbix Trigger Expressions

Posted by Jakub Holý on July 2, 2012

When defining a Zabbix (1.8.2) trigger e.g. to inform you that there are errors in a log file, how do you verify that it is correct? As somebody recommended in a forum, you can use a Calculated Item with a similar expression (the syntax is little different from triggers). Contrary to triggers, the value of a calculated item is easy to see and the historical values are stored so you can check how it evolved. If your trigger expression is complex the you can create multiple calculated items, one for each subexpression.

Read the rest of this entry »

Posted in Testing, Tools | Tagged: , , | Comments Off

Most interesting links of October

Posted by Jakub Holý on October 31, 2011

Recommended Readings

  • Steve Yegge’s Execution in the Kingdom of Nouns – I guess you’ve already read this one but if not – it is a well-written and amusing post about why not having functions as first class citizens in Java causes developers to suffer. Highly recommended.
  • Reply to Comparing Java Web Frameworks – a very nice and objective response to a recent blog summarizing a JavaOne presentation about the “top 4″ web frameworks. The author argues that based on number of resources such as job trends, StackOverflow questions etc. (however data from each of them on its own is biased in a way) JSF is a very popular framework – and rightly so for even though JSF 1 sucked, JSF 2 is really good (and still improving). Interesting links too (such as What’s new in JSF 2.2?). Corresponds to my belief that GWT and JSF are some of the best frameworks available.
  • Using @Nullable – use javax.annotation.Nullable with Guava’s checkNotNull to fail fast when an unexpected null appeares in method arguments
  • JavaOne 2011: Migrating Spring Applications to Java EE 6 (slides) – nice (and visually attractive) comparison of JavaEE and Spring and proposal of a migration path. It’s fun and worthy to see.
  • xUnitPatterns – one of the elementary sources that anybody interested in testing should read through. Not only it explains all the basic concepts (mocks, stubs, fakes,…) but also many pitfalls to avoid (various test smells such as fragile tests due to Data Sensitivity, Behavior Sensitivity, Overspecified Software [due to mocks] etc.), various strategies (such as for fixture setup), and general testing principles. The materials on the site were turned into the book xUnit Test Patterns: Refactoring Test Code (2007), which is more up-to-date and thus a better source.
  • Eclipse tip: Automatically insert at correct position: Semicolon, Braces – in “while(|)” type “true {” to get “while(true) {|” i.e. the ‘{‘ is moved to the end where it belongs, the same works for ‘;’
  • Google Test Analytics – Now in Open Source – introduces Google’s Attributes-Components-Capabilities (ACC) application intended to replace laborous and write&forget test plans with something much more usable and quicker to set up, it’s both a methodology for determining what needs to be tested and a tool for doing so and tracking the progress and high-risk areas (based not just on estimates but also actual data such as test coverage and bug count). The article is a good and brief introduction, you may also want to check a live hosted version and a little more detailed explanation on the project’s wiki.
  • JSF and Facelets: build-time vs. render-time (component) tags (2007) – avoid mixing them incorrectly
  • StackOverflow: What are the main disadvantages of Java Server Faces 2.0? Answer: The negative image of JSF comes from 1.x, JSF 2 is very good (and 2.2 is expected to be just perfect :-)). Nice summary and JSF history review.
  • Ola Bini: JavaScript in the small – best practices for projects using partly JavaScript – the module pattern (code in the body of an immediately executed function not to polute the global var namespace), handling module dependencies with st. like RequireJS, keeping JS out of HTML, functions generating functions for more readable code, use of many anonymous functions e.g. as a kind of named parameters, testing, open questions.

Talks

  • Kent Beck’s JavaZone talk Software G Forces: The Effects of Acceleration is absolutely worth the 1h time. Kent describes how the development process, practices and partly the whole organization have to change as you go from annual to monthly to weekly, daily, hourly deployments. What is a best practice for one of these speeds becomes an impediment for another one – so know where you are. You can get an older version of the slides and there is also a detailed summary of the talk from another event.
  • Rich Hickey: Simple Made Easy - Rich, the author of Clojure, argues very well that we should primarily care for our tools, constructs and artifacts to be “simple”, i.e. with minimal complexity, rather than “easy” i.e. not far from our current understanding and skill set. Simple means minimal interleaving – one concept, one task, one role, minimal mixing of who, what, how, when, where, why. While easy tools may make us start faster, only simplicity will make it possible to keep going fast because (growing) comlexity is the main cause of slowness.  And simplicity is a choice – we can create the same programs we do today with the tools of complexity with drastically simpler tools. Rich of course explains what, according to him, are complex tools and their simple(r) alternatives – see below. The start of the 1h talk is little slow but it is worth the time. I agree with him that we should much more thing about the simplicity/complexity of the things we use and create rather than easiness (think ORM).
    Read also Uncle Bob’s affirmative reaction (“All too often we do what’s easy, at the expense of what’s simple. And so we make a mess. [...] doing what is simple as opposed to what is easy is one of the defining characteristics of a software craftsman.”).

Random Notes from Rich’s Simple Made Easy Talk:

There are also better notes by Alex Baranosky and you may want to check a follow-up discussion with some Rich’s answers.

The complex vs. simple toolkit (around 0:31):

COMPLEXITY                             SIMPLICITY
State, objects                            Values
Methods                                   Functions, namespaces
vars                                          Managed refs
Inheritance, switch, matching  Polymorphism a la carte
Syntax                                      Data
Imperative loops, fold              Set functions
Actors                                      Queues
ORM                                         Declarative data manipulation
Conditionals                             Rules
Inconsistency                            Consistency

What each of the complexity constructs mixes (complects) together

CONSTRUCT                        COMPLECTS (MIXES)
State, objects – everything that touches it (for state complects time and value)
Methods – function and state, namespaces (2 classes, same m. name)
Syntax – Meaning, order
Inheritance – Types (ancestors, child)
Switch/matching – Multiple who/what pairs (1.decide who, 2.do what ?)
var(iable)s – Value, time
Imperative loops, fold – what/how (fold – order)
Actors – what/who
ORM – OMG :-)
Conditionals – Why, rest of program (rules what program does are intertw. with the structure and order of the program, distributed all over it)

HE SIMPLICITY TOOLKIT (around 0:44)
CONSTRUCT            GET IT IVA…
Values – Final, persistent collections
Functions – a.k.a. stateless methods
Namespaces – Language support
Data – Maps, arrays, sets, XML, JSON etc.
Polymorphism a la carte – Protocols, Haskell type classes
Managed refs – Clojure/Haskell refs (compose time and value , not mix)
Set functions – Libraries
Queues – Libraries
Declarative data manipulation – SQL/LINQ/Datalog
Rules – Libraries, Prolog
Consistency – Transactions, values

True abstraction isn’t hiding complexity but drawing things away – along one of the dimensions of who, what, when, where, why [policy&rules of the app.], how.
Abstraction => there are things I don’t need – and don’t want – to know.
Why – do explore rules and declarative logic systems.
When, where – when obj. A communicates with obj. B. => put a queue in between them so that A doesn’t need to know where B is; you should use Qs extensively.

Links to Keep

  • Incredibly Useful CSS Snippets -  “a list of CSS snippets that will help you minimize headaches, frustration and save your time while writing css” – few float resets, targetting specific browsers & browser hacks, cross-rowser transparency/min height/drop shadow, Google Font API, link styled by file type,

DevOps: Tools and libraries for system monitoring and (time series) data plotting

  • Hyperic SIGAR API – open-source library that unifies collection of system-related metrics such as memory, CPU load, processes, file system metrics across most common operating systems
  • rrd4j – Java clone of the famous RRDTool, which stores, aggregates and plots time-series data (RRD = round-robin database, i.e. keeps only a given number of samples and thus has a fixed size)
  • JRDS “is performance collector, much like cacti or munins”, uses rrd4j. The documentation could be better and it seems to be just a one man project but it might be interesting to look at it.

Clojure Corner

  • Alex Miller: Real world Clojure – a summary of experiences with using Clojure in enterprise data integration and analytics products at Revelytix, since early 2011 with a team of 5-10 devs. Some observations: Clojure code is 1-2 order of magnitude smaller than Java. It might take more time to learn than Java but not much. Clojure tooling is acceptable, Emacs is still the best. Debugging tools are unsurprisingly quite inferior to those for Java. Java profiling tools work but it may be hard to interpret the results. “[..]  I’ve come to appreciate the data-centric approach to building software.” Performance has been generally good so far.
  • Article series Real World Clojure at World Singles – the series focuses on various aspects of using Clojure and how it was used to solve particular problems at a large dating site that starting to migrate to it in 2010. Very interesting. F. ex. XML generation, multi-environment configuration, tooling (“If Eclipse is your drug of choice, CCW [Counter ClockWise] will be a good way to work with Clojure.”, “Clojure tooling is still pretty young [..]  - but given how much simpler Clojure is than most languages, you may not miss various features as much as you might expect!”)
  • StackOverflow: Comparing Clojure books – Programming Clojure, Clojure in Action, The Joy of Clojure, Practical Clojure – which one to pick? A pretty good comparison.
  • Clojure is a Get Stuff Done Language – experience report – “For all that people think of Clojure as a “hard” “propeller-head” language, it’s actually designed right from the start not for intellectual purity, but developer productivity.”

Posted in eclipse, General, j2ee, Languages, Testing, Top links of month | Tagged: , , , , , , , , , , | 3 Comments »

Intro: Java Webapp Monitoring with Hyperic HQ + How to Alert on Too Many Errors in Logs

Posted by Jakub Holý on October 17, 2011

This post describes how to set up the Java-based open source monitoring tool Hyperic HQ to monitor application server error logs and send a single warning e-mail when there are more of them than a threshold. In the previous post Aggregating Error Logs to Send a Warning Email When Too Many of Them – Log4j, Stat4j, SMTPAppender we’ve seen how to achieve that programatically while this solution is just about configuration. We will also see a little what else (a lot!) Hyperic can do for you and what the impressions after a short experimentation with it are. Read the rest of this entry »

Posted in j2ee, Tools | Tagged: , , , | Comments Off

Aggregating Error Logs to Send a Warning Email When Too Many of Them – Log4j, Stat4j, SMTPAppender

Posted by Jakub Holý on October 15, 2011

Our development team wanted to get notified as soon as something goes wrong in our production system, a critical Java web application serving thousands of customers daily. The idea was to let it send us an email when there are too many errors, indicating usually a problem with a database, an external web service, or something really bad with the application itself. In this post I want to present a simple solution we have implemented using a custom Log4J Appender based on Stats4j and an SMTPAppender (which is more difficult to configure and troubleshoot than you might expect) and in the following post I explore how to achieve the same effect with the open-source Hyperic HQ monitoring SW.

Read the rest of this entry »

Posted in j2ee, Tools | Tagged: , , , | 1 Comment »