The Holy Java

Building the right thing, building it right, fast

Archive for November, 2018

Java: Simulating various connection problems with Toxiproxy

Posted by Jakub Holý on November 26, 2018


Posted in Languages

Clojure – comparison of gnuplot, Incanter, oz/vega-lite for plotting usage data

Posted by Jakub Holý on November 4, 2018

What is the best way to plot memory and CPU usage data (mainly) in Clojure? I will compare gnuplot, Incanter with JFreeChart, and vega-lite (via Oz). (Spoiler: I like Oz/vega-lite most but still use Incanter to prepare the data.)

The data looks like this:

;; sec.ns | memory | CPU %
1541052937.882172509 59m 0.0
1541052981.122419892 78m 58.0
1541052981.625876498 199m 85.9
1541053011.489811184 1.2g 101.8

The data has been produced by monitor-usage.sh.

The tools

Gnuplot 5

Gnuplot is the simplest, with a lot available out of the box. But it is also somewhat archaic and not very flexible.


Posted in Languages, [Dev]Ops

How I got fired and learned the importance of communication and play time

Posted by Jakub Holý on November 4, 2018

When I came to the office one late autumn morning in 2005, I was shocked to find out that – without any warning signs whatsoever – I had been fired. That day I learned the importance of communication. Their criticism was justified, but the thing is, nobody had bothered to tell me anything during my 11 months at the company. I received exactly zero feedback about my behaviour or work. The company ended up in court with its client – which both explains why they were stressed and was itself caused by bad communication. So communication – even, or especially, under stress – is really important. It must be open, transparent, and broad.

The funny thing is that I still do the things they fired me for.


Posted in General

How good monitoring saved our ass … again

Posted by Jakub Holý on November 1, 2018

You know how it goes – suddenly people complain your app does not work, you are getting plenty of timeouts or other errors in your error tracking tool, you find the backend app that is misbehaving and finally “fix” the problem by restarting it. Phew!

But why? What caused the downtime? A glitch in an upstream system? Sudden overload due to a spike in concurrent users? Trolls?

You know that it helps sometimes to zoom out, to get the right perspective. Here the perspective was 7 days:

It was enough to look at this chart with the right zoom to see at once that something happened on October 23rd that caused a significant change in the behavior of the application. A quick search showed that the change in CPU usage indeed coincided with a deployment, and reverting to the previous version soon confirmed the culprit. (It would have been even easier if we showed deployments on these charts.)

This is not the first time good monitoring saved us. A while ago the application kept becoming sluggish and we had to restart it regularly. A graph of the Node.js event loop lag showed it increasing over time. Once it was on the same dashboard as Node’s heap usage, we could see at once that it correlated with increasing memory usage – indicating a memory leak. A few hours of experimenting and heap dump analysis later, the problem was fixed.
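To make the event loop lag and heap usage idea concrete, here is a minimal sketch of how such samples could be collected in Node.js. It is not the setup we actually used: the names `computeLagMs`, `createLagMonitor`, and `sample` are made up for illustration, and the approach (measuring how late a repeating timer fires) is just one common way to approximate lag.

```javascript
const { performance } = require('perf_hooks');

// Lag is how much later than expected a timer fired (never negative).
function computeLagMs(expectedIntervalMs, actualIntervalMs) {
  return Math.max(0, actualIntervalMs - expectedIntervalMs);
}

// Hypothetical helper: tracks the worst lag seen since the last sample.
function createLagMonitor(intervalMs = 100) {
  let last = performance.now();
  let maxLagMs = 0;

  const timer = setInterval(() => {
    const now = performance.now();
    maxLagMs = Math.max(maxLagMs, computeLagMs(intervalMs, now - last));
    last = now;
  }, intervalMs);
  // Do not keep the process alive just because of the monitor.
  timer.unref();

  return {
    // Returns the worst lag together with the current heap usage, so both
    // can be sent to the same dashboard, then resets the lag counter.
    sample() {
      const result = {
        maxLagMs,
        heapUsedMb: process.memoryUsage().heapUsed / (1024 * 1024),
      };
      maxLagMs = 0;
      return result;
    },
    stop() {
      clearInterval(timer);
    },
  };
}
```

You would call `sample()` periodically (say every 10 seconds) and ship the two numbers to whatever metrics system you use. Plotting them side by side is what makes a correlation like the one above visible.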

So good monitoring is paramount.

Of course the trick is to know what to monitor and to display all relevant metrics in such a way that you can spot important relations. I am still working on improving that…

Posted in [Dev]Ops

Beware the performance cost of async_hooks (Node 8)

Posted by Jakub Holý on November 1, 2018

I was excited about async_hooks having finally landed in Node.js 8, as it would enable me to share important troubleshooting information with all code involved in handling a particular request. However, it turned out to have a terrible impact on our CPU usage (YMMV):

This was quite extreme and is likely related to the way our application works and uses Promises. Do your own testing to measure the actual impact in your app.
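If you want a quick first impression before running a proper load test, a rough micro-benchmark along these lines may help. It is only a sketch under stated assumptions: the workload (`promiseWorkload`, a loop of awaited Promises) and the helpers `timeWorkloadMs` and `compare` are hypothetical and say little about a real application. The hook callbacks are intentionally empty, so any slowdown comes from the bookkeeping Node does once a hook is enabled.

```javascript
const async_hooks = require('async_hooks');

// A "no-op" hook: every callback is empty.
const noopHook = async_hooks.createHook({
  init() {},
  before() {},
  after() {},
  destroy() {},
  promiseResolve() {},
});

// Hypothetical Promise-heavy workload; replace it with something that
// resembles what your application really does.
async function promiseWorkload(iterations) {
  let sum = 0;
  for (let i = 0; i < iterations; i++) {
    sum += await Promise.resolve(i);
  }
  return sum;
}

// Runs the workload once and returns the elapsed time in milliseconds.
async function timeWorkloadMs(iterations) {
  const start = process.hrtime();
  await promiseWorkload(iterations);
  const [seconds, nanoseconds] = process.hrtime(start);
  return seconds * 1e3 + nanoseconds / 1e6;
}

// Times the workload without the hook, then with the hook enabled.
async function compare(iterations) {
  const baselineMs = await timeWorkloadMs(iterations);
  noopHook.enable();
  try {
    const hookedMs = await timeWorkloadMs(iterations);
    return { baselineMs, hookedMs };
  } finally {
    noopHook.disable();
  }
}
```

Running something like `compare(1e6).then(console.log)` a few times gives a rough idea of the overhead on your Node version. Treat the numbers as a hint only; JIT warm-up and garbage collection can easily skew a single run.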

However, I am not the only one who has seen a performance hit from async_hooks – see https://github.com/bmeurer/async-hooks-performance-impact, in particular:

Here the results of running the Promise micro benchmarks with and without async_hooks enabled:

Benchmark                        Node 8.9.4   Node 9.4.0
Bluebird-doxbee (regular)        226 ms       189 ms
Bluebird-doxbee (init hook)      383 ms       341 ms
Bluebird-doxbee (all hooks)      440 ms       411 ms
Bluebird-parallel (regular)      924 ms       696 ms
Bluebird-parallel (init hook)    1380 ms      1050 ms
Bluebird-parallel (all hooks)    1488 ms      1220 ms
Wikipedia (regular)              993 ms       804 ms
Wikipedia (init hook)            2025 ms      1893 ms
Wikipedia (all hooks)            2109 ms      2124 ms

To confirm the impact of async_hooks on our app, I performed three performance tests:

CPU usage without async_hooks (Node 8)

It is difficult to see but the mean CPU usage is perhaps around 60% here.

CPU usage with “no-op” async_hooks (Node 8)

Here the CPU jumped to 100%.

CPU usage with “no-op” async_hooks (Node 11)

The same as above, but using Node 11 for comparison. I recorded it for just a few minutes but the CPU usage is still around 100%:

The code

This is the relevant code:

Posted in Languages