The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘cloud’

Most interesting links of May ’14

Posted by Jakub Holý on May 31, 2014

Recommended Readings

  • Monolith – from The Codeless Code – fables and koans for the SW engineer – the Monad monolth #Haskell #fun
  • http2 explained (pdf, 27 pages) – cons of http 1 (huge spec / no full impl., wasteful use of TCP <=> latency [x spriting, inlining, concatenation, sharding]) => make it less latency sensitive, fix pipelining (issue a req before previous one finished), stop the need for ever increasing # connections, remove/reduce optional parts of http. Http2 is binary; multiple “streams” over 1 connection => much less conns, faster data delivery; header/data compression; [predictive] resource pushing; . Inspired by SPDY. Chrome and Mozilla will only support it over TLS, yay! (see also Is TLS Fast Yet? [yes, it is]) Promise: faster, more responsive web pages & deprecation of http/1 workarounds => simplified web dev.

Special

  • exercism.io – crowd-sourced good code mentorship – get an exercise, implement it in any of the supported language(s), submit and get feedback, repeat; when finished, you too can comment the same excercise submitted by others while working on your next assignment. Languages include Clojure, JS, Scala, Python, Haskell, Go, Elixir, Java, and more.

Podcasts (FP & related)

  • Cognicast (also @ iTunes) – Clojure, FP, etc.
  • Functional Geekery (@ iTunes) – A podcast on Functional Programming, covering topics across multiple languages.
  • Mostly λazy…a Clojure podcast by Chas Emerick
  • Giant Robots Smashing into other Giant Robots – “a weekly technical podcast discussing development, design, and the business of software development”
  • Software Engineering Radio (@ iTunes) – “The goal is to be a lasting educational resource, not a newscast. Every two to four weeks, a new episode is published that covers all topics software engineering. Episodes are either tutorials on a specific topic, or an interview with a well-known expert from the software engineering world.”
  • EngineerVsDesigner – design insight (@ iTunes) – product design podcast – the latest digital design news, tips & tricks, Q&A, and an industry special guest

Other

Clojure Corner

Tools/Libs

  • ownCloud – your own Dropbox/Google Drive, run on your server – sharing files between devices / PCs / web, syncing calendar and contacts, collaborative editing of documents (ODF)
  • Mailpile – “A modern, fast web-mail client with user-friendly encryption and privacy features.”, to be self-hosted on a PC, RaspberryPI, USB stick
  • Blackhole – role-based ssh proxy – an app that enables you to manage what users can ssh to what server as a particular user, from users’ point of view this is a ssh proxy; useful if many people need access to many servers but you do not want to add them all as users on those servers.
  • Wuala – Secure Cloud Storage – Backup. Sync. Share. Access Everywhere. – Dropbox alternative, secure by default
  • fb-flo – Facebook’s live-coding tool
  • owncloud.org – self-hosted Dropbox-like service with calendar and contact sync and more

Favourite Quotes

Posted in General, Languages, Tools, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Most interesting links of Mars ’13

Posted by Jakub Holý on March 31, 2013

Recommended Readings

A lot of stuff this month since I have finally got time to review some older articles. Quite a few articles by Fowler. Few really great (yet short) talks on agile & SW development.

Top

  • Agile in a Nutshell (originally Agile Product Ownership in a Nutshell) by Henrik Kniberg – the best explanation of the agile development process ever, in just 15 minutes and with wonderful animation; every developer should see this. Some highlights: the most important task of product owner is to say NO so that backlog doesn’t grow infinitely; at start, the estimates of size and value will suck and that’s OK because the value is in the conversation, not in the numbers (that are anyway just relative); the goal is to maximize outcome (value), not output (# features). Compromises between short-term vs. long-term goals, knowledge vs. customer value building etc. Build the right thing (PO) x build it right (devs) x build it fast (SM). Technical debt x sustainable pace. As I said – you MUST see it.
  • Martin Fowler: The Value of Software Design (talk, 22 min, from 0:45:00 til 1:07; Feb 2013) – a balanced argument for the value of good software design and internal code quality based on paying off by enabling us to keep our development speed. Discusses the DesignStaminaHypothesis (bad design => rapid decline of development speed), TechnicalDebt, TechnicalDebtQuadrant (Prudent x Reckless, Deliberate x Inadvertent), TradableQualityHypothesis. According to the experience of Fowler and others, the good design payoff point “it’s weeks, not months.”
  • What Does It Take To Become A Grandmaster Developer? – great post about cognition and learning, valuable references, quotes from an interesting study of good vs. mediocre developers. We have mental capacity for ~7 chunks of information => great performers recognize patterns and see and understand thus higher-level chunks and have many “chunks” (patterns encountered previously) readily available. You need deliberate effort to learn more chunks – especially initially but you must always try to get out of your comfort zone to grow. Experienced collegues can help a lot in acending the learning curve.

Agile, organization, innovation, project management

  • How to Prioritize a User Story Map – we all know that we should prioritize features by their value, risk, and lack of knowledge and that we should slice the features thin so that they fit into short iteration and can be deployed soon to produce feedback, right? Here we see a nice example of what happens if not done so and how to do feature slicing better.
  • Bob Marshall: Rightshifting – according to the author, 80% of knowledge work organizations are very ineffective, wasting resources on non-value-adding activites; only few are effective, even fewer highly effective. Rightshifting is the attempt at shiting them to the right, towards higher effectiveness. Links to a few videos explaining it more. Related: Steve McConnell’s Business Case for Better Software Practices, referring to a study by SEI; “The actual distribution of software effectiveness is asymmetric. Most organizations perform much closer to the worst practice than to the best.” – the best performing 10 times better then the worst/average (productivity, speed, defects, value)
  • On Antifragility in Systems and Organizational Architecture – introduces the concept of antifragility, based on Nassim Taleb’s book Antifragile that compares fragile, robust, and antifragile systems and organizational structures (which is also applicable to SW systems); robust = resists change (unless too large); antifragile: learn, adapt; closely related to DevOps and continous delivery
  • M. Fowler: PurposeOfEstimation – many Agilist disdain estimation, this is a balanced view: “estimation is valuable when it helps you make a significant decision.” (F.ex. when deciding what we (don’t) have resources for or when in need of coordinating related activities.) It is evil when used as commitments that people are forced to stick to and blamed for not managing to do so. “Above all be wary of anyone who tells you they [estimates] are always needed, or never needed.” A. Ferguson: “[..] it is poor project management (whether by project managers or other team members) that results in a client who thinks estimates are fixed, or that raw estimates = actual effort/duration”.
  • Ron Jeffries: Estimation is Evil – discusses the problems estimates can cause, issues with requirements gathering up front and their volatility, transparency and politics. Very valuable, highly recommended. See the “favorite quotes” at the bottom of this post. Also contains an interesting lesson learnt from the failed Chrysler C3 project: don’t try to build a grand new system to replace and fix the old one, fix one problem at a time – worth reading for this alone.
  • Interview with Steve Blank: Why Big Companies Can’t Innovate – the 2013 list of the world’s 50 most innovative companies has only a few large, established firms (those that have built innovation into its DNA such as Apple and Google). Established companies are less innovative because they focus in their existing business model, have risk-aversion (while there are many failures on the way to a new business model); finally “the people who are best suited to search for new business models and conduct iterative experiments usually are not the same managers who succeed at running existing business units.” – and thus aren’t given the chance. “[..]  the process of starting a new business [..] is fundamentally different from running an existing one. So if you want your company to grow organically, then you need to organize your efforts around these differences.”

Architecture & Ops

  • M. Fowler: Schemalessness + NoSQL and Consistency (20 + 20 min) two short, very good, balanced talks about NoSQL. He explains schemalessness and consistency and points out common misunderstanding about them so if you are into NoSQL, watch it.
  • What Powers Instagram: Hundreds of Instances, Dozens of Technologies (2012) – interesting high-level overview of the Instagram infrastructure based on AWS and Python (25 XL instances running Django/Gunicorn behind ELB with 3 Nginxes, sharded PostgreSQL with streaming replication on 12 QXL mem instances with software raid and XFS to freeze when snapshoting, media in S3, Redis, Solr for geo-search, Memcached. Gearman for task queues, pyapns for notifications. Munin for monitoring.)
  • The Netflix API Optimization Story – how Netflix redesigned its APIs to improve performance, reduce chattiness, and power product development and experimentation. The common REST API has become a development bottleneck and a lowest common denominator solution (w.r.t. supporting various clients). The main changes were: usage Hystrix for fault tolerance, each device team managing their own end-points in any JVM languges (primarily Groovy) and re-using common APIs (i.e. pushing some device-specific code to the server) => able to experiemnt more quickly, using the Functional Reactive Programming Model and asynchronous APIs (to abstract away thread-safety and parallel execution implementation details from the device teams so that code can execute sync. or async. without them needing to know).
  • Me: Overview of current monitoring libs for Java – Netflix’ Servo, Yammer’s Metrics, JavaMelody, JavaSimon.
  • Debug Servlets, or ‘HTTP Won; Use It’ – expose all debugging info of your services over HTTP – it makes debugging much simpler. We do a part of it and it really helps. Expose config (values, where they come from), logs, log configuration, JMX (setting it up otherwise not trivial), version, build number, git hash, server time (timezones tricky), metrics, stack dumps, app-specific status (Hadoop: live nodes, data size etc.). The author recommends JavaMelody to collect & visualize many common metrics. Not on security: Make sure to hide passwords and make the endpoints visible only internally. (Tip: consider Jolokia for exposing JMX over HTTP, see below.)
  • JVM Crash/Core Dump Analysis – 3 common categories of  JVM crash causes (JVM/JIT/JNI) and how to recognize and troubleshoot them

Other

  • How to lose wight in the browser:  The definitive front-end performance guide – a site by a number of experts from Twitter, Opera, Google, and other places with best practices for performant web sites (HTML, CSS, JS, jQuery, images). Ex.: styles up top, scripts down bottom; minify your html, css and JS; async script loading; combine css/JS files into one; cache array lengths while looping; use css sprites for icons.
  • Luke Stevens: The harsh truth about HTML5′s structural semantics (part 1) – “HTML’s structural elements — article, section, nav and aside — are, at first glance, some of the easiest parts of the HTML5 specification to understand and implement. However, they’re actually some of the most poorly specified, poorly understood, and poorly implemented parts of HTML5.” Interesting: The “research” leading to their establishment was quite random, ignoring a crucial source of information (css IDs).
  • Marco Emmanuel Patiño: Six non-technical books every programmer should read – 1. Team Geek: A Software Developer’s Guide to Working Well with Others (-> effective communication and collaboration), 2. The Pragmatic Programmer: From Journeyman to Master, 3. The Passionate Programmer: Creating a Remarkable Career in Software Development, 4. Clean Code: A Handbook of Agile Software Craftsmanship, 5. 97 Things Every Programmer Should Know: Collective Wisdom from the Experts (online), 6. Code Simplicity: The Fundamentals of Software.
    • Related: Top 5 Java programming books – Best of lot (actually 8) – 1) Head First Java, 2) Effective Java, 3) Thinking in Java, 4) Head First Design Pattern, 5) Concurrency Practice in Java, 6)Java performance, 7) Java Puzzlers, 8) Head First Object Oriented Analysis and Design.
  • Humans as slaves of chemistry: America’s Real Criminal Element – Lead – a fascinating article about how whole nations can be seriously influenced by a single chemical substance. Aside of that it is also fascinating to observe how we tend to search for causes in our domain of expertise (police, sociologists, …) and of interest while denying other possible causes, no matter how strong are the proofs. If the facts presented are true, then the fivefold increase in serious crimes in (not only) America since 60s has been caused by the increase of lead in the environment (pushing many people over the edge of ocassional violent loss of control). How many social problems in the world have similar industrial causes? Are we careful enough with what we let into our air and bodies?

Languages

  • newcoder.io: Learning more Python via projects – an excellent next step when you have learned Python syntax via LPHW or similar; in this tutorial series you will be building real-world apps while learning more of Python. You will play with, Data Visualization, APIs, Web Scraping, Networks, GUI.
  • Brian McCallister: Go is PHP for the Backend – a very good explanation why you might want to use Go and that you have to first learn “the Go way” to avoid insanity, since it is very opinionated and different from what you might be used to. Some pros: “native code, UNIX friendly, higher level then C, lower level then Python or Ruby, garbage collected, strongly typed, good performance, good concurrency support, etc.”
  • The Neophyte’s Guide to Scala 1 to 15 (list) – a good follow-up on the Cursera FP in Scala course, a series of blog posts exploring some topics more in depth. F.ex.: extracotrs (unapply, for pattern-matching), the broad applicability of pattern matching, pattern matching anonymous functions & partial functions #4, usiong Option idiomaticly #5, nice FP error handling with the Try type #6, Futures, etc. Higly recommended! Thx to Jakob Lind

Libs

  • Jolokia is remote JMX with JSON over HTTP: a REST API bridged to JMX, with support for security, fine-grained access control, bulk operations. Especially useful if you either 1)  need to perform bulk operations (e.g. get multiple values) or 2) want to access them from something that doesn’t support JMX. JSON is in general very easy to use and navigate. You can install Jolokia as a WAR (or mebedd its Servlet), a JVM agent, or attach it on-the-fly to a running JVM.
  • The Appeal and Controversy of ZeroMQ – why to use 0MQ? It is a messaging library that focuses on performance, decentralization and simplicity, solving some really hard problems (sending async. messages w/o locks, distribuing to specific subscribers) and providing a simple API. Main pros: decentralized (no central broker), many languages; cons: no security (but you can use it over SSH).

Talks

  • Tim O’Reilly: Create More Value Than You Capture (30 min + questions) – build apps that matter, that change how we do things. Thinking just about money is bad. Try to make the society better, smart, create more value than you capture, solve important problems, help people. Ex. startups: Uber, Square, Code for America.
  • TED: Bruce Feiler: Agile programming — for your family (20 min) – an inspirational talk, based on positive experience from multiple families, about applying the agile thinking and values to make our families happier by empowering the children (enlist them in their upbringing, deciding on goals, rewards, punishments), letting them know who they are, being adaptive, having regular “retrospectives” (that eventually become cherrished memories). Backed by research. Did you know that the #1 wish of children isn’t that parents spend more time with them but that they are less stressed?

Clojure Corner

  • What’s new in Clojure 1.5.x – reducers, new threading macros (cond->, as->, some->, ..), various improvements, improved performance, erro messages, doc strings, bug fixes
  • Stuart Sierra: On the Perils of Dynamic Scope – summary: don’t create macros like with-connection binding to a thread-local var; make all methods take the resource as a parameter – thus the user has the freedom to decide when to close the resource and isn’t limited to a single thread and can use lazy sequences
  • Logic programming is overrated – core.logic is essentially only a complex DSL for doing an exhaustive search and there is already a nice, clean tool for that: the for comprehension. A logic puzzle can be much more clearly and also efficiently using for. But it is not completely useless – logic programming is good e.g. for running programs backwards, unification is important for writing type checkers, and the new constraint programming piece has good potential. Read also Logic Programming is Underrated, which provides a faster core.logic solution than for-comprehension and provides some pointers rgarding the practical usefulness of core.logic.
  • Prismatic – Graph: Abstractions for Structured Computation – How to reduce the complexity overhead in large, real-world, FP systems by decoupling what is done from how it is executed. Graph is a Clojure library enabling a declarative way to describe how data flows between (mostly pure) functions => “It allows us to formalize the informal structure of good FP code, and enables higher-order abstractions over these structures that can help stamp out many persistent forms of complexity overhead.” By decoupling the description of how data flows and the actual execution, we can execute it in different ways (parallelized, with memoization, lazy/eager) and apply various interceptors (for logging etc.). See especially the part “Graph and complexity overhead.”
  • Mike Anderson: Game development in Clojure : Alchemy 7DRL post-mortem (and the previous 7 daily updates, Alchemy @ GitHub) – an interesting report about game making in Clojure during 7 days, in as functional and immutable style as possible while keeping it sufficiently fast. How do you represent & handle statuf game objects, the world map, game state? The design of the game, what was easy and what hard with Clojure. Tl;dr: search it for “Some parting thoughts” (Clojure productive, immutability hard but pays off, prototype objects great, more typing would have helped). “Making everything immutable in Clojure is harder than it would have been in an OOP language like Java where everything can be encapsulated in mutable classes. In particular, the state update functions are tricky to make both correct and performant. The payoff is big however: in terms of the simplicity and effectiveness later on, and in the conceptual clarity being able to treat the entire game state as an immutable value”. Having REPL is a big win.
  • Refactoring Java using Clojure with the Eclipse Java development tools (JDT) (operation on AST nodes, i.e. little too low level; the Eclipse Refactoring API might be better)
  • Clojure at a Bank – [Our] Clojure Code Immaturity – experiences with going from Java to Clojure: 1) too few comments, too short names => hard to learn the code; 2) not knowing clojure.core well enough => reimplementing (if-let, juxt, …); 3) structure, comment, split up your namespaces well, navigating more complicated then in Java IDEs; 4) reasonably used Macros, Protocols, Defrecords payed off;
  • Datomic for Five Year Olds – explaining the key characteristics of Datomic compared to relational and NoSQL DBs (schema, architecture, programmability/language); doesn’t go into details of how it works (e.g. how does Datomic determine what subset of the DB to cache locally and what if it is few GBs); Honey Badger’s 2012 talk Exploring Datomic: a database deconstructed explores the architecture and technical details much more

Tools

  • Vagrant 1.1.0 is out (what’s new?), with support for VMWare Fusion and AWS VM backends in addition to VirtualBox – use the same config to create, provision, stop, destroy and connect to a virtual machine locally or in the cloud (with limited support for shared folders, I’d suppose). V. 1.1 is backwards compatible aside of plugins, upgrade to new config optional.
  • Animated presentations: ArtRage (drawing program, also for iPad), Wacom Intuos 5 (drawing tablet), Screenflow (screen & audio capture) – used for the Agile in a Nutshell (Agile Product Ownership in a Nutshell) mentioned above
  • ckjm — Chidamber and Kemerer Java Metrics (via Neal Ford) – a command-line tool (also Maven/Ant plugin) to compute some metrics, outputting text or XML for further processing; the metrics: WMC: Weighted methods per class (cyclomatic complexity of its methods), DIT: Depth of Inheritance Tree, NOC: Number of Children, CBO: Coupling between object classes, RFC: Response for a Class, LCOM: Lack of cohesion in methods, NPM: Number of Public Methods, Ca: afferent coupling.
  • Bulletproof Demos: Record & Replay built into Chrome – ever got a failure while demonstrating a web app though it has worked moments ago? No more! You Chrome to record your requests and responses and let its cache handle them during the real demonstration. (Mac: stop Chrome, to record run open -a “Google Chrome” –args –record-mode, to replay run open -a “Google Chrome” –args –playback-mode. Linux: google-chrome –record-mode and –playback-mode. Win.: run chrome <arg>)
  • UserTesting.com (via Ash Maurya, the author of Running Lean) – on-demand usability testing; they have a large base of test users, can select those matching your criteria and unleash them upon your site guided by a script your provide, watch videos of their actions while they verbalize their thinking process, recieve written answers from them, talk to them.
  • MindMup.comopensource, free mind-mapping in the cloud by Gojko Adzic & co., with main focus on productivity. Store private maps in Goolge Drive, support for mobile devices, keyboard shortcuts. No registration needed.

Favorite Quotes

Once we estimated a project to require 9 man-months but were later told that we do not understand a thing and it may not take more then 4. At the end it took over 25 and still wasn’t done.

- paraphrasing my collegue Kim Leskovski

On collecting requirements up front:

At the very beginning, we know less about our project than we’ll ever know again. This is the worst possible moment to be making firm decisions about what we “require.”

- Ron Jeffries in Estimation is Evil: Overcoming the Estimation Obsession

On the estimate of project delivery date at its initial phase:

It’s based on an unrealistic list of requirements, using weak estimates, made at the moment of maximum ignorance, by people who are always optimistic about their own abilities.
- ibid

On planning and requirements (the Chrysler’s C3 payroll project, having a payroll expert and a team familiar with the domain):

This was one of the best-planned projects I’ve ever seen, and even so, at least one third of the requirements were added, removed, or substantially changed.
- ibid

[..] a line of code is a liability, not an asset [..]
- Jez Humble in Why Software Development Methodologies Suck

Agile is not something you do, it is something you are.
- Huib Schoots in Creating my own flow
with personal kanban, Agile Record Feb 2013

Posted in General, SW development, Tools, Top links of month | Tagged: , , , , , , , , , , , , , , , , | Comments Off

Most interesting links of January ’13

Posted by Jakub Holý on January 31, 2013

Recommended Readings

Various

  • Dustin Marx: Significant Software Development Developments of 2012 – Groovy 2.0 with static typing, rise of Git[Hub], NoSQL, mobile development (iOS etc.), Scala and Typesafe stack 2.0, big data, HTML5, security (Java issues etc.), cloud, DevOps.
  • 20 Kick-ass programming quotes – including Bill Gates’ “Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”,  B.W. Kernighan’s “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”, Martin Golding’s “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.” (my favorite)
  • How to Have a Year that Matters (via @gbrindusa) – do you want to just survive and collect possessions or do you want to make a difference? Some questions everybody should pose to him/herself.
  • Expression Language Injection – security defect in applications using JSP EL that can sometimes leads to double evaluation of the expressions and thus makes it possible to execute data supplied by the user in request parameters etc. as expressions, affects e.g. unpatched Spring 2.x and 3.

Languages etc.

  • HN discussion about Scala 2.10 – compilation speed and whether it matters, comparison of the speed and type system with Haskell and OCaml, problems with incremental compilation (dependency cycles, fragile base class), some speed up tips such as factoring out subprojects, the pros and cons of implicits etc.
  • Blog Mechanical Sympathy – interesting posts and performance tests regarding “writing software which works in harmony with the underlying hardware to gain great performance” such as Memory Access Patterns Are Important and Compact Off-Heap Structures/Tuples In Java.
  • Neal Ford: Functional thinking: Why functional programming is on the rise – Why you should care about functional programming, even if you don’t plan to change languages any time soon – N. Ford explains the advantages of FP and why FP concepts are spreading into other languages (higher abstractions enabling focus on the results over steps and ceding control to the language, more reusability on a finer level (higher-order functions etc.), few generic data structures with many operations -> better composability, “new” and different tool such as lazy collections, shaping the language towards the problem instead of vice versa, aligning with trends such as immutability)
  • Neal Ford: Java.next: The Java.next languages Leveraging Groovy, Scala, and Clojure in an increasingly polyglot world – a comparison of these languages with focus on what they are [not] suitable for, exploration of their paradigms (static vs. dynamic typing, imperative vs. functional)

SW development

  • How to Completely Fail at BDD – a story of an enthusiastic developer who tried to make everyone’s life better by introducing automated BDD tests and failed due to differences in culture (and inability to change thinking from the traditional testing), a surprising lack of interest in the tool and learning how to write good tests: “Culturally, my current team just isn’t ready or interested in something like this.” Morale: It is hard to change people, good ideas are not enough.
  • M. Feathers: Refactoring is Sloppy – refactoring is often prioritized out of regular development and refactoring sprints/stories aren’t popular due to past failures etc. An counter-intuitive way to get refactoring in is to imagine, during planning, what the code would need to be like to make it easy to implement a story. Then create a task for making it so before the story itself and assign it to somebody else then the story (to force a degree of scrutiny and communication). “Like anything else in process, this is medicine.  It’s not meant to be ‘the way that people do things for all time’ [..]” – i.e. intended for use when you can’t fit refactoring in otherwise. It may also make the cost of the current bad code more visible. Read also the commits (f.ex. the mikado method case).
  • Cyber-dojo: A great way to practice TDD together. Compare your read-green cycle and development over time with other teams. Purposefully minimalistic editor, a number of prepared tdd tasks.
  • On the Dark Side of “Craftsmanship” – an interesting and provoking article. Some developers, the software labouers, want to get work done and go home, they haven’t the motivation and energy to continualy spend time improving themselves. There is nothing wrong with that and we shouldn’t disparge them because of that. We shouldn’t divide people into craftsmen and the bad ones. A summary of and response to the varied reactions follows up in More on “Craftsmanship”. The author is right that we can’t expect everybody to spend nights improving her/his programming skills. Still they should not produce code of poor quality (with few exceptions) since maintaining such code costs a lot. There should be time for enough quality in a 9-5 day and people should be provided with enough guidance and education to be able to write decent code. (Though I’m not sure how feasible it is, how much effort it takes to become an acceptable developer.) Does the increased cost of writing (an learning to write) good code overweight the cost of working with bad code? That is an eternal discussion.

Cloud, web, big data etc.

  • Whom the Gods Would Destroy, They First Give Real-time Analytics (via Leon) – a very reasonable argument against real-time analytics: yes, we want real-time operational metrics but “analytics” only makes sense on a sensible amount of data (for the sake of statistical significance etc.) RT analytics could easily provide misguided results.
    CAP Twelve Years Later: How the “Rules” Have Changed (tl;dr, via @_dagi) – an in-depth discussion of the CAP theorem and the simplification (2 out of 3) that it makes; there are many more nuances. By Eric Brewer, a professor of computer science at the University of California, Berkeley, and vice president of infrastructure at Google.
  • ROCA: Resource-oriented Client Architecture – “A collection of simple recommendations for decent Web application frontends.” Server-side: true REST, no session state, working back/refresh etc. Client: semantic HTML independent of layout, progressive enhancement (usable with older browsers), usable without JS (all logic on the server) etc. Certainly not suitable for all types of apps but worthwile to consider the principles and compare them with your needs.

Clojure Corner

Tools

  • Vaurien, the Chaos TCP Proxy (via @bsvingen) – an extensible proxy that you can control from your tests to simulate network failure or problems such as delays on 20% of the requests; great for testing how an application behaves when facing failures or difficulties with its dependencies. It supports the protocols tcp, http, redis, memcache.
  • Wvanbergen’s request-log-analyzer for Apache, MySQL, PostgreSQL, Rails and more (via Zarko) – generates a performance report from a supported access log to point out requests that might need optimizing
  • Working Effectively With iTerm2 (Mac) – good tips in the body and comments

Favorite Quotes

A very good (though not very scientific) definition of project success applicable for distinguishing truly agile from process-driven projects:

[..] a project is successful if:

  • Something was delivered and put to use
  • The project members, sponsors and users are basically happy with the outcome of the project

- Johannes Brodwall in “How do we become Agile?” and why it doesn’t matter, inspired by Alistair Cockburn

(Notice there isn’t a single word about being “on time and budget”.)

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , , , , | Comments Off

Most interesting links of December ’12

Posted by Jakub Holý on December 31, 2012

Recommended Readings

Software development

  • Kent Beck: When Worse Is Better: Incrementally Escaping Local Maxima – Kent reintroduces his Sprinting Centipede strategy (“reduce the cost of each change as much as possible so as to enable small changes to be chained together nearly continuously” => “From the outside it is clear that big changes are happening, even though from the inside it’s clear that no individual change is large or risky.”) and advices how to deal with situations where improvements have reached a local maxima by making the design temporarily intentionally worse (f.ex. by inlining all the ugly methods or writing to both the old and the new data store); strongly recommended
    • Related: Efficient Incremental Change – transmute risk into time by doing small, safe steps, then optimize your ability to make these steps quickly and thus being able to achieve large changes
  • Researchers: It is not profitable to outsource development – the Scandinavian research organisation SINTEF ICT has studied the effects of outsourcing and discovered that often it is more expensive than in-country development due to hidden costs caused by worse communication and cultural differences (f.ex. Indians tend not to ask questions and work based on their, often incomplete, understanding) and very high people turn-over; even after the true cost is discovered, companies irrationally stay there. However it is possible to succeed, in some cases.
  • Bjørn Borud: Tractor pulling and software engineering – very valuable and pragmatic advices on producing good software (i.e. avoiding accumulating so much crap that the software just stops progressing). Don’t think only about the happy path. Simplify. Write for other developers, i.e. avoid too “smart” solutions, test & document, dp actually think about design and its implication w.r.t performance etc. Awake the scientist in you: “Do things because you know they work, not because it happens to be the hip thing to do.”
    (Note: I see the good intention behind “design for the weakest programmer you can think of” but plase don’t take it too far! Software should be primarily simple, not necessarily easy.
  • Know your feedback loop – why and how to optimize it – to succeed, we need to learn faster; the only way to do that is to optimize our feedback loops, i.e. shorten the path our assumptions travel before they are (in)validated, whether about our code, business functionality, or the whole project idea. Conscise, valuable.
  • Code quality is the least important reason to pair program – the author argues, based on his experience, other benefits of pair programming are more important than code quality: “[..]  the most important reasons why we pair: it contributes to an amazing company culture, it’s the best way to bring new developers up to speed, and it provides a great way to share knowledge across the development team.”
  • You Can’t Refactor Your Way Out of Every Problem – refactoring can’t help you if the design is fundamentally wrong, you need to rewrite it; know when it can or cannot help and act accordingly (related to how much design is needed upfront since some design decision cannot be reverted/improved upon)

Languages

  • Josh Bloch: Java – the good, bad and ugly parts (video, 15 min); summary: right design decisions (VM, GC, threads, dynamic linking, OOP, static typing, exceptions, …), some bad details (signed byte, lossy long-> double, == doesn’t cal .equals, ability to call overriden methods from constructors, …); Mr. Bloch has also given a longer talk examining the evolution of Java from 1.0 to 1.7 in The Evolution of Java: Past, Present, and Future.
  • True Scala complexity – a thoughtful criticism of the complexity of Scala, based on code samples; “[it is true that] Scala is a language with a smaller number of orthogonal features than found in many other languages. […] However, the problem is that each feature has substantial depth, intersecting in numerous ways that are riddled with nuances, limitations, exceptions and rules to remember. It doesn’t take long to bump into these edges, of which there are many.”; however, its possible to avoid many of the problems mentioned by resorting to less smart, more clumsy and verbose Java-like code; also, the author still likes Scala.
  • Scala or Java? Exploring myths and facts (3/2012) – a balanced view of Scala’s strengths and weaknesses; “[..] the same features that makes Scala expressive can also lead to performance problems and complexity. This article details where this balance needs to be considered.” Topics: productivity, complexity, concurrency support, language extensibility, Java interoperability, quality of tooling, speed, backward compatibility. Plenty of useful links.

Big data & Cloud:

  • Dean Wampler’s slides from Beyond Map Reduce – 1) Hadoop Map Reduce is the EJB 2 of big data but there are better APIs such as Cascading with Scala/Clojure wrappers; there are also “alternative” solutions like Spark and Storm; 2) functional/relational programming with simple data structures (lists, sets, maps etc.) is much more suitable for big data than OOP (for we do mostly stateless data transformations)
  • Apache HBase vs Apache Cassandra – comparison sheet – if you want to decide between the two
  • Optimizing MongoDB on AWS – 20 min talk about the current state of the art. Simplicity: Mongo AMIs by 10gen, Cloudformation template etc. Stability & perf.: new storage options – EBS with provisioned IOPS volumes (high I/O) + EBS Optimized Instances (dedicated throughput to EBS), High IO instances (hi1.4xlarge – SSD)); comparison of throughput (number of operations, MBs) of these storages; tips for filesystem config. Scalability: scale horizontally and vertically, shrink as needed.
  • Getting Real About Distributed System Reliability by Jay Kreps, the author of the Voldemort DB: distributed software is NOT somehow innately reliable; a common mistake is to consider only probability of independent failures but failures typically are dependent (e.g. network problems affect the whole data center, not a single machine); the theoretical reliability “[..] is an upper bound on reliability but one that you could never, never approach in practice”; “For example Google has a fantastic paper that gives empirical numbers on system failures in Bigtable and GFS and reports empirical data on groups of failures that show rates several orders of magnitude higher than the independence assumption would predict. This is what one of the best system and operations teams in the world can get: your numbers may be far worse.” The new systems are far less mature (=> mor bugs, worse monitoring, less experience) and thus less reliable (it takes a decade for a FS to become mature, likely similar here). Distributed systems are of course more complex to configure and operate. “I have come around to the view that the real core difficulty of these systems is operations, not architecture or design.” Some nice examples of failures.

Other

  • Talks To Help You Become A Better Front-End Engineer In 2013 (tl;dr) – topics such as mobile web development, modern web devel. workflow, current/upcoming featrues of CSS3, ECMAScript 6, CSS preprocessors (LESS etc.), how to write maintainable JS, modular CSS, responsive design, JS debugging, offline webapps, CSS profiling and speed tracer, JS testing
  • On Being A Senior Engineer – valuable insights into what makes an engineer “senior” (i.e. mature; from the field of web operations but applies to IT in general): mature engineers seek out constructive criticism of their designs, understand the non-technical areas of how they are perceived (=> assertive, nice to work with etc.), understand that not all of their projects are filled with rockstar-on-stage work, anticipate the impact of their code (on resource usage, others’ ability to understand & extend it etc.), lift the skills and expertise of those around them, make their trade-offs explicit when making decisions, do not practice “Cover Your Ass Engineering,” are able to view the project from another person’s (stakeholder’s) perspective, are aware of cognitive biases (such as the Planning Fallacy), practice egoless programming, know the importance of (sometimes irrational) feelings people have.

Clojure Corner

  • Polymorphism in Clojure – Tim Ewald’s 1h live coding talk at Øredev conference introducing mechanisms for polymorphism (and Java interoperability) in Clojure and explaining well the different use cases for them. Included: why records, protocols & polymorphism with them (shapes, area => open, not explicit switch) (also good for Java interop.: interfaces), reify, multimethods.
  • Stuart Sierra: Thinking in Data (1h talk) – Sierra introduces data-oriented programming, i.e. programming with generic, immutable data structures (such as maps), pure functions, and isolated side-effects. Some other points: Records are an optimization, only for perforamnce (rarely) or polymorphism (ot often); the case for composable functions;  testing using simulations (generative testing) etc.; visualization of state & process

Tools & Libs

  • Netflix’ Hysterix: library to make distributed systems more resilitent by preventing a single slow/failing dependency from causing resource (thread etc.) exhaustion etc. by wrapping external calls in a separate thread with clear timeouts and support for fallbacks, with good monitoring etc. Read “Problem Definition” on the page to understand the problem it tries to solve.

Favorite Quotes

if you build something that is fundamentally broken it isn’t really interesting that you followed the plan or you followed some methodology — the thing you built is fundamentally broken.

- Bjørn Borud, Chief Architect at Comoyo.no, in an email 12/2012

The root of the Toyota Way is to be dissatisfied with the status quo; you have to ask constantly, “Why are we doing this?”

- Katsuaki Watanabe, Tyota President 2005 – 2009 (from the talk Deliberate Practice)

Posted in General, Languages, Tools, Top links of month | Tagged: , , , , , , , , , , , , | Comments Off

Creating On-Demand Clusters in EC2 with Puppet

Posted by Jakub Holý on May 5, 2012

For a recent project I needed to be able to start on-demand clusters of machines in Amazon EC2. We needed each instance in a cluster to allow SSH and sudo access for all team members and to install and configure the software appropriate for that cluster (“database” node or “testclient” node).

You can see the results of my effort using EC2 command-line tools, Puppet etc. at the project’s Puppet GitHub repository, the setup is described in detail in its README.

(Tips for improvements are welcome. And not, Star Cluster isn’t what we needed.)

Posted in General, Tools | Tagged: , , , | Comments Off

Most interesting links of Mars ’12

Posted by Jakub Holý on March 31, 2012

Recommended Readings

  • ThoughtWorks Technology Radar 3/2012 – including apps with embedded servlet containers (assess), health check pages for webapp monitoring, testing at the appropriate level (adopt), JavaScript micro-framewors (trial, see Microjs.com), Gradle over Maven (e.g. thanks to flexibility), OpenSocial for data & content sharing between (enterprise) apps (assess), Clojure (before in asses) and CoffeeScript on trial (Scala very close to adopt), JavaScript as a 1st class language (adopt), single-threaded servers with aync I/O (Node.js, Webbit for Java [http/websocket], …; assess).
  • Jez Humble: Four Principles of Low-Risk Software Releases – how to make your releases safer by making them incremental (versioned artifacts instead of overwritting, expand & contract DB scripts, versioned APIs, releasing to a subset of customers first), separating software deployment from releasing it so that end-users can use it (=> you can do smoke tests, canary releasing, dark launching [feature in place but not visible to users, already doing something]; includes feature toggles [toggle on only for somebody, switch off new buggy feature, …]), delivering features in smaller batches (=> more frequently, smaller risk of any individual release thanks to less stuff and easier roll-back/forward), and optimizing for resiliance (=> ability to provision a running production system to a known good state in predictable time – crucial when stuff fails).
  • The Game of Distributed Systems Programming. Which Level Are You? (via Kent Beck) – we start with a naive approach to distributed systems, treating them as just a little different local systems, then (painfully) come to understand the fallacies of distributed programming and start to program explicitely for the distributed environment leveraging asynchronous messaging and (often functional) languages with good support for concurrency and distribution. We suffer by random, subtle, non-deterministic defects and try to separate and restrict non-determinism by becoming purely functional … . Much recommended to anybody dealing with distributed systems (i.e. everybody, nowadays). The discussion is worth reading as well.
  • Shapes Don’t Draw – thought-provoking criticism of inappropriate use of OOP, which leads to bad and inflexible code. Simplification is OK as long as the domain is equally simple – but in the real world shapes do not draw themselves. (And Trades don’t decide their price and certainly shouldn’t reference services and a database.)
  • Capability Im-Maturity Model (via Markus Krüger) – everybody knows CMMI, but it’s useful to know also the negative directions an organization can develop in. Defined by Capt. Tom Schorsch in 1996, building on Anthony Finkelstein’s paper A Software Process Immaturity Model.
  • Cynefin: A Leader’s Framework for Decision Making – an introduction into the Cynefin cognitive framework – the key point is that we encounter 5 types of contexts differing by the predictability of effects and each of them requires a different management style, using the wrong one is a recipe for a disaster. Quote:

    The framework sorts the issues facing leaders into five contexts defined by the nature of the relationship between cause and effect. Four of these—simple, complicated, complex, and chaotic—require leaders to diagnose situations and to act in contextually appropriate ways. The fifth—disorder—applies when it is unclear which of the other four contexts is predominant.

  • Et spørsmål om kompleksitet (Norwegian). Key ideas mixed with my own: Command & control management in the traditional Ford way works very well – but only in stable domains with clear cause-and-effect relationships (i.e. the Simple context of Cynefin). But many tasks today have lot of uncertanity and complexity and deal with creating new, never before seen things. We try to lead projects as if they were automobile factories while often they are more like research – and researchers cannot plan when they will make a breakthrough. Most of the new development of IT systems falls into the Complex context of Cynefin – there is lot of uncertanity, no clear answers, we cannot forsee problems, and have to base our progress on empirical experience and leverage emergence (emergent design, ..).
  • The Economics of Developer Testing – a very interesting reflection on the cost and value of testing and what is enough tests. Tests cost to develop and maintain (and different tests cost differently, the more complex the more expensive). Not having tests costs too – usually quite a lot. To find the right ballance between tests and code and different types of tests we must be aware of their cost and benefits, both short & long term. Worth reading, good links. (Note: We often tend to underestimate the cost of not having good tests. Much more then you might think.)

Links to Keep

Quotes

Kent Beck answering a question about how much testing to do (highlighted by me):

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it. I do tend to make sense of test errors, so I’m extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.

Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. Ten or twenty years from now we’ll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference. In the meantime, experimentation seems in order.

Posted in General, Testing, Top links of month | Tagged: , , , , , | Comments Off

Most interesting links of December

Posted by Jakub Holý on December 31, 2011

Recommended Readings

  • The Netflix Chaos Monkey – how to test your preparedness for dealing with a system failure so that you won’t experience nasty wakeup when something really fails in Sunday 3 am? Release a wild, armed monkey into your datacenter. Watch carefuly what happens as it randoly kills your instances. This is exactly what Netflix does with their with their cloud infrastructure – also a great inspiration for my recent project. Do you need to be always available? Than consider employing the chaos monkey – or a whole army of monkeys!
    (PS: There is also a post with a picture of the scary monky.)

Links to Keep

  • BDD: Write specifications, not scripts (from the Concordion site) – relatively brief yet very enriching practical description of how to do behavior-driven development a.k.a. Specification by Example right, the key point here being “Write specifications, not scripts.” It says why not (for scripts overspecify -> are brittle, specs should be stable) and how to do it (decouple the stable spec and the volatile system via fixture code, expose minimal stuff to the spec, perhaps evolve a DSL between the fixture and the system). It also lists common “smells” of BDD done wrong. If it still isn’t clear to you, read the Script to Specification Makeover example (or perhaps read it anyway). BTW, Concordion is a new tool for doing BDD based on JUnit and HTML, which was created as a response to the weaknesses of Fit[Nesse], i.e. exactly the tendency to do scripting instead of specifications. It looks very promissing to me!

SW Utilities

  • (Linux/Mac) Autojump – superfast navigation between favorite directories in the command line (via Jake McCrary) – it keeps track of how much time you spend in each directory and when you execute j <substring of directory path/name>, it jumps into the most frequently used one matching the substring. It awesome! (You can also run jumpstat to see the statistics.)
  • (Linux) Tcpkill – service/network outage testing (via Jake McCrary) – kill connections to or from a particular host, network, port, or combination of all – useful e.g. when you want to test that your software is resilient to the outage of a particular service or server – less brutal than actually killing the database etc. instances. We need to test that our application recoveres properly when one of our MongoDB nodes dies so this may be quite useful.
  • Manik Hot Deploy Plugin for Maven Projects (v1.0.2 in 5/2011; older version in the Marketplace) – plugin that can do hot and incremental deployment to any app server (simply by copying to its hotdeploy directory or the directory of an installed webapp) whenever you run mvn install or automatically whenever sources change, multi-module support

Clojure Corner

  • Jake McCrary: Continuous Testing With Clojure and Expectations – continuous test runner lein-autoexpect for Clojure tests written using the library expectations by Jay Field.
  • Jake McCrary: Quickly Starting a Powerful Clojure REPL – Clojure REPL only two steps away: 1) Run Emacs, 2) Execute M-x clojure-swank (no more need to open an existing Leinigen project) – the trick is to install the Leinigen plugin swank-clojure and use Jake’s elisp function clojure-swank that automatically starts the swank-clojure server. (I had to hack the function for the clojure-swank output contained “null” instead of “localhost”, likely due to incorrect DNS setup.)

Posted in eclipse, General, Testing, Top links of month | Tagged: , , , , , , , | Comments Off

Getting Started with Amazon Web Services and Fully Automated Resource Provisioning in 15 Minutes

Posted by Jakub Holý on December 7, 2011

While waiting for a new project, I wanted to learn something useful. And because on many projects we need to assess and test the performance of the application being developed while only rarely there is enough hardware for generating a realistic load, I decided to learn more about provisioning virtual machines on demand in the Cloud, namely Amazon Web Services (AWS). I’ve learned a lot about the tools available to work with AWS and the automation of the setup of resources (machine instances, security groups, databases etc.) and automatic customization of virtual machine instances in the AWS cloud. I’d like to present a brief introduction into AWS and a succinct overview of the tools and automation options. If you are familiar with AWS/EC2 then you might want to jump over the introduction directly to the automation section.

Read the rest of this entry »

Posted in General, Testing, Tools | Tagged: , , , , | 1 Comment »