The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘book’

Review: Clojure for Machine Learning (Ch 1-3)

Posted by Jakub Holý on June 26, 2014

Book coverPack Publishing has asked me to review their new book, Clojure for Machine Learning (4/2014) by Akhil Wali. Interested both in Clojure and M.L., I have taken the challenge and want to share my impressions from the first chapters. Regarding my qualification, I am a medium-experienced Clojure developer and have briefly encountered some M.L. (regression etc. for quantitive sociological research and neural networks) at the university a decade ago, together with the related, now mostly forgotten, math such as matrices and derivation.

In short, the book provides a good bird-eye view of the intersection of Clojure and Machine Learning, useful for people coming from both sides. It introduces a number of important methods and shows how to implement/use them in Clojure but does not – and cannot – provide deep understanding. If you are new to M.L. and really like to understand things like me, you want to get a proper textbook(s) to learn more about the methods and the math behind them and read it in parallel. If you know M.L. but are relatively new to Clojure, you want to skip all the M.L. parts you know and study the code examples and the tools used in them. To read it, you need only elementary knowledge of Clojure and need to be comfortable with math (if you haven’t worked with matrices, statistics, or derivation and equations scare you, you will have a hard time with some of the methods). You will learn how to implement some M.L. methods using Clojure – but without deep understanding and without knowledge of their limitations and issues and without a good overview of alternatives and the ability to pick the best one for a particular case.

Read the rest of this entry »

Posted in Languages | Tagged: , , , | Comments Off

Most interesting links of November ’13

Posted by Jakub Holý on November 30, 2013

Recommended Readings

Some interesting topics this time despite me spending lot of time on the Principles of Reactive Programming course: Java x Node.js, REST x other future-proof architectures, scary legacy code. Of course, also plenty of Clojure.

People, organizations, teams, development:

  • Chris Argyris (1923-2013): An Appreciation – Thinkers 50 – recently departed Ch. Argyris is a person whose work you should know, if a bit interested in learning and organizations and how they (dis)function; and since we all work in organizations and want our work to be pleasant, this means all of us. We all want to work in orgs that do double-loop learning, i.e. they actually evovle as they learn. “Argyris argued that organizations depend fundamentally on people and that personal development is and can be related to work.” Now stop and go read it!
  • Bob Marshall: The Antimatter Principle – “the only principle we need for becoming wildly effective at collaborative knowledge work” – can be summarized as “attend to folks’ needs” (importantly, including your own) => find out what people actually need, interpret their behavior (including anger, seemingly irrational or stupid requests etc.) in terms of needs; mastering this will make you excell in communication and effective work. Read the post to find out more.
  • It’s a state of mind: some practical indicators on doing agile vs. being agile – are you agile or are just “doing agile”? Read on ti find out, if you dare! F.ex. “Non Agile teams have a process that slows the review of the changes.” Cocnlusion: “An Agile mindset is just that – a state of mind, a set of values. Its a constant questioning, and an opening up to possibilities. Its a predisposition to produce great things.
  • Johannes Brodwall:  Humble architects – how to avoid being an architect that does more harm than good, as so many out there? Some tips: Don’t assume stupidity, Be aware that you might be wrong, Be careful with technology (i.e. simplicity beats everything; applies so much to us developers too!), Consistency isn’t as important as you think (or beware context), Tactical reuse in across systems is suboptimization (i.e. reuse has a cost), separate between (coding) rules and dogma (i.e. is that way unsafe, incomprehensible, or just a heresy w.r.t. a dogma?) Very valuable insights into creating good technical solutions and teams that work.
  • Liz Keogh’s The Dream Team Nightmare: a Review – a very good review of this adventure-style book about coaching “agile” teams through (around?) common pitfalls, provides a good base for deciding whether you shall read the book (Liz essentially says yes)
  • Fibonacci Kittens: One Idea One Commit – a short story of coming from biannual releases to frequent release of individual features; I link to this primarily to spread optimism, if this company managed it then, perhaps, we other can too?
  • The Eternal Struggle Between Business and Programmers – “Business: More features! Now! Programmers: More refactoring! Now!” How can we resolve this eternal conflict of needs? This post reveals how the two parties can find a common ground and mutual understanding (beware, everybody must give up on something) and thus work together rather than against each other.

Coding, architecture, legacy

  • Why the future is NOT RESTful – always refreshing to read something against the mainstream; “REST is not fit for the next generation of smart client applications because it has not been designed for smart clients.” According to the author, a smart client app stack needs: “1. persistence (storage and query), 2. documents/orm (conversion to tree-like datastructures), 3. data authorization (once authenticated), 4. pub/sub (websocket communications), 5. client db (client-side caching and querying), 6. templating (presentation level)” Meteor.js has nearly all but #3 thanks to mongodb (1+2), dpp (4), mongo on the client (5), spark (6). The author considers a similar but Clojure-based stack (with Datomic, Angular etc.) and looks at authorization possibilities. “Secured, personalised, CRUD operations are the future to a more simplified web.” We may agree or not but it certainly is worth reading.
  • Michael Feathers (of Working Effectively With Legacy fame): Unconditional Programming – “Over and over again, I find that better code has fewer if-statements, fewer switches, and fewer loops. Often this happens because developers are using languages with better abstractions. [..] The problem with control structures is that they often make it easy to modify code in bad ways.” Includes a nice example of replacing if-else with much more readable and safer code.
  • The Quality of Embedded Software, or the Mess Has Happened – an interesting and scary read about terrible spaghetti code (and hardware) that makes some Toyotas to accelerate when the driver tries to break; 11,000 global variables, the key function showing so high cyclomatix complexity that “makes it impossible not only to test but even maintain this program in any way.” Then 80k violations of the car industry coding standard cannot surprise. And a safety control that does not work. Interesting that a great manufacturer may have so terrible IT (and Toyota isn’t the only one).
  • The string type is broken – the String type is M. Feathers’ favourite example of a leaky abstraction – most languages fail to process/split/count less common Unicode characters properly, the fact that String is implemented as a series of bytes leaks through (UTF-16 langs like Java); worth reading to be aware of the limitations

Languages

  • Why I’m Productive In Clojure – some interesting points about simplicity, consciousness, interactive development, power without overwhelming fatures, etc. “With it [Clojure] I can always easily derive a solution to a particular problem from a small set of general patterns. [..] However, the number of ways that these concepts can be combined to solve all manner of problems appears to be inexhaustible.
  • Node.js at PayPal – PayPal is switching from Java to Node.js (among others to promote language consistency) and, as a part of that, has implemented the same app in Node and Java; results: Node was done earlier, had less code, performed better (though, as Daniel Kvasnička pointed out, “Comparing Node.js and servlet-based archs is not fair… compare Node with @vertx_project and you’ll get a whole different story ;)”; also, as Charles Nutter said, “The @PayPal numbers for their Java to Node move are absurd. A JVM app doing 1.8 pages/s isn’t slow…it’s broken.“)
  • IBM: Developing mobile apps with Node.js and MongoDB, Part 1: A team’s methods and results – also IBM has implemented the same (REST) app once with Java and DB2, once with Node and Mongo where Node+Mongo required less work and performed better; one of the great wins was having JSON as a native structure everywhere instead of transforming from/to it so Mongo is, in my opinion, an important factor in this particular case
  • Dynamics of Programming: Benefits of Scala in CS1 – reasons for and experiences with using Scala in an introductory computer science course, worth reading; some of the advantages over Java are consistency (.asInt on String and Double vs. casting/parsing, no “primitive” types), REPL with time inference good for learning, functional style enables focus on what rather than how; quite persuasive arguments

Security

  • The New Threat: Targeted Internet Traffic Misdirection – did you know that internet traffic to any site can be made to go through a particular server without anybody noticing? This has been observed repeatedly in the wild, for banks and other sites. Rather make sure you use strong encryption (NSA-approved, of course ;)).
  • A (relatively easy to understand) primer on elliptic curve cryptography – Everything you wanted to know about the next generation of public key crypto – you cannot just read this, you have to study it, which I did not; but it looks good and I guess the time will come when I will come back to it

Other

Clojure Corner

  • Stuart Sierra’s Component – a library for making it easier to implement Stuart’s reloadable workflow; a component is something that can be started, stopped, and depend on other components so that it is easier to do interactive REPL development
  • Logan Linn: Clojure/conj 2013 – a pretty good overview of the conference
  • Caribou – “the kernel of usefulness that has emerged from years of this basic practice“- a new Clojure web framework – seems to be interesting
  • Results of the 2013 State of Clojure & ClojureScript survey and drill-down into what features people want – the most interesting fact is how many more participants use Clojure in production than last year and perhaps also the relatively wide adoption of Datomic among the respondents. Light Table has become the 3rd most popular dev. env., after Emacs and Vim. Some of the most mentioned language features requested were types (=> core.typed, Prismatic’s Schema), better error reporting (=> slingshotdireclj-stacktraceio.aviso:pretty, etc.), debuger (though progress is being made)
  • Book: Clojure High Performance Programming
  • Improving Clojure Feedback : Stack Traces – making Clojure stacktraces more usable by filtering out noise and linking to relevant content – io.aviso:pretty,  io.aviso:twixt
  • Clojure Dev discussion: Hashing strategies – Executive summary – “In Clojure, however, it is far more common than in Java to use longs, vectors, sets, maps and compound objects comprised of those components (e.g., a map from vectors of longs to sets) as keys in other hash maps.  It appears that Java’s hash strategy is not well-tuned for this kind of usage.  Clojure’s hashing for longs, vectors, sets, and maps each suffer from some weaknesses that can multiply together to create a crippling number of collisions.” Ex.: An implementation of N-Queens took forever, spending most time on hash collisions. But be calm, smart Clojurians are working on a solution.
  • Datomic Pro Starter Edition – Datomic with all storages, Datomic Console, a year of updates, full Datomic programming model; limitations: no HA transactor, no integrated memcached, max 2 peers and 1 transactor

Tools/Libs

  • AirPair.com – a new site that enables developers to get help from other devs via remote pairing, code review, code mentoring etc. – a good opportunity to get help / help others (and earn something); I haven’t tried it but it sounds pretty interesting

MongoDB web stacks

  • Meteor: JS frontend + MongoDB backend with changes in the DB pushed live to the clients, i.e. MongoDB is used both as the “application server” and storage. It seems great for apps where users need to collaborate in real-time with each other, certainly great for quick proof of concepts etc.; worth checking out; it also comes with free (at least for start?) hosting so really good for prototyping – “an open-source platform for building top-quality web apps in a fraction of the time.” The intro screencast will give you a good overview (10 min).
  • Mean.io – MEAN (Mongo, Express, Angular, Node) stack Boilerplate – frontend, backend and storage using the same language and some of the most popular technologies (not that popular = best fit for you :)); it seems to be very new but since it just glues together 4 popular and documented technologies, that should not be an obstacle. There is an intro on the MongoDB blog.

Other

  • BusyBox (get latest win binary from Tigress.co.uk) – reportedly a better POSIX env for Windows than gow, Cygwin, et al.

Favourite Quotes

[..] no organization should exist unless it is “of service” to its employees, its customers, its community.
- @Tom_Peters 28/11

I hope you’ll agree that there is a certain amount of irony involved in having to write repetitive code
Dmitri Sotnikov in Why I’m Productive In Clojure

Happy teams are productive teams but:

Morale is 95% a function of the prevailing system (the way the work works). Which in turn is a function of the prevailing collective mindset
@flowchainsensei Nov 10th

Posted in General, Languages, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Most interesting links of September ’13

Posted by Jakub Holý on September 30, 2013

Recommended Readings

  • Stuff The Internet Says On Scalability For September 13, 2013 – a collection of interesting performance related articles with summaries (via @_dagi)
  • Can you copy a culture? The NUMMI story (audio/transcript) – how the GM factory with the worst workforce has been turned around via a good application of Toyota Production System – “a truly inspiring story of human potential and how systems can be designed to bring the best or worst of of people.” And how GM failed to learn from it and to copy Toyota’s culture.
  • The Reactive Manifesto – why to write reactive SW – “Reactive applications represent a balanced approach to addressing a wide range of contemporary challenges in software development. Building on an event-driven, message-based foundation, they provide the tools needed to ensure scalability and resilience. On top of this they support rich, responsive user interactions. We expect that a rapidly increasing number of systems will follow this blueprint in the years ahead.
  • Frameworkless JavaScript – Why Angular, Ember, or Backbone don’t work for us [Moot discussion platform] (via JavaScriptWeekly) Me: Frameworks are not always evil, but are likely overused and there are good cases when rolling your own solution is the best way. Why in Moot? Because the want a minimal API (no framework methods), small code size, small and familiar code base, no dependency hell and external package updates, no lock-in to technology that will be gone in few years, need WebSockets not REST. “Moot uses native pushState for managing URLs, John Resig’s “micro templating” for views, and internal communication between model and views happens with a custom event library. There is no router or automatic data-binding.” The looked at Angular, Ember, Backbone. “As a result of our combined perfectionism and minimalism, Moot is an extremely lightweight, manageable, and independent web application [..]
  • NYT: Eiji Toyoda, Promoter of the Toyota Way and Engineer of Its Growth, Dies at 100 – learn about the life of one of the founders of lean thinking
  • Gojko Adzic: How we solved our #1 product management problem – valuable experience of false assumptions, learning from users, and a much helpful UI remake: even if you build a product to scratch your itch, you have to test it with real users

Big data

  • Don’t use Hadoop – your data isn’t that big – a great post about the downside of Hadoop and that there are much better options (large disks, large RAM, Pandas/R/Postgres) for data up to few TBs. “In addition to being more difficult to code for, Hadoop will also nearly always be slower than the simpler alternatives.”
  • Gartner On Big Data: Everyone’s Doing It, No One Knows Why – golf talk / hype -driven initiatives FTW! “According to a recent Gartner report, 64% of enterprises surveyed indicate that they’re deploying or planning Big Data projects. Yet even more acknowledge that they still don’t know what to do with Big Data.”
  • What makes Spark exciting – why it might be a good replacement for Hive/Hadoop, based on experiences with H/H: “Hive has served us well for quite a while now. [...]  That said, it has gotten to the point where Hive is more frequently invoked in negative contexts (“damn it, Hive”) than positive. (Primarily due to being hard to test, hard to extend.)” “We had heard about Spark, but did not start trying it until being so impressed by the Spark presentation at AWS re:Invent [..] that we wanted to learn more. [..] Spark, either intentionally or serendipitously, addresses both of Hive’s primary shortcomings, and turns them into huge strengths. (Easy to test, extend.) [..] I find the codebase small and very easy to read, [..] –which is a nice consolation compared to the daunting codebases of Hadoop and Hive.” Cons.: Spark is only pre-1.0, the author hasn’t yet tested it heavily.
  • 10 Ways to Make Your Office Fun To Work In – because we spend there plenty of our time so why not have a pleasant/cosy, inspiring environment? Some tips: plants, not-your-boring-enteprprise-look-and-feel, open it to the nature (I want this!), design it as home, not office, provide play space (I am too into work to want to play but having a resting place for a nap is st. I’d love).

Books

Other

  • Stanford engineers build computer using carbon nanotube technology (via @RiczWest)
  • NYT: The Banality of Systemic Evil – a good article about human tendency to “obey the system” thus potentially causing evil – and thus the need to resist the system, as heroic individuals such as Snowden, Hammond, Schwartz, Manning. See the famous Eichmann in Jerusalem for how “doing your job” can create evil – “[..] what happens when people play their “proper” roles within a system, following prescribed conduct with respect to that system, while remaining blind to the moral consequences of what the system was doing — or at least compartmentalizing and ignoring those consequences.” (Tip: The book Moral Mazes explores the ethics of decision making within several corporate bureaucracies => mid-managers rules of life: (1) never go around your boss, (2) tell the boss what she wants to hear, (3) drop what she wants dropped, (4) anticipate what the boss wants so that she doesn’t need to act as a boss to get it, (5) do not report something the boss does not want reported, cover it up; the the job & keep your mouth shut.) “The bureaucracy was telling him [Snowden] to shut up and move on (in accord with the five rules in “Moral Mazes”), but Snowden felt that doing so was morally wrong.” “[..] there can be no expectation that the system will act morally of its own accord. Systems are optimized for their own survival and preventing the system from doing evil may well require breaking with organizational niceties, protocols or laws.
  • Fairphone – “A seriously cool smartphone that puts social values first” (likely the only one not built by poorly paid workers and creating too much ecological burden), for just €325. You can see detailed cost breakdown, list of suppliers, specs, and essentially everything. This is, in my opinion, super cool! Go and read the story!

Clojure Corner

  • Amazonica – “A comprehensive Clojure client for the entire Amazon AWS api.”
  • Talk Ritz, The Missing Clojure Tooling (40min, 9/2013) – thanks to this I finally understood how to use Ritz but it still seems not to work well, f.ex. setting a breakpoint always reported “Set 0 breakpoints” (lein ritz/middleware 0.7.0, nrepl-ritz.el 0.7.1); according to callen, debug-repl is simpler and nicer if you only care about local vars and evaluation. To try ritz: use M-x nrepl-ritz-jack-in, then M-x nrepl-ritz-break-on-exception, exec. f.ex. “(/ 1 0)”. In the poped-up buffer, t or enter to show frame locals, e to eval a code in the context of the frame etc. If you managed to trigger the debug buffer through a breakpoint, the actions lists would contain STEP etc. (See fun. nrepl-ritz-line-breakpoint)
  • C. Grand’s spreadmap – “library to turn Excel spreadsheets in persistent reactive associative structures” => access content via map functions; changing a value updates formula cells using it
  • Alembic Reloads your Leiningen project.clj Dependencies – add a dependency to your project.clj w/o needing to restart your REPL (just call (alembic.still/load-project), provided you have it in your lein dependencies). Limitations: cannot remove deps or change versions.
  • Defeating stack overflows – techniques for transforming mutually recursive calls etc. into something that won’t blow the stack – “Priming the pump” (memoize subresults first), core.async
  • Google Groups: Clean Architecture for Functional Programming – How do the Clean Architecture and the Clean Code best practices apply to FP (Clojure/Haskell)? Some points: OOP isn’t worse than FP, only people do class-oriented programming instead; OO better e.g. for UIs, combining them (func. core, imperative shell) can be sometimes best. Some clean arch. patterns are actually more like functions – “Interactors and Presenters, for example, do not maintain any state of their own.  Even those objects that do imply some kind of state, such as entities and gateways, keep that state hidden behind boundaries and present a functional interface instead.
  • night-vision: Handy, super light weight debugging utility – add it to your lein profile and then call (night-vision.goggles/introspect-ns! '<name of ns>) and it will print each entry/exit of a function within the scope of the namespace with the argument/return values
  • Nil Punning (Or Null Pointers Considered Not So Bad) – a great post about why nil in Clojure is not bad contrary to Java’s null (because it is actually an object, you can call functions on it, treat it as false/empty list/map/set, most core functions work on it)

Tools/Libs

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , , , , , | Comments Off

Book Review & Digest: Boyd: The Fighter Pilot Who Changed the Art of War (Relevant for IT/Business)

Posted by Jakub Holý on April 29, 2013

This book is about a great person, about change, about one of the largest bureaucracies and dysfunctional organizations, about projects gone astray, about warfare and its latest evolution. Many of the challenges and ideas that we encounter in the book are not limited to the military domain but apply also to business and IT. It is worth reading whether you are  a military person, somebody trying to push through a change, a business person, or interested in thinking and organizations.

Read the rest of this entry »

Posted in General | Tagged: | Comments Off

Most interesting links of Mars ’13

Posted by Jakub Holý on March 31, 2013

Recommended Readings

A lot of stuff this month since I have finally got time to review some older articles. Quite a few articles by Fowler. Few really great (yet short) talks on agile & SW development.

Top

  • Agile in a Nutshell (originally Agile Product Ownership in a Nutshell) by Henrik Kniberg – the best explanation of the agile development process ever, in just 15 minutes and with wonderful animation; every developer should see this. Some highlights: the most important task of product owner is to say NO so that backlog doesn’t grow infinitely; at start, the estimates of size and value will suck and that’s OK because the value is in the conversation, not in the numbers (that are anyway just relative); the goal is to maximize outcome (value), not output (# features). Compromises between short-term vs. long-term goals, knowledge vs. customer value building etc. Build the right thing (PO) x build it right (devs) x build it fast (SM). Technical debt x sustainable pace. As I said – you MUST see it.
  • Martin Fowler: The Value of Software Design (talk, 22 min, from 0:45:00 til 1:07; Feb 2013) – a balanced argument for the value of good software design and internal code quality based on paying off by enabling us to keep our development speed. Discusses the DesignStaminaHypothesis (bad design => rapid decline of development speed), TechnicalDebt, TechnicalDebtQuadrant (Prudent x Reckless, Deliberate x Inadvertent), TradableQualityHypothesis. According to the experience of Fowler and others, the good design payoff point “it’s weeks, not months.”
  • What Does It Take To Become A Grandmaster Developer? – great post about cognition and learning, valuable references, quotes from an interesting study of good vs. mediocre developers. We have mental capacity for ~7 chunks of information => great performers recognize patterns and see and understand thus higher-level chunks and have many “chunks” (patterns encountered previously) readily available. You need deliberate effort to learn more chunks – especially initially but you must always try to get out of your comfort zone to grow. Experienced collegues can help a lot in acending the learning curve.

Agile, organization, innovation, project management

  • How to Prioritize a User Story Map – we all know that we should prioritize features by their value, risk, and lack of knowledge and that we should slice the features thin so that they fit into short iteration and can be deployed soon to produce feedback, right? Here we see a nice example of what happens if not done so and how to do feature slicing better.
  • Bob Marshall: Rightshifting – according to the author, 80% of knowledge work organizations are very ineffective, wasting resources on non-value-adding activites; only few are effective, even fewer highly effective. Rightshifting is the attempt at shiting them to the right, towards higher effectiveness. Links to a few videos explaining it more. Related: Steve McConnell’s Business Case for Better Software Practices, referring to a study by SEI; “The actual distribution of software effectiveness is asymmetric. Most organizations perform much closer to the worst practice than to the best.” – the best performing 10 times better then the worst/average (productivity, speed, defects, value)
  • On Antifragility in Systems and Organizational Architecture – introduces the concept of antifragility, based on Nassim Taleb’s book Antifragile that compares fragile, robust, and antifragile systems and organizational structures (which is also applicable to SW systems); robust = resists change (unless too large); antifragile: learn, adapt; closely related to DevOps and continous delivery
  • M. Fowler: PurposeOfEstimation – many Agilist disdain estimation, this is a balanced view: “estimation is valuable when it helps you make a significant decision.” (F.ex. when deciding what we (don’t) have resources for or when in need of coordinating related activities.) It is evil when used as commitments that people are forced to stick to and blamed for not managing to do so. “Above all be wary of anyone who tells you they [estimates] are always needed, or never needed.” A. Ferguson: “[..] it is poor project management (whether by project managers or other team members) that results in a client who thinks estimates are fixed, or that raw estimates = actual effort/duration”.
  • Ron Jeffries: Estimation is Evil – discusses the problems estimates can cause, issues with requirements gathering up front and their volatility, transparency and politics. Very valuable, highly recommended. See the “favorite quotes” at the bottom of this post. Also contains an interesting lesson learnt from the failed Chrysler C3 project: don’t try to build a grand new system to replace and fix the old one, fix one problem at a time – worth reading for this alone.
  • Interview with Steve Blank: Why Big Companies Can’t Innovate – the 2013 list of the world’s 50 most innovative companies has only a few large, established firms (those that have built innovation into its DNA such as Apple and Google). Established companies are less innovative because they focus in their existing business model, have risk-aversion (while there are many failures on the way to a new business model); finally “the people who are best suited to search for new business models and conduct iterative experiments usually are not the same managers who succeed at running existing business units.” – and thus aren’t given the chance. “[..]  the process of starting a new business [..] is fundamentally different from running an existing one. So if you want your company to grow organically, then you need to organize your efforts around these differences.”

Architecture & Ops

  • M. Fowler: Schemalessness + NoSQL and Consistency (20 + 20 min) two short, very good, balanced talks about NoSQL. He explains schemalessness and consistency and points out common misunderstanding about them so if you are into NoSQL, watch it.
  • What Powers Instagram: Hundreds of Instances, Dozens of Technologies (2012) – interesting high-level overview of the Instagram infrastructure based on AWS and Python (25 XL instances running Django/Gunicorn behind ELB with 3 Nginxes, sharded PostgreSQL with streaming replication on 12 QXL mem instances with software raid and XFS to freeze when snapshoting, media in S3, Redis, Solr for geo-search, Memcached. Gearman for task queues, pyapns for notifications. Munin for monitoring.)
  • The Netflix API Optimization Story – how Netflix redesigned its APIs to improve performance, reduce chattiness, and power product development and experimentation. The common REST API has become a development bottleneck and a lowest common denominator solution (w.r.t. supporting various clients). The main changes were: usage Hystrix for fault tolerance, each device team managing their own end-points in any JVM languges (primarily Groovy) and re-using common APIs (i.e. pushing some device-specific code to the server) => able to experiemnt more quickly, using the Functional Reactive Programming Model and asynchronous APIs (to abstract away thread-safety and parallel execution implementation details from the device teams so that code can execute sync. or async. without them needing to know).
  • Me: Overview of current monitoring libs for Java – Netflix’ Servo, Yammer’s Metrics, JavaMelody, JavaSimon.
  • Debug Servlets, or ‘HTTP Won; Use It’ – expose all debugging info of your services over HTTP – it makes debugging much simpler. We do a part of it and it really helps. Expose config (values, where they come from), logs, log configuration, JMX (setting it up otherwise not trivial), version, build number, git hash, server time (timezones tricky), metrics, stack dumps, app-specific status (Hadoop: live nodes, data size etc.). The author recommends JavaMelody to collect & visualize many common metrics. Not on security: Make sure to hide passwords and make the endpoints visible only internally. (Tip: consider Jolokia for exposing JMX over HTTP, see below.)
  • JVM Crash/Core Dump Analysis – 3 common categories of  JVM crash causes (JVM/JIT/JNI) and how to recognize and troubleshoot them

Other

  • How to lose wight in the browser:  The definitive front-end performance guide – a site by a number of experts from Twitter, Opera, Google, and other places with best practices for performant web sites (HTML, CSS, JS, jQuery, images). Ex.: styles up top, scripts down bottom; minify your html, css and JS; async script loading; combine css/JS files into one; cache array lengths while looping; use css sprites for icons.
  • Luke Stevens: The harsh truth about HTML5′s structural semantics (part 1) – “HTML’s structural elements — article, section, nav and aside — are, at first glance, some of the easiest parts of the HTML5 specification to understand and implement. However, they’re actually some of the most poorly specified, poorly understood, and poorly implemented parts of HTML5.” Interesting: The “research” leading to their establishment was quite random, ignoring a crucial source of information (css IDs).
  • Marco Emmanuel Patiño: Six non-technical books every programmer should read – 1. Team Geek: A Software Developer’s Guide to Working Well with Others (-> effective communication and collaboration), 2. The Pragmatic Programmer: From Journeyman to Master, 3. The Passionate Programmer: Creating a Remarkable Career in Software Development, 4. Clean Code: A Handbook of Agile Software Craftsmanship, 5. 97 Things Every Programmer Should Know: Collective Wisdom from the Experts (online), 6. Code Simplicity: The Fundamentals of Software.
    • Related: Top 5 Java programming books – Best of lot (actually 8) – 1) Head First Java, 2) Effective Java, 3) Thinking in Java, 4) Head First Design Pattern, 5) Concurrency Practice in Java, 6)Java performance, 7) Java Puzzlers, 8) Head First Object Oriented Analysis and Design.
  • Humans as slaves of chemistry: America’s Real Criminal Element – Lead – a fascinating article about how whole nations can be seriously influenced by a single chemical substance. Aside of that it is also fascinating to observe how we tend to search for causes in our domain of expertise (police, sociologists, …) and of interest while denying other possible causes, no matter how strong are the proofs. If the facts presented are true, then the fivefold increase in serious crimes in (not only) America since 60s has been caused by the increase of lead in the environment (pushing many people over the edge of ocassional violent loss of control). How many social problems in the world have similar industrial causes? Are we careful enough with what we let into our air and bodies?

Languages

  • newcoder.io: Learning more Python via projects – an excellent next step when you have learned Python syntax via LPHW or similar; in this tutorial series you will be building real-world apps while learning more of Python. You will play with, Data Visualization, APIs, Web Scraping, Networks, GUI.
  • Brian McCallister: Go is PHP for the Backend – a very good explanation why you might want to use Go and that you have to first learn “the Go way” to avoid insanity, since it is very opinionated and different from what you might be used to. Some pros: “native code, UNIX friendly, higher level then C, lower level then Python or Ruby, garbage collected, strongly typed, good performance, good concurrency support, etc.”
  • The Neophyte’s Guide to Scala 1 to 15 (list) – a good follow-up on the Cursera FP in Scala course, a series of blog posts exploring some topics more in depth. F.ex.: extracotrs (unapply, for pattern-matching), the broad applicability of pattern matching, pattern matching anonymous functions & partial functions #4, usiong Option idiomaticly #5, nice FP error handling with the Try type #6, Futures, etc. Higly recommended! Thx to Jakob Lind

Libs

  • Jolokia is remote JMX with JSON over HTTP: a REST API bridged to JMX, with support for security, fine-grained access control, bulk operations. Especially useful if you either 1)  need to perform bulk operations (e.g. get multiple values) or 2) want to access them from something that doesn’t support JMX. JSON is in general very easy to use and navigate. You can install Jolokia as a WAR (or mebedd its Servlet), a JVM agent, or attach it on-the-fly to a running JVM.
  • The Appeal and Controversy of ZeroMQ – why to use 0MQ? It is a messaging library that focuses on performance, decentralization and simplicity, solving some really hard problems (sending async. messages w/o locks, distribuing to specific subscribers) and providing a simple API. Main pros: decentralized (no central broker), many languages; cons: no security (but you can use it over SSH).

Talks

  • Tim O’Reilly: Create More Value Than You Capture (30 min + questions) – build apps that matter, that change how we do things. Thinking just about money is bad. Try to make the society better, smart, create more value than you capture, solve important problems, help people. Ex. startups: Uber, Square, Code for America.
  • TED: Bruce Feiler: Agile programming — for your family (20 min) – an inspirational talk, based on positive experience from multiple families, about applying the agile thinking and values to make our families happier by empowering the children (enlist them in their upbringing, deciding on goals, rewards, punishments), letting them know who they are, being adaptive, having regular “retrospectives” (that eventually become cherrished memories). Backed by research. Did you know that the #1 wish of children isn’t that parents spend more time with them but that they are less stressed?

Clojure Corner

  • What’s new in Clojure 1.5.x – reducers, new threading macros (cond->, as->, some->, ..), various improvements, improved performance, erro messages, doc strings, bug fixes
  • Stuart Sierra: On the Perils of Dynamic Scope – summary: don’t create macros like with-connection binding to a thread-local var; make all methods take the resource as a parameter – thus the user has the freedom to decide when to close the resource and isn’t limited to a single thread and can use lazy sequences
  • Logic programming is overrated – core.logic is essentially only a complex DSL for doing an exhaustive search and there is already a nice, clean tool for that: the for comprehension. A logic puzzle can be much more clearly and also efficiently using for. But it is not completely useless – logic programming is good e.g. for running programs backwards, unification is important for writing type checkers, and the new constraint programming piece has good potential. Read also Logic Programming is Underrated, which provides a faster core.logic solution than for-comprehension and provides some pointers rgarding the practical usefulness of core.logic.
  • Prismatic – Graph: Abstractions for Structured Computation – How to reduce the complexity overhead in large, real-world, FP systems by decoupling what is done from how it is executed. Graph is a Clojure library enabling a declarative way to describe how data flows between (mostly pure) functions => “It allows us to formalize the informal structure of good FP code, and enables higher-order abstractions over these structures that can help stamp out many persistent forms of complexity overhead.” By decoupling the description of how data flows and the actual execution, we can execute it in different ways (parallelized, with memoization, lazy/eager) and apply various interceptors (for logging etc.). See especially the part “Graph and complexity overhead.”
  • Mike Anderson: Game development in Clojure : Alchemy 7DRL post-mortem (and the previous 7 daily updates, Alchemy @ GitHub) – an interesting report about game making in Clojure during 7 days, in as functional and immutable style as possible while keeping it sufficiently fast. How do you represent & handle statuf game objects, the world map, game state? The design of the game, what was easy and what hard with Clojure. Tl;dr: search it for “Some parting thoughts” (Clojure productive, immutability hard but pays off, prototype objects great, more typing would have helped). “Making everything immutable in Clojure is harder than it would have been in an OOP language like Java where everything can be encapsulated in mutable classes. In particular, the state update functions are tricky to make both correct and performant. The payoff is big however: in terms of the simplicity and effectiveness later on, and in the conceptual clarity being able to treat the entire game state as an immutable value”. Having REPL is a big win.
  • Refactoring Java using Clojure with the Eclipse Java development tools (JDT) (operation on AST nodes, i.e. little too low level; the Eclipse Refactoring API might be better)
  • Clojure at a Bank – [Our] Clojure Code Immaturity – experiences with going from Java to Clojure: 1) too few comments, too short names => hard to learn the code; 2) not knowing clojure.core well enough => reimplementing (if-let, juxt, …); 3) structure, comment, split up your namespaces well, navigating more complicated then in Java IDEs; 4) reasonably used Macros, Protocols, Defrecords payed off;
  • Datomic for Five Year Olds – explaining the key characteristics of Datomic compared to relational and NoSQL DBs (schema, architecture, programmability/language); doesn’t go into details of how it works (e.g. how does Datomic determine what subset of the DB to cache locally and what if it is few GBs); Honey Badger’s 2012 talk Exploring Datomic: a database deconstructed explores the architecture and technical details much more

Tools

  • Vagrant 1.1.0 is out (what’s new?), with support for VMWare Fusion and AWS VM backends in addition to VirtualBox – use the same config to create, provision, stop, destroy and connect to a virtual machine locally or in the cloud (with limited support for shared folders, I’d suppose). V. 1.1 is backwards compatible aside of plugins, upgrade to new config optional.
  • Animated presentations: ArtRage (drawing program, also for iPad), Wacom Intuos 5 (drawing tablet), Screenflow (screen & audio capture) – used for the Agile in a Nutshell (Agile Product Ownership in a Nutshell) mentioned above
  • ckjm — Chidamber and Kemerer Java Metrics (via Neal Ford) – a command-line tool (also Maven/Ant plugin) to compute some metrics, outputting text or XML for further processing; the metrics: WMC: Weighted methods per class (cyclomatic complexity of its methods), DIT: Depth of Inheritance Tree, NOC: Number of Children, CBO: Coupling between object classes, RFC: Response for a Class, LCOM: Lack of cohesion in methods, NPM: Number of Public Methods, Ca: afferent coupling.
  • Bulletproof Demos: Record & Replay built into Chrome – ever got a failure while demonstrating a web app though it has worked moments ago? No more! You Chrome to record your requests and responses and let its cache handle them during the real demonstration. (Mac: stop Chrome, to record run open -a “Google Chrome” –args –record-mode, to replay run open -a “Google Chrome” –args –playback-mode. Linux: google-chrome –record-mode and –playback-mode. Win.: run chrome <arg>)
  • UserTesting.com (via Ash Maurya, the author of Running Lean) – on-demand usability testing; they have a large base of test users, can select those matching your criteria and unleash them upon your site guided by a script your provide, watch videos of their actions while they verbalize their thinking process, recieve written answers from them, talk to them.
  • MindMup.comopensource, free mind-mapping in the cloud by Gojko Adzic & co., with main focus on productivity. Store private maps in Goolge Drive, support for mobile devices, keyboard shortcuts. No registration needed.

Favorite Quotes

Once we estimated a project to require 9 man-months but were later told that we do not understand a thing and it may not take more then 4. At the end it took over 25 and still wasn’t done.

- paraphrasing my collegue Kim Leskovski

On collecting requirements up front:

At the very beginning, we know less about our project than we’ll ever know again. This is the worst possible moment to be making firm decisions about what we “require.”

- Ron Jeffries in Estimation is Evil: Overcoming the Estimation Obsession

On the estimate of project delivery date at its initial phase:

It’s based on an unrealistic list of requirements, using weak estimates, made at the moment of maximum ignorance, by people who are always optimistic about their own abilities.
- ibid

On planning and requirements (the Chrysler’s C3 payroll project, having a payroll expert and a team familiar with the domain):

This was one of the best-planned projects I’ve ever seen, and even so, at least one third of the requirements were added, removed, or substantially changed.
- ibid

[..] a line of code is a liability, not an asset [..]
- Jez Humble in Why Software Development Methodologies Suck

Agile is not something you do, it is something you are.
- Huib Schoots in Creating my own flow
with personal kanban, Agile Record Feb 2013

Posted in General, SW development, Tools, Top links of month | Tagged: , , , , , , , , , , , , , , , , | Comments Off

Books Our Developers Should Read

Posted by Jakub Holý on March 12, 2013

Republished from blog.iterate.no with the permission of my co-author, Morten Berg, and later updated.

There are a few books that every developer in Iterate should read because they express what we believe in and are extremely valuable in themselves. The books chosen are generally and broadly useful and not tied to some too limited domain (contrary to e.g. Effective Java).  The list is kept as short as possible, about 4-5 books, and is revised regularly.

Why particularly these books, why lean and agile? Our people are primarily responsible for crafting solutions for our clients, for making sure that they use the customers’ limited resources efficiently to produce the maximal business value possible. However, according to our experience, it is never truly known upfront where that value lies. Software development is therefore inherently a learning and exploration process. A process that needs to be continually adjusted based on empirical feedback from the reality and on shifting conditions. This is what lean is about: eliminating waste, maximizing value by maximizing learning, making sure that the right product is built. We value software craftmanship and building things right – but building the right things is crucial.

Here are the books and why we believe they are so important.

Read the rest of this entry »

Posted in General | Tagged: , | 2 Comments »

Recommended Book: Real World Java EE Night Hacks by Adam Bien

Posted by Jakub Holý on August 13, 2012

Real World Java EE Night Hacks – Dissecting the Business Tier, Adam Bien, 2011, ISBN 9780557078325.

I highly recommend this very thin and down-to-the-earth-practical book to everybody interested in back-end Java (a basic knowledge of servlets, Java ORM, and REST might be useful). The book evolves around a small but realistic project (X-Ray), which we follow from the inception through couple of iterations til the end. Bien shows us how lean Java EE can be, how to profit from the functionality that it offers, and how these pieces of functionality fit together to deliver something useful. He actually introduces a complete Java EE development environment including continuous integration (CI), code quality analysis, functional and stress testing.

Some of the things that I appreciate most in the book is that we can follow author’s decision process with respect to implementation options (such as SOAP vs. REST vs. Hessian etc., a REST library vs. sockets, or when (not) to use asynchronous processing) and that we can see the evolution of the design from an initial  version that failed through cycles of growing and refactoring and eventually introducing new technologies and patterns (events, configuration over convention) to support new and increased requirements. Read the rest of this entry »

Posted in General, j2ee, Languages | Tagged: , , , , | Comments Off

Book Review: Implementation Patterns

Posted by Jakub Holý on July 5, 2012

Implementation Patterns, Kent Beck, 2007, ISBN 0321413091.

Summary: Should you read the book? Yes, the chapter on principles and values is trully enlightening. The book in general contains pearls of wisdom hidden in the mud of “I know that already, man.” I would thus recommend skimming through the book and reading only the pieces matching your level and needs.

The book seems to be targeted a lot at Java beginners (especially the chapter on collections), going into otherwise unnecessary details, yet there are many valuable advises of which some can only be appreciated by somebody with multiple years of professional programming experience. It thus seems to me that the book isn’t a perfect match for anybody but everybody will find there many useful ideas. It would best be split in two.

An experienced developer will already know many of the patterns though it’s perhaps useful to see them named and described explicitly and listed next to each – it helps to be aware and clearer of what you do and why you do it.

I’d absolutely recommend everybody to read the chapter A Theory of Programming, explaining Kent’s style of programming and the underlying key values of communication, simplicity and flexibility as well as the more concrete principles (local consequence, minimize repetition, logic and data together, symmetry, declarative expression, co-locating data and logic having the same rate of change). Also in the rest of the book there are valuable ideas that it would be a pity to miss. I list below some of those that I found particularly interesting.

Posted in General | Tagged: , , , | 1 Comment »

Most interesting links of January ’12

Posted by Jakub Holý on January 31, 2012

Recommended Readings

  • Jeff Sutherland: Powerful Strategy for Defect Prevention: Improve the Quality of Your Product – “A classic paper from IBM shows how they systematically reduced defects by analyzing root cause. The cost of implementing this practice is less than the cost of fixing defects that you will have if you do not implement it so it should always be implemented.” – categorize defects by type, severity, component, when introduced; 80% of them will originate in 20% of the code; apply prioritized automated testing (solve always the largest problem first). “In three months, one of our venture companies cut a 4-6 week deployment cycle to 2 weeks with only 120 tests.”
  • Ebook draft: Beheading the Software Beast – Relentless restructurings with The Mikado Method (foreword by T. Poppendieck) – the book introduces the Mikado Method for organized, always-staying-green (large-scale) refactorings, especially useful for legacy systems, shows it on a real-world example (30 pages!), discusses various application restructuring techniques, provides practical guidelines for dealing with different sizes of refactorings and teams, discusses in depth technical debt and more. To sum it up in three words: Check it out!
  • Daily Routine of a 4 Hour Programmer (well, it’s actually about 4h of focused programming + some hours of the rest) – a very interesting reading with some inspiring ideas. We should all find some time to follow up the field, to reflect on our day and learn from it (kaizen)
  • The Agile Testing Quadrants – understanding the different types of tests, their purpose and relation by slicing them by the axis “business facing x technology facing” and the axis “supporting the team x critiquing the product” => unit tests x functional tests x exploratory testing x performance testing (and other). It helps to understand what should be automated, what needs to be manual and helps not to forget all the dimensions of testing.
  • Adam Bien: Can stateful Java EE apps scale? – What does “stateless” really mean? “Stateless only means, that the entire state is stored in the database and has to synchronized on every request.” “I start the development of non-trivial (>CRUD) applications with Gateway / PDOs [JH: stateful EJBs exposing JPA entities] and measure the performance and memory consumption continuously.” Some general tips: Don’t split your web server and servlet container, don’t use session replication.
  • Brian Tarbox: Just-In-Time Logging – How to remove 90% of worthless logs while still getting detailed logs for cases that matters – the solution is to (1) only add logs for a particular “transaction” with the system into a runtime structure and (2) flush it to the log only if the transaction fails or st. else significant happens with it. The blog also proposes a possible implementation in detail.
  • DZone’s Top 10 NoSQL Articles of 2011
  • DZone’s Top 5 DevOps Articles of 2011
  • Test Driven Infrastructure with Vagrant, Puppet and Guard – this is interesting for me for I’m using Vagrant and Puppet on my project to create and share development environments or their parts and applying test-first approach to it seems interesting as do also the tools, rspec-puppet, cucumber-puppet and Guard (events triggered by file changes) and referenced articels.
  • 5+1 Sonar Plugins you must not miss (2012 version) – Timeline Plugin (with Google Visualization Annotated TimeLine), Useless Code Plugin, SIG Maintainability Model Plugin (metrics Analysability, Changeability, Stability, Testability), Quality Index Plugin (1-number health indicator), Technical Debt Plugin

Links to Keep

Clojure Corner

  • ClojureScript One Guide – “ClojureScript One shows you how to use ClojureScript to build single-page, single-language applications in a productive, effective and fun way.”
  • Asynchronous workflows in Clojure – true asynchronous (non-blocking) network access in Clojure with Netty/the Lamina project.
  • Clojure 2011 Year in Review – a list with important events in the Clojure sphere with links to details – C. 1.3.0, ClojureScript, logic programming with core.logic, clojure-contrib restructuring, birth of 4Clojure and Avout.
  • Clojure Atlas – interesting project (alpha version) presenting Clojure documentation in the form of interactive graph of related concepts and functions; it’s far from perfection but I like the concept and consider paying those ~ $25 for the 1.3.0 version when its out (however, the demo is free and it might become open-sourced in 2012)

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Book Review: Agile Project Management With Scrum

Posted by Jakub Holý on November 7, 2011

A review of and extract from Agile Project Management With Scrum by Ken Schwaber, Microsoft Press 2003, ISBN 0-7356-1993-X.

The book is basically a set of case studies about Scrum that show how to implement the individual aspects of Scrum, what are the common pitfalls and how to avoid them, and help to understand its mantra of “the art of the possible” and how to adapt Scrum to various situations. It’s very easy to read thanks to the case studies being brief and organized by topics (team, product owner, …). I’d absolutely recommend it as a third book in this domain, after a general introduction into the lean thinking (Implementing Lean Software Development – From Concept to Cash by M. & T. Poppendieck is great for that) and an introduction into Scrum itself. Scrum is not just a set of practices, it requires an essential shift in thinking. Thus it is not enough to learn about the practices – you have to learn, understand, and accept the principles behind. This book will hopefully help you to refine your understanding of these principles.

Extract

This extract contains the quotes and observations that I find the most interesting. It tries by no means to be objective or representative, a different person with a different experience and background would certainly pick different ones. Thus its value for others than myself is rather limited but it may perhaps serve as an inspiration to read the book. My all favourite quotes are in italics. Read the rest of this entry »

Posted in General | Tagged: , , , | Comments Off