The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘legacy’

A Secret Weapon Against Technical Debt

Posted by Jakub Holý on February 2, 2014

(Cross-posted from blog.iterate.no.)

Technical debt is not the only monster we have to fight – it has a hidden evil twin,  as pointed out by Niklas Björnerstedt: Competence Debt. The rope of ignorance that binds our hands and suffocates us by fear so that we don’t dare to change the system. Technical debt makes change difficult because the structure of the system does not support it. Competence debt makes change difficult because we do not know the system well enough, what & where to change and what impacts a change may have.

There is an often neglected tool at our fingertips that might help us fight competence debt. Its name is – behold – JavaDoc. An example from practice: I have returned to a client after two years and needed to understand the functionality of a part of the system. And, to my surprise, I found my own JavaDoc providing exactly the answer I was looking for. A colleague of mine mentioned that I should get the award for the “most documenting developer”. But I don’t do it for fun or just to help my bad memory and to be nice to my colleagues. It is an important contribution to the fight against the ever growing hydra of legacy code. Next time you code, try to remember that you are not typing code, but fighting. As every fight, it is hard – but do not give up or the enemy will prevail.

Side note: Writing JavaDoc that helps yet is not too verbose and not too likely to get outdated soon is hard. Getting the right balance – neither too little nor too much, focusing on the why and the broader context and relationships instead of the changing implementation etc. is difficult. But it is worth it. Help yourself, help your fellow colleagues, strike the hydra. Write good JavaDoc.

PS: This post is about competence debt even though the title mentions technical debt, sorry for the confusion (even though they are two sides of the same coin). And “good enough” documentation, though important, is not the sole remedy, as well as developers’ lack of knowledge is not the sole cause of competence debt. Also, Niklas provides a more in-depth review of the technical & competence debt terms in Misunderstanding technical debt  (tech. debt as evolving understanding [but not code] and crappy code). He wrote also A deeper look at Competence debt.

Posted in General | Tagged: , | Comments Off

Bad Code: Are We Thinking Too Little?

Posted by Jakub Holý on December 31, 2013

Do we not think enough when coding? Do we jump to the first solution without really considering the problem, without trying to analyze and decompose it and understand the components and orthogonal forces invovled? Is that the cause of bad code (together with time press) and the reason why we typically see a “patchvolution” rather than evolution (of design)?

For example I want a certain item on my list shown grayed out because it has been marked for removal or is currently being edited and I therefore add a flag called isDisabled. But if I really thought about it, I would likely call it based on the purpose rather than display, e.g. isBeing Edited. And I have often observed that I/we tend to jump to the first acceptable solution without trying to consider other, (radically) different and perhaps better alternatives. That is easily explained with our inborn intelectual laziness and we certainly can agree that we should not overthink things and that we need to ship but still, shouldn’t we try to think a little more?

The Clojure community has been very inspiring for me in this regard. There is a strong focus on spending more time on the problem than the solution to really understand it, and on separating the different concerns involved and adressing them separately, as well as on achieving simplicity. One of the manifestation is the strong preference of small, focused, composable libraries over frameworks. F.ex. it took couple of years for Clojure to get support for named arguments – but the result – destructuring – is something much more powerful, that now pervades the whole languages (of course, this is a language, not an app). When you listen to Rich Hickey talking f.ex. about core.async (vs. actors, Reactive Extensions etc.) you see that the man thought deeply about the problem, alternatives, and their pros and cons.

May be we should spend little more time with our problems before jumping to solutions, no matter how much we like to solve things. Perhaps we would end up with a much better code than we typically have, and thus considerably lower maintenance costs.

Ref: Simple Made Easy, The Clojure Philosophy from Joy of Clojure.

Posted in General | Tagged: , , | Comments Off

Most interesting links of November ’13

Posted by Jakub Holý on November 30, 2013

Recommended Readings

Some interesting topics this time despite me spending lot of time on the Principles of Reactive Programming course: Java x Node.js, REST x other future-proof architectures, scary legacy code. Of course, also plenty of Clojure.

People, organizations, teams, development:

  • Chris Argyris (1923-2013): An Appreciation – Thinkers 50 – recently departed Ch. Argyris is a person whose work you should know, if a bit interested in learning and organizations and how they (dis)function; and since we all work in organizations and want our work to be pleasant, this means all of us. We all want to work in orgs that do double-loop learning, i.e. they actually evovle as they learn. “Argyris argued that organizations depend fundamentally on people and that personal development is and can be related to work.” Now stop and go read it!
  • Bob Marshall: The Antimatter Principle – “the only principle we need for becoming wildly effective at collaborative knowledge work” – can be summarized as “attend to folks’ needs” (importantly, including your own) => find out what people actually need, interpret their behavior (including anger, seemingly irrational or stupid requests etc.) in terms of needs; mastering this will make you excell in communication and effective work. Read the post to find out more.
  • It’s a state of mind: some practical indicators on doing agile vs. being agile – are you agile or are just “doing agile”? Read on ti find out, if you dare! F.ex. “Non Agile teams have a process that slows the review of the changes.” Cocnlusion: “An Agile mindset is just that – a state of mind, a set of values. Its a constant questioning, and an opening up to possibilities. Its a predisposition to produce great things.
  • Johannes Brodwall:  Humble architects – how to avoid being an architect that does more harm than good, as so many out there? Some tips: Don’t assume stupidity, Be aware that you might be wrong, Be careful with technology (i.e. simplicity beats everything; applies so much to us developers too!), Consistency isn’t as important as you think (or beware context), Tactical reuse in across systems is suboptimization (i.e. reuse has a cost), separate between (coding) rules and dogma (i.e. is that way unsafe, incomprehensible, or just a heresy w.r.t. a dogma?) Very valuable insights into creating good technical solutions and teams that work.
  • Liz Keogh’s The Dream Team Nightmare: a Review – a very good review of this adventure-style book about coaching “agile” teams through (around?) common pitfalls, provides a good base for deciding whether you shall read the book (Liz essentially says yes)
  • Fibonacci Kittens: One Idea One Commit – a short story of coming from biannual releases to frequent release of individual features; I link to this primarily to spread optimism, if this company managed it then, perhaps, we other can too?
  • The Eternal Struggle Between Business and Programmers – “Business: More features! Now! Programmers: More refactoring! Now!” How can we resolve this eternal conflict of needs? This post reveals how the two parties can find a common ground and mutual understanding (beware, everybody must give up on something) and thus work together rather than against each other.

Coding, architecture, legacy

  • Why the future is NOT RESTful – always refreshing to read something against the mainstream; “REST is not fit for the next generation of smart client applications because it has not been designed for smart clients.” According to the author, a smart client app stack needs: “1. persistence (storage and query), 2. documents/orm (conversion to tree-like datastructures), 3. data authorization (once authenticated), 4. pub/sub (websocket communications), 5. client db (client-side caching and querying), 6. templating (presentation level)” Meteor.js has nearly all but #3 thanks to mongodb (1+2), dpp (4), mongo on the client (5), spark (6). The author considers a similar but Clojure-based stack (with Datomic, Angular etc.) and looks at authorization possibilities. “Secured, personalised, CRUD operations are the future to a more simplified web.” We may agree or not but it certainly is worth reading.
  • Michael Feathers (of Working Effectively With Legacy fame): Unconditional Programming – “Over and over again, I find that better code has fewer if-statements, fewer switches, and fewer loops. Often this happens because developers are using languages with better abstractions. [..] The problem with control structures is that they often make it easy to modify code in bad ways.” Includes a nice example of replacing if-else with much more readable and safer code.
  • The Quality of Embedded Software, or the Mess Has Happened – an interesting and scary read about terrible spaghetti code (and hardware) that makes some Toyotas to accelerate when the driver tries to break; 11,000 global variables, the key function showing so high cyclomatix complexity that “makes it impossible not only to test but even maintain this program in any way.” Then 80k violations of the car industry coding standard cannot surprise. And a safety control that does not work. Interesting that a great manufacturer may have so terrible IT (and Toyota isn’t the only one).
  • The string type is broken – the String type is M. Feathers’ favourite example of a leaky abstraction – most languages fail to process/split/count less common Unicode characters properly, the fact that String is implemented as a series of bytes leaks through (UTF-16 langs like Java); worth reading to be aware of the limitations

Languages

  • Why I’m Productive In Clojure – some interesting points about simplicity, consciousness, interactive development, power without overwhelming fatures, etc. “With it [Clojure] I can always easily derive a solution to a particular problem from a small set of general patterns. [..] However, the number of ways that these concepts can be combined to solve all manner of problems appears to be inexhaustible.
  • Node.js at PayPal – PayPal is switching from Java to Node.js (among others to promote language consistency) and, as a part of that, has implemented the same app in Node and Java; results: Node was done earlier, had less code, performed better (though, as Daniel Kvasnička pointed out, “Comparing Node.js and servlet-based archs is not fair… compare Node with @vertx_project and you’ll get a whole different story ;)”; also, as Charles Nutter said, “The @PayPal numbers for their Java to Node move are absurd. A JVM app doing 1.8 pages/s isn’t slow…it’s broken.“)
  • IBM: Developing mobile apps with Node.js and MongoDB, Part 1: A team’s methods and results – also IBM has implemented the same (REST) app once with Java and DB2, once with Node and Mongo where Node+Mongo required less work and performed better; one of the great wins was having JSON as a native structure everywhere instead of transforming from/to it so Mongo is, in my opinion, an important factor in this particular case
  • Dynamics of Programming: Benefits of Scala in CS1 – reasons for and experiences with using Scala in an introductory computer science course, worth reading; some of the advantages over Java are consistency (.asInt on String and Double vs. casting/parsing, no “primitive” types), REPL with time inference good for learning, functional style enables focus on what rather than how; quite persuasive arguments

Security

  • The New Threat: Targeted Internet Traffic Misdirection – did you know that internet traffic to any site can be made to go through a particular server without anybody noticing? This has been observed repeatedly in the wild, for banks and other sites. Rather make sure you use strong encryption (NSA-approved, of course ;)).
  • A (relatively easy to understand) primer on elliptic curve cryptography – Everything you wanted to know about the next generation of public key crypto – you cannot just read this, you have to study it, which I did not; but it looks good and I guess the time will come when I will come back to it

Other

Clojure Corner

  • Stuart Sierra’s Component – a library for making it easier to implement Stuart’s reloadable workflow; a component is something that can be started, stopped, and depend on other components so that it is easier to do interactive REPL development
  • Logan Linn: Clojure/conj 2013 – a pretty good overview of the conference
  • Caribou – “the kernel of usefulness that has emerged from years of this basic practice“- a new Clojure web framework – seems to be interesting
  • Results of the 2013 State of Clojure & ClojureScript survey and drill-down into what features people want – the most interesting fact is how many more participants use Clojure in production than last year and perhaps also the relatively wide adoption of Datomic among the respondents. Light Table has become the 3rd most popular dev. env., after Emacs and Vim. Some of the most mentioned language features requested were types (=> core.typed, Prismatic’s Schema), better error reporting (=> slingshotdireclj-stacktraceio.aviso:pretty, etc.), debuger (though progress is being made)
  • Book: Clojure High Performance Programming
  • Improving Clojure Feedback : Stack Traces – making Clojure stacktraces more usable by filtering out noise and linking to relevant content – io.aviso:pretty,  io.aviso:twixt
  • Clojure Dev discussion: Hashing strategies – Executive summary – “In Clojure, however, it is far more common than in Java to use longs, vectors, sets, maps and compound objects comprised of those components (e.g., a map from vectors of longs to sets) as keys in other hash maps.  It appears that Java’s hash strategy is not well-tuned for this kind of usage.  Clojure’s hashing for longs, vectors, sets, and maps each suffer from some weaknesses that can multiply together to create a crippling number of collisions.” Ex.: An implementation of N-Queens took forever, spending most time on hash collisions. But be calm, smart Clojurians are working on a solution.
  • Datomic Pro Starter Edition – Datomic with all storages, Datomic Console, a year of updates, full Datomic programming model; limitations: no HA transactor, no integrated memcached, max 2 peers and 1 transactor

Tools/Libs

  • AirPair.com – a new site that enables developers to get help from other devs via remote pairing, code review, code mentoring etc. – a good opportunity to get help / help others (and earn something); I haven’t tried it but it sounds pretty interesting

MongoDB web stacks

  • Meteor: JS frontend + MongoDB backend with changes in the DB pushed live to the clients, i.e. MongoDB is used both as the “application server” and storage. It seems great for apps where users need to collaborate in real-time with each other, certainly great for quick proof of concepts etc.; worth checking out; it also comes with free (at least for start?) hosting so really good for prototyping – “an open-source platform for building top-quality web apps in a fraction of the time.” The intro screencast will give you a good overview (10 min).
  • Mean.io – MEAN (Mongo, Express, Angular, Node) stack Boilerplate – frontend, backend and storage using the same language and some of the most popular technologies (not that popular = best fit for you :)); it seems to be very new but since it just glues together 4 popular and documented technologies, that should not be an obstacle. There is an intro on the MongoDB blog.

Other

  • BusyBox (get latest win binary from Tigress.co.uk) – reportedly a better POSIX env for Windows than gow, Cygwin, et al.

Favourite Quotes

[..] no organization should exist unless it is “of service” to its employees, its customers, its community.
- @Tom_Peters 28/11

I hope you’ll agree that there is a certain amount of irony involved in having to write repetitive code
Dmitri Sotnikov in Why I’m Productive In Clojure

Happy teams are productive teams but:

Morale is 95% a function of the prevailing system (the way the work works). Which in turn is a function of the prevailing collective mindset
@flowchainsensei Nov 10th

Posted in General, Languages, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Most interesting links of October ’12

Posted by Jakub Holý on October 31, 2012

Recommended Readings

  • David Veksler: Some lesser-known truths about programming – things newcomers into the field of IT don’t know and don’t expect, true and an interesting read. Not backed by good data but anyway. F.ex.: “[..] a programmer spends about 10-20% of his time writing code [..] much of the other 90% thinking, researching, and experimenting”. “A good programmer is ten times more productive than an average programmer. A great programmer is 20-100 times more productive than the average [..]” “Bad programmers write code which lacks conceptual integrity, non-redundancy, hierarchy, and patterns, and so is very difficult to refactor.” “Continuous change leads to software rot, which erodes the conceptual integrity of the original design.” “A 2004 study found that most software projects (51%) will fail in a critical aspect, and 15% will fail totally.”
  • Brett L. Schuchert: Modern Mocking Tools and Black Magic – An example of power corrupting – interesting for two reasons: a good analysis of a poorly written piece of code and discussion of the code injection black magic (JMockIt) vs. actually breaking dependencies to enable tests.  The author presents a typical example of low-quality method (mixing multiple concerns, mixing different levels of abstractions, untestable due to a hardcoded use of an external call) and discusses ways to improve it and to make it testable. Recommended to read.
  • It’s Not About the Unit Tests – Learning from iOS Developers: iOS developers don’t do much testing yet they manage to produce high quality. How is that possible? The key isn’t testing itself, but caring for the code. (Of course, iOS is little special: small apps, no legacy, a powerful platform that does lot for the apps, very visual apps.) “It’s not about the practices. It’s about the spirit and intent behind them, and how they are applied.” (M. Fowler had a similar observation about a team that used mock-based testing exclusively and thus lacked integration tests yet all worked. [I’ve lost the link to the post and would be grateful for it])
  • Java Code Quality Tools – Overview – brief descriptions of 44 quality-related tools including some interesting tools and Eclipse plugins I didn’t know or knew but forgot. F.ex. analysis of dependencies with JBoss Tattletale or JarAnalyzer, Clirr to check libraries for source and binary backwards compatibility, JDiff generates JavaDoc-based report of removed/added/changed in an API. Spoon – read and check or transform Java code. Java PathFinder (NASA) – special JVM capable of checking all execution path to discover concurrency defects etc.

Tools

  • DirB, Directory Bookmarks for Bash (home) – moving efficiently among favourite directories (s <name> to create a bookmark for pwd, g <bookmark | relative/abs dir path> to enter a dir (=> works both for bookmarks and as a replacement for cd); also support for relative path bookmarks & more; sl lists bookmakrs in the last used order) (You might also want to check out Autojump, described in Dec 11; bashmarks is another similar project. Another similar project is rupa’s z and j2 and the fish clone z-fish)

Clojure Corner

  •  Jon Pither: Clojure at a Bank – Moving from Java –  the justification (productivity, dynamism, FP a better match for the domain) and process behind moving from Java to Clojure with a monolithic 1M LOC Spring/Hibernate app. (Random quotes: “I had used some dynamical languages before and it was quite obvious that we were essentially forcing lots of schema and type definition on to a problem domain that just didn’t want or need it.” “[..] it [dependency injection] just looks redundant in retrospect now that I’m working 95% with FP code.”) There is also a EuroClojure talk about their experiences one year later (35 min).
  • Prismatic’s “Graph” at Strange Loop – an interesting desing problem, its solution, and a resulting OSS library. The problem: How to break a large function into independently usable small ones that might depend on each other without ever needing to recompute a value once the function producing is called. The solution: Graph – “Graph is a simpledeclarative abstraction to express compositional structure.” (Enabling explicit declaration of data dependencies and pluging in different implementations.)
  • The Oblong: Blog about 2/3 D game programming in Clojure, starting from scratch (w/o an engine); interesting experiences
  • Ironclad: Steam Legions – Clojure game development battle report (the game on Github)
  • Building the Wishlisted.org webapp in Clojure – experiences from learning Clojure for real by building a webapp in Noir
  • Clojure vs. Scala smackdown (“Just kidding with the title of this post :-)”) – a short post with interesting discussion. Dmitri Sotnikov’s opinion resonates with me: “I found that for me Clojure wins on simplicity and consistency. While it looks more alien initially, once you learn the basics, you just reuse the same patterns everywhere.” Some more comments: “One major concern was maintainability, since it’s fairly easy to write very dense code. This turned out to not be a problem in practice. Because Clojure code is written as a tree, refactoring it is very easy.” REPL seems to be a big win (applies to Scala too). Scala’s type system might get tedious and learning its quirks takes time but there is lot of potential and both have they strong sides.
  • Code Fatigue – discussion of the advantages of learning, using, and combining the (many) standard Clojure functions instead of a “basic solution” using recursion etc. The argument is in favor of higher-level code with less complexity in the form of branching, recursion, nested expressions etc. and thus less mental fatigue.

Favorite Quotes

A classic test only cares about the final state – not how that state was derived. Mockist tests are thus more coupled to the implementation [emphasis mine] of a method. Changing the nature of calls to collaborators usually cause a mockist test to break.

- Martin Fowler in his classical Mocks Aren’t Stubs

I’m afraid of code. When I see a big pile of code, I get scared ;-). Some classes and method make me cry. I had troubles explaining why I prefer short pieces of code keeping the same level of abstraction, cohesive and loosely coupled. The following quote captures the essence – improved communication.

One way to improve communication is to reduce the need for it and the same can be said for code. […] Since we tend to read code more than write it, anything we can do to reduce the need to read code is time well invested in the life of a project.

- Brett L. Schuchert in Modern Mocking Tools and Black Magic – An example of power corrupting justifying extraction of code into another class or method

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , | Comments Off

Help, My Code Isn’t Testable! Do I Need to Fix the Design?

Posted by Jakub Holý on September 9, 2012

Our code is often untestable because there is no easy way to “sense1” the results in a good way and because the code depends on external data/functionality without making it possible to replace or modify these during a test (it’s missing a seam2, i.e. a place where the behavior of the code can be changed without modifying the code itself). In such cases the best thing to do is to fix the design to make the code testable instead of trying to write a brittle and slow integration test. Let’s see an example of such code and how to fix it.

Read the rest of this entry »

Posted in Testing | Tagged: , , , | Comments Off

Most interesting links of January ’12

Posted by Jakub Holý on January 31, 2012

Recommended Readings

  • Jeff Sutherland: Powerful Strategy for Defect Prevention: Improve the Quality of Your Product – “A classic paper from IBM shows how they systematically reduced defects by analyzing root cause. The cost of implementing this practice is less than the cost of fixing defects that you will have if you do not implement it so it should always be implemented.” – categorize defects by type, severity, component, when introduced; 80% of them will originate in 20% of the code; apply prioritized automated testing (solve always the largest problem first). “In three months, one of our venture companies cut a 4-6 week deployment cycle to 2 weeks with only 120 tests.”
  • Ebook draft: Beheading the Software Beast – Relentless restructurings with The Mikado Method (foreword by T. Poppendieck) – the book introduces the Mikado Method for organized, always-staying-green (large-scale) refactorings, especially useful for legacy systems, shows it on a real-world example (30 pages!), discusses various application restructuring techniques, provides practical guidelines for dealing with different sizes of refactorings and teams, discusses in depth technical debt and more. To sum it up in three words: Check it out!
  • Daily Routine of a 4 Hour Programmer (well, it’s actually about 4h of focused programming + some hours of the rest) – a very interesting reading with some inspiring ideas. We should all find some time to follow up the field, to reflect on our day and learn from it (kaizen)
  • The Agile Testing Quadrants – understanding the different types of tests, their purpose and relation by slicing them by the axis “business facing x technology facing” and the axis “supporting the team x critiquing the product” => unit tests x functional tests x exploratory testing x performance testing (and other). It helps to understand what should be automated, what needs to be manual and helps not to forget all the dimensions of testing.
  • Adam Bien: Can stateful Java EE apps scale? – What does “stateless” really mean? “Stateless only means, that the entire state is stored in the database and has to synchronized on every request.” “I start the development of non-trivial (>CRUD) applications with Gateway / PDOs [JH: stateful EJBs exposing JPA entities] and measure the performance and memory consumption continuously.” Some general tips: Don’t split your web server and servlet container, don’t use session replication.
  • Brian Tarbox: Just-In-Time Logging – How to remove 90% of worthless logs while still getting detailed logs for cases that matters – the solution is to (1) only add logs for a particular “transaction” with the system into a runtime structure and (2) flush it to the log only if the transaction fails or st. else significant happens with it. The blog also proposes a possible implementation in detail.
  • DZone’s Top 10 NoSQL Articles of 2011
  • DZone’s Top 5 DevOps Articles of 2011
  • Test Driven Infrastructure with Vagrant, Puppet and Guard – this is interesting for me for I’m using Vagrant and Puppet on my project to create and share development environments or their parts and applying test-first approach to it seems interesting as do also the tools, rspec-puppet, cucumber-puppet and Guard (events triggered by file changes) and referenced articels.
  • 5+1 Sonar Plugins you must not miss (2012 version) – Timeline Plugin (with Google Visualization Annotated TimeLine), Useless Code Plugin, SIG Maintainability Model Plugin (metrics Analysability, Changeability, Stability, Testability), Quality Index Plugin (1-number health indicator), Technical Debt Plugin

Links to Keep

Clojure Corner

  • ClojureScript One Guide – “ClojureScript One shows you how to use ClojureScript to build single-page, single-language applications in a productive, effective and fun way.”
  • Asynchronous workflows in Clojure – true asynchronous (non-blocking) network access in Clojure with Netty/the Lamina project.
  • Clojure 2011 Year in Review – a list with important events in the Clojure sphere with links to details – C. 1.3.0, ClojureScript, logic programming with core.logic, clojure-contrib restructuring, birth of 4Clojure and Avout.
  • Clojure Atlas – interesting project (alpha version) presenting Clojure documentation in the form of interactive graph of related concepts and functions; it’s far from perfection but I like the concept and consider paying those ~ $25 for the 1.3.0 version when its out (however, the demo is free and it might become open-sourced in 2012)

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Refactoring Spikes as a Learning Tool and How a Scheduled Git Reset Can Help

Posted by Jakub Holý on November 21, 2011

To learn how complex your code base really is and how much effort a particular refactoring might require compared to the initial expectations, follow these steps:

  1. Schedule git reset --hard; git clean -fd to run in 1 hour (e.g. via cron)
  2. Do the refactoring
  3. WT*?! All my changes disappeared?!” – this experience indicates the end of the refactoring :-)
  4. Go for a walk or something and think about what you have learned about the code, its complexity, the refactoring
  5. Repeat regularly, f. ex. once every week or two – thus you’ll improve your ability to direct the refactoring so that you learn as much as possible during the short time

Read the rest of this entry »

Posted in General | Tagged: , , , , | Comments Off

Most interesting links of July

Posted by Jakub Holý on July 31, 2011

Recommanded Readings

  • Martin Fowler, M. Mason: Why not to use feature branches and prefer feature toggles instead, when branches can actually be used (video, 12min) – feature branches are pretty common yet they are a hindrance for a good and stable development pace due to “merging hells”. With trusted developers, feature toggles are a much better choice.
  • M. Fowler: The LMAX Architecture – Martin describes the innovative and paradigm shaking architecture of the high-performance, high-volume financial trading platform LMAX. The platform can handle 6 million orders per second – using only a single java thread and commodity hardware. I highly recommend the article for two reasons: First, it crashes the common view that to handle such volumes you need multithreading. Second, for the rigorous, scientific approach used to arrive to this architecture. The key enablers are: 1) The main processing component does no blocking operations (I/O), those are done outside (in other threads). 2) There is no database – the state of the processor can be recreated by replaying the (persistent) input events. 3) To get further from 10k to 100k TPS they “just” wrote good code – well-factored, small methods (=> Hotspot more efficient, CPU can cache better). 4) To gain another multitude they implemented more efficient, cache-friendlier collections. All that was done based on evidence, enabled by thorough performance testing. 5) The processor and input/output components communicate without locking, using a shared (cyclic) array, where each of them operates on sum range of indexes and no element can ever be written by more than one component. Their internal range indexes do ever only increase so it is safe to read them without synchronization (at worst you will get old, lower value). The developers also tried Agents but found them in conflict with modern CPUs for their require context switch leading to emptying of the fast CPU caches.
    Updated: Martin has published the post titled Memory Image which discusses the LMAX approach to persistence in a more general way.
  • S. Mancuso: Working with legacy code with the goal of continual quality improvement – this was quite interesting for me as our team is in the same situation and arrived to quite similar approach. According to the author, the basic rule is “always first write tests for the piece code to be changed,” even though it takes so much time – he justifies it saying “when people think we are spending too much time to write a feature because we were writing tests for the existing code first, they are rarely considering the time spend elsewhere .. more time is created [lost] when bugs are found and the QA phase needs to be extended”. But it is also important to remember when to stop with refactoring to get also to creating business value and the rule for that is that quality improvements are done only with focus on a particular task. I like one of the concluding sentences: “Constantly increasing the quality level in a legacy system can make a massive difference in the amount of time and money spend on that project.”
  • Uncle Bob: The Land that Scrum Forgot – Scrum makes it possible to be very productive at the beginning but to be able to keep the productivity and continue meeting the expectations that are thus created we need to concentrate on some essential technical practices and code quality. Because otherwise we create a mess of growing complexity – the ultimate killer of productivity. Uncle Bob advices us what practices and how to apply to attain both high, sustainable productivity and (as required for it) high code quality. It’s essential to realize that people do what they are incented to do and thus we must measure and reward both going fast and staying clean.
    How do we measure quality? There is no perfect measure but we can build on the available established metrics – coverage, # (new) tests, # defects, size of tests (~ size of production code, 5-20 lines per method), test speed, cyclomatic complexity, function (< 20) and class (< 500) sizes, Brathwaite Correlation (> 2), dependency metrics (no cycles, in the direction abstraction).
    The practices that enable us to stay clean include among others TDD, using tools like Chceckstyle, FindBugs to find problems and duplication, implementing Continuous Integration.
  • Getting Started: Testing Concurrent Java Code – very good and worthy overview of tools for checking and testing of concurrent code with links to valuable resources. The author mentions among others FindBugs, concurrent test coverage (critical sections examined by multiple threads) measurement with IBM’s ConTest, multithreaded testing with ConTest (randomly tries to create thread interleaving situations; trial version – contact the authors for the full one) and MultithreadedTC (which divides time into “ticks” and enables you to fine-configure the interactions)
  • The top 9+7 things every programmer or architect should know – quite good selection of nine, respectively 7 things from the famous (on-line available) books 97 Things every programmer/architect should know.

Some fun:

  • Beans and noses – J. Spool reveals the First Rule of Consultancy: “No matter how much you try, you can’t stop people from sticking beans up their nose.” In other words, sometimes your clients decide to do some very unwise thing and no amount of reasoning can discourage them from that (quite understandably, as already the way they got to this decision defies logic).  “The only thing I can do in a beans-and-noses situation is wait. Wait until the bean is in its final resting place.” Then you ask the person how it is working for him and “… if sticking a bean deep into their nostril doesn’t meet the very high expectations they’d had, I can now start talking alternative approaches to reaching those expectations.” Already before you can actually ask them about their expectations, in some cases (50:50) this discussion can lead them to realize they could achieve them with an alternative, less painful approach. Now, if you have read up to this point, you clearly have enough time, so go an read the article because it’s really orth it!

Free e-Books

Posted in Languages, Testing, Tools, Top links of month | Tagged: , , , , , , | Comments Off

What I’ve Learned from (Nearly) Failing to Refactor Hudson

Posted by Jakub Holý on April 28, 2011

We’ve tried to refactor Hudson.java but without success; only later have I been able to refactor it successfully, thanks to the experience from the first attempt and more time. In any case it was a great learning opportunity.

Lessons Learned

The two most important things we’ve learned are:

  • Never underestimate legacy code. It’s for more complex and intertwined than you expect and it has more nasty surprises up in its sleeves than you can imagine.
  • Never underestimate legacy code.

And another important one: when you’re tired and depressed, have some fun reading the “best comments ever” at StackOverflow :-). Seeing somebody else’ suffering makes one’s own seem to be smaller.

I’ve also started to think that the refactoring process must be more rigorous to protect you from wandering too far your original goal and from getting lost in the eternal cycle of fixing something <-> discovering new problems. People tend to do depth-first refactoring changes that can easily lead them astray, far from where they actually need to go; it is important to stop periodically and look at where we are, where we are trying to get and whether we aren’t getting lost and shouldn’t just prune the current “branch” of refactorings and return to some earlier point and try perhaps a completely different solution. I guess that one of the key benefits of the Mikado method is that it provides you with this global overview – which gets easily lost when it is only in your head – and with points to roll-back to.

Evils of Legacy Code

Use a dependency injection framework, for God’s sake! Singletons and their manual retrieval really complicate testing and affect the flexibility of the code.

Don’t use public fields. They make it really hard to replace a class with an interface.

Reflection and multithreading make it pretty difficult if not impossible to find out the dependencies of a particular piece of code and thus the impacts of its change. I’d hard time finding out all the places where Hudson.getInstance is invoked while its constructor is still running.

Our Way to Failure and Success

There is a lot of refactoring that could be done with Hudson.java, for it is a typical God Class which additionally spreads its tentacles through the whole code base via its evil singleton instance being used by just about anyone for many different purposes. Gojko describes some of the problems worth removing.

The Failure

We’ve tried to start small and “normalize” the singleton initialization, which isn’t done in a factory method, but in the constructor itself. I haven’t chosen the goal very well as it doesn’t bring much value. The idea was to make it possible to have potentially also other implementations of Hudson – e.g. a MockHudson – but with respect to the state of the code it wasn’t really feasible and even if it was, a simple Hudson.setInstance would perhaps suffice. Anyway we’ve tried to create a factory method and move the initialization of the singleton instance there but at the end we got lost in concurrency issues: there were either multiple instances of Hudson or the application deadlocked itself. We tried to move pieces of code around, but the dependencies wouldn’t have let us do that.

The Success

While reflecting on our failure I’ve come to the realization that the problem was that Hudson.getInstance() is called (many times) already during the execution of the Hudson’s constructor by the objects used there and threads started from there. It is of course a hideous practice to access a half-baked instance before it is fully initialized. The solution is then simple: to be able to initialize the singleton field outside of the constructor, we must remove all calls to getInstance from its context.

Mikado Graph: Hudson Refactoring (click for full size)

The steps can be seen very well from the corresponding GitHub commits. Summary:

  1. I used the “introduce factory” refactoring on the constructor
  2. I modified ProxyConfiguration not to use getInstance but to expect that the root directory will be set before its first use
  3. I moved the code that didn’t need to be run from the constructor out, to the new factory method – this resulted in some, hopefully insignificant, reordering of the code
  4. Finally, I also moved the instance initialization to the factory method

I can’t be 100% sure that the resulting code has the same semantic as far as it matters, for I had to do few changes outside of the safe automated refactorings and there are no useful tests except for trying to run the application (and, as is common with legacy applications, it wasn’t feasible to create them beforehand).

The refactored code doesn’t provide much added value yet but it is a good start for further refactorings (which I won’t have the time to try :-( ), it got rid of the offending use of an instance while it is being created and the constructor code is simpler and better. The exercise took me about four pomodoros, i.e. little less than two hours.

If I had the time, I’d continue with extracting an interface from Hudson, moving its unrelated responsibilities to classes of their own (perhaps keeping the methods in Hudson for backwards compatibility and delegating to those objects) and I might even  use some AOP magic to get a cleaner code while preserving binary compatibility  (as Hudson/Jenkins actually already does).

Try it for Yourself!

Setup

Get the code

Get the code as .zip or via git:

git@github.com:iterate/coding-dojo.git # 50MB => takes a while
cd coding-dojo
git checkout -b mybranch INITIAL

Compile the Code

as described in the dojo’s README.

Run Jenkins/Hudson

cd coding-dojo/2011-04-26-refactoring_hudson/
cd maven-plugin; mvn install; cd ..       # a necessary dependency
cd hudson/war; mvn hudson-dev:run

and browse to http://localhost:8080/ (Jetty should pick changes to class files automatically).

Further Refactorings

If you’re the adventurous type, you can try to improve the code more by splitting out the individual responsibilities of the god class. I’d proceed like this:

  1. Extract an interface from Hudson and use it wherever possible
  2. Move related methods and fields into (nested) classes of their own, the original Hudson’s methods just delegate to them (the move method refactoring should be useful); for example:
    • Management of extensions and descriptors
    • Authentication & authorization
    • Cluster management
    • Application-level functionality (control methods such as restart, updates of configurations, management of socket listeners)
    • UI controller (factoring this out would require re-configuration of Stapler)
  3. Convert the nested classes into top-level ones
  4. Provide a way to get instances of the classes without Hudson, e.g. as singletons
  5. Use the individual classes instead of Hudson wherever possible so that other classes depend only on the functionality they actually need instead of on the whole of Hudson

Learning about Jenkins/Hudson

If you want to understand mode about what Hudson does and how it works, you may check:

Sidenote: Hudson vs. Jenkins

Once upon time there was a continuous integration server called Hudson but after its patron Sun died, it ended up in the hands of a man called Oracle. He wasn’t very good at communication and nobody really knew what he is up to so when he started to behave little weird – or at least so the friends of Hudson perceived it – those worried about Hudson’s future (including most people originally working in the project) made its clone and named it Jenkins, which is another popular name for butlers. So now we have Hudson backed by Oracle and the maven guys from Sonatype and Jenkins, supported by a vivid community. This exercise is based on the source code of the Jenkins, but to keep the confusion level low I refer to it often as Hudson for that is how the package and main class are called.

Conclusion

Refactoring legacy code always turns out to be more complicated and time-consuming than you expect. It’s important to follow some method – e.g. the Mikado method – that helps you to keep a global overview of where you want to go and where you are and to regularly consider what and why you’re doing so that you don’t get lost in a series of fix a problem – new problems discovered steps. It’s important to realize when to give up and try a different approach. It’s also very hard or impossible to write tests for the changes so you must be very careful (using safe, automated refactorings as much as possible and proceeding in small steps) but fear shouldn’t stop you from trying to save the code from decay.

Posted in General, Languages | Tagged: , , | 2 Comments »

Refactoring the “Legacy” Hudson.java with the Mikado Method as a Coding Dojo

Posted by Jakub Holý on April 16, 2011

I’m preparing a coding dojo for my colleges at Iterate where we will try to collectively refactor the “legacy” Hudson/Jenkins, especially Hudson.java, to something more testable, using the Mikado Method. I’ve got the idea after reading Gojko Adzic’s blog on how terrible the code is and after discovering the Mikado Method by a chance. Since a long time I’m interested in code quality and since recently especially in improving the quality of legacy applications, where “legacy” means a terrible code base and likely insufficient tests. As consultants we often have to deal with such application and with improving their state into something easier and cheaper to maintain and evolve. Therefore such a collective practice is a good thing.

The Mikado Method

The Mikado Method, which the authors describe as “a tool for large-scale refactorings”, serves two purposes: Read the rest of this entry »

Posted in General, Languages | Tagged: , , , , | 1 Comment »