The Holy Java

Notes of a passionate Java EE developer

Posts Tagged ‘legacy’

Most interesting links of October ’12

Posted by Jakub Holý on October 31, 2012

Recommended Readings

  • David Veksler: Some lesser-known truths about programming – things newcomers into the field of IT don’t know and don’t expect, true and an interesting read. Not backed by good data but anyway. F.ex.: “[..] a programmer spends about 10-20% of his time writing code [..] much of the other 90% thinking, researching, and experimenting”. “A good programmer is ten times more productive than an average programmer. A great programmer is 20-100 times more productive than the average [..]” “Bad programmers write code which lacks conceptual integrity, non-redundancy, hierarchy, and patterns, and so is very difficult to refactor.” “Continuous change leads to software rot, which erodes the conceptual integrity of the original design.” “A 2004 study found that most software projects (51%) will fail in a critical aspect, and 15% will fail totally.”
  • Brett L. Schuchert: Modern Mocking Tools and Black Magic – An example of power corrupting – interesting for two reasons: a good analysis of a poorly written piece of code and discussion of the code injection black magic (JMockIt) vs. actually breaking dependencies to enable tests.  The author presents a typical example of low-quality method (mixing multiple concerns, mixing different levels of abstractions, untestable due to a hardcoded use of an external call) and discusses ways to improve it and to make it testable. Recommended to read.
  • It’s Not About the Unit Tests – Learning from iOS Developers: iOS developers don’t do much testing yet they manage to produce high quality. How is that possible? The key isn’t testing itself, but caring for the code. (Of course, iOS is little special: small apps, no legacy, a powerful platform that does lot for the apps, very visual apps.) “It’s not about the practices. It’s about the spirit and intent behind them, and how they are applied.” (M. Fowler had a similar observation about a team that used mock-based testing exclusively and thus lacked integration tests yet all worked. [I've lost the link to the post and would be grateful for it])
  • Java Code Quality Tools – Overview – brief descriptions of 44 quality-related tools including some interesting tools and Eclipse plugins I didn’t know or knew but forgot. F.ex. analysis of dependencies with JBoss Tattletale or JarAnalyzer, Clirr to check libraries for source and binary backwards compatibility, JDiff generates JavaDoc-based report of removed/added/changed in an API. Spoon – read and check or transform Java code. Java PathFinder (NASA) – special JVM capable of checking all execution path to discover concurrency defects etc.

Tools

  • DirB, Directory Bookmarks for Bash (home) – moving efficiently among favourite directories (s <name> to create a bookmark for pwd, g <bookmark | relative/abs dir path> to enter a dir (=> works both for bookmarks and as a replacement for cd); also support for relative path bookmarks & more; sl lists bookmakrs in the last used order) (You might also want to check out Autojump, described in Dec 11; bashmarks is another similar project. Another similar project is rupa’s z and j2 and the fish clone z-fish)

Clojure Corner

  •  Jon Pither: Clojure at a Bank – Moving from Java -  the justification (productivity, dynamism, FP a better match for the domain) and process behind moving from Java to Clojure with a monolithic 1M LOC Spring/Hibernate app. (Random quotes: “I had used some dynamical languages before and it was quite obvious that we were essentially forcing lots of schema and type definition on to a problem domain that just didn’t want or need it.” “[..] it [dependency injection] just looks redundant in retrospect now that I’m working 95% with FP code.”) There is also a EuroClojure talk about their experiences one year later (35 min).
  • Prismatic’s “Graph” at Strange Loop – an interesting desing problem, its solution, and a resulting OSS library. The problem: How to break a large function into independently usable small ones that might depend on each other without ever needing to recompute a value once the function producing is called. The solution: Graph – “Graph is a simpledeclarative abstraction to express compositional structure.” (Enabling explicit declaration of data dependencies and pluging in different implementations.)
  • The Oblong: Blog about 2/3 D game programming in Clojure, starting from scratch (w/o an engine); interesting experiences
  • Ironclad: Steam Legions – Clojure game development battle report (the game on Github)
  • Building the Wishlisted.org webapp in Clojure – experiences from learning Clojure for real by building a webapp in Noir
  • Clojure vs. Scala smackdown (“Just kidding with the title of this post :-) “) – a short post with interesting discussion. Dmitri Sotnikov’s opinion resonates with me: “I found that for me Clojure wins on simplicity and consistency. While it looks more alien initially, once you learn the basics, you just reuse the same patterns everywhere.” Some more comments: “One major concern was maintainability, since it’s fairly easy to write very dense code. This turned out to not be a problem in practice. Because Clojure code is written as a tree, refactoring it is very easy.” REPL seems to be a big win (applies to Scala too). Scala’s type system might get tedious and learning its quirks takes time but there is lot of potential and both have they strong sides.
  • Code Fatigue – discussion of the advantages of learning, using, and combining the (many) standard Clojure functions instead of a “basic solution” using recursion etc. The argument is in favor of higher-level code with less complexity in the form of branching, recursion, nested expressions etc. and thus less mental fatigue.

Favorite Quotes

A classic test only cares about the final state – not how that state was derived. Mockist tests are thus more coupled to the implementation [emphasis mine] of a method. Changing the nature of calls to collaborators usually cause a mockist test to break.

- Martin Fowler in his classical Mocks Aren’t Stubs

I’m afraid of code. When I see a big pile of code, I get scared ;-) . Some classes and method make me cry. I had troubles explaining why I prefer short pieces of code keeping the same level of abstraction, cohesive and loosely coupled. The following quote captures the essence – improved communication.

One way to improve communication is to reduce the need for it and the same can be said for code. [...] Since we tend to read code more than write it, anything we can do to reduce the need to read code is time well invested in the life of a project.

- Brett L. Schuchert in Modern Mocking Tools and Black Magic – An example of power corrupting justifying extraction of code into another class or method

Posted in General, Java, Testing, Tools, Top links of month | Tagged: , , , , , | Leave a Comment »

Help, My Code Isn’t Testable! Do I Need to Fix the Design?

Posted by Jakub Holý on September 9, 2012

Our code is often untestable because there is no easy way to “sense1” the results in a good way and because the code depends on external data/functionality without making it possible to replace or modify these during a test (it’s missing a seam2, i.e. a place where the behavior of the code can be changed without modifying the code itself). In such cases the best thing to do is to fix the design to make the code testable instead of trying to write a brittle and slow integration test. Let’s see an example of such code and how to fix it.

Read the rest of this entry »

Posted in Testing | Tagged: , , , | Leave a Comment »

Most interesting links of January ’12

Posted by Jakub Holý on January 31, 2012

Recommended Readings

  • Jeff Sutherland: Powerful Strategy for Defect Prevention: Improve the Quality of Your Product – “A classic paper from IBM shows how they systematically reduced defects by analyzing root cause. The cost of implementing this practice is less than the cost of fixing defects that you will have if you do not implement it so it should always be implemented.” – categorize defects by type, severity, component, when introduced; 80% of them will originate in 20% of the code; apply prioritized automated testing (solve always the largest problem first). “In three months, one of our venture companies cut a 4-6 week deployment cycle to 2 weeks with only 120 tests.”
  • Ebook draft: Beheading the Software Beast – Relentless restructurings with The Mikado Method (foreword by T. Poppendieck) – the book introduces the Mikado Method for organized, always-staying-green (large-scale) refactorings, especially useful for legacy systems, shows it on a real-world example (30 pages!), discusses various application restructuring techniques, provides practical guidelines for dealing with different sizes of refactorings and teams, discusses in depth technical debt and more. To sum it up in three words: Check it out!
  • Daily Routine of a 4 Hour Programmer (well, it’s actually about 4h of focused programming + some hours of the rest) – a very interesting reading with some inspiring ideas. We should all find some time to follow up the field, to reflect on our day and learn from it (kaizen)
  • The Agile Testing Quadrants – understanding the different types of tests, their purpose and relation by slicing them by the axis “business facing x technology facing” and the axis “supporting the team x critiquing the product” => unit tests x functional tests x exploratory testing x performance testing (and other). It helps to understand what should be automated, what needs to be manual and helps not to forget all the dimensions of testing.
  • Adam Bien: Can stateful Java EE apps scale? – What does “stateless” really mean? “Stateless only means, that the entire state is stored in the database and has to synchronized on every request.” “I start the development of non-trivial (>CRUD) applications with Gateway / PDOs [JH: stateful EJBs exposing JPA entities] and measure the performance and memory consumption continuously.” Some general tips: Don’t split your web server and servlet container, don’t use session replication.
  • Brian Tarbox: Just-In-Time Logging – How to remove 90% of worthless logs while still getting detailed logs for cases that matters – the solution is to (1) only add logs for a particular “transaction” with the system into a runtime structure and (2) flush it to the log only if the transaction fails or st. else significant happens with it. The blog also proposes a possible implementation in detail.
  • DZone’s Top 10 NoSQL Articles of 2011
  • DZone’s Top 5 DevOps Articles of 2011
  • Test Driven Infrastructure with Vagrant, Puppet and Guard – this is interesting for me for I’m using Vagrant and Puppet on my project to create and share development environments or their parts and applying test-first approach to it seems interesting as do also the tools, rspec-puppet, cucumber-puppet and Guard (events triggered by file changes) and referenced articels.
  • 5+1 Sonar Plugins you must not miss (2012 version) – Timeline Plugin (with Google Visualization Annotated TimeLine), Useless Code Plugin, SIG Maintainability Model Plugin (metrics Analysability, Changeability, Stability, Testability), Quality Index Plugin (1-number health indicator), Technical Debt Plugin

Links to Keep

Clojure Corner

  • ClojureScript One Guide – “ClojureScript One shows you how to use ClojureScript to build single-page, single-language applications in a productive, effective and fun way.”
  • Asynchronous workflows in Clojure - true asynchronous (non-blocking) network access in Clojure with Netty/the Lamina project.
  • Clojure 2011 Year in Review – a list with important events in the Clojure sphere with links to details – C. 1.3.0, ClojureScript, logic programming with core.logic, clojure-contrib restructuring, birth of 4Clojure and Avout.
  • Clojure Atlas – interesting project (alpha version) presenting Clojure documentation in the form of interactive graph of related concepts and functions; it’s far from perfection but I like the concept and consider paying those ~ $25 for the 1.3.0 version when its out (however, the demo is free and it might become open-sourced in 2012)

Posted in General, Java, Testing, Tools, Top links of month | Tagged: , , , , , , , , , | Leave a Comment »

Refactoring Spikes as a Learning Tool and How a Scheduled Git Reset Can Help

Posted by Jakub Holý on November 21, 2011

To learn how complex your code base really is and how much effort a particular refactoring might require compared to the initial expectations, follow these steps:

  1. Schedule git reset --hard; git clean -fd to run in 1 hour (e.g. via cron)
  2. Do the refactoring
  3. WT*?! All my changes disappeared?!” – this experience indicates the end of the refactoring :-)
  4. Go for a walk or something and think about what you have learned about the code, its complexity, the refactoring
  5. Repeat regularly, f. ex. once every week or two – thus you’ll improve your ability to direct the refactoring so that you learn as much as possible during the short time

Read the rest of this entry »

Posted in General | Tagged: , , , , | Leave a Comment »

Most interesting links of July

Posted by Jakub Holý on July 31, 2011

Recommanded Readings

  • Martin Fowler, M. Mason: Why not to use feature branches and prefer feature toggles instead, when branches can actually be used (video, 12min) – feature branches are pretty common yet they are a hindrance for a good and stable development pace due to “merging hells”. With trusted developers, feature toggles are a much better choice.
  • M. Fowler: The LMAX Architecture – Martin describes the innovative and paradigm shaking architecture of the high-performance, high-volume financial trading platform LMAX. The platform can handle 6 million orders per second – using only a single java thread and commodity hardware. I highly recommend the article for two reasons: First, it crashes the common view that to handle such volumes you need multithreading. Second, for the rigorous, scientific approach used to arrive to this architecture. The key enablers are: 1) The main processing component does no blocking operations (I/O), those are done outside (in other threads). 2) There is no database – the state of the processor can be recreated by replaying the (persistent) input events. 3) To get further from 10k to 100k TPS they “just” wrote good code - well-factored, small methods (=> Hotspot more efficient, CPU can cache better). 4) To gain another multitude they implemented more efficient, cache-friendlier collections. All that was done based on evidence, enabled by thorough performance testing. 5) The processor and input/output components communicate without locking, using a shared (cyclic) array, where each of them operates on sum range of indexes and no element can ever be written by more than one component. Their internal range indexes do ever only increase so it is safe to read them without synchronization (at worst you will get old, lower value). The developers also tried Agents but found them in conflict with modern CPUs for their require context switch leading to emptying of the fast CPU caches.
    Updated: Martin has published the post titled Memory Image which discusses the LMAX approach to persistence in a more general way.
  • S. Mancuso: Working with legacy code with the goal of continual quality improvement – this was quite interesting for me as our team is in the same situation and arrived to quite similar approach. According to the author, the basic rule is “always first write tests for the piece code to be changed,” even though it takes so much time – he justifies it saying “when people think we are spending too much time to write a feature because we were writing tests for the existing code first, they are rarely considering the time spend elsewhere .. more time is created [lost] when bugs are found and the QA phase needs to be extended”. But it is also important to remember when to stop with refactoring to get also to creating business value and the rule for that is that quality improvements are done only with focus on a particular task. I like one of the concluding sentences: “Constantly increasing the quality level in a legacy system can make a massive difference in the amount of time and money spend on that project.”
  • Uncle Bob: The Land that Scrum Forgot – Scrum makes it possible to be very productive at the beginning but to be able to keep the productivity and continue meeting the expectations that are thus created we need to concentrate on some essential technical practices and code quality. Because otherwise we create a mess of growing complexity – the ultimate killer of productivity. Uncle Bob advices us what practices and how to apply to attain both high, sustainable productivity and (as required for it) high code quality. It’s essential to realize that people do what they are incented to do and thus we must measure and reward both going fast and staying clean.
    How do we measure quality? There is no perfect measure but we can build on the available established metrics – coverage, # (new) tests, # defects, size of tests (~ size of production code, 5-20 lines per method), test speed, cyclomatic complexity, function (< 20) and class (< 500) sizes, Brathwaite Correlation (> 2), dependency metrics (no cycles, in the direction abstraction).
    The practices that enable us to stay clean include among others TDD, using tools like Chceckstyle, FindBugs to find problems and duplication, implementing Continuous Integration.
  • Getting Started: Testing Concurrent Java Code – very good and worthy overview of tools for checking and testing of concurrent code with links to valuable resources. The author mentions among others FindBugs, concurrent test coverage (critical sections examined by multiple threads) measurement with IBM’s ConTest, multithreaded testing with ConTest (randomly tries to create thread interleaving situations; trial version – contact the authors for the full one) and MultithreadedTC (which divides time into “ticks” and enables you to fine-configure the interactions)
  • The top 9+7 things every programmer or architect should know – quite good selection of nine, respectively 7 things from the famous (on-line available) books 97 Things every programmer/architect should know.

Some fun:

  • Beans and noses – J. Spool reveals the First Rule of Consultancy: “No matter how much you try, you can’t stop people from sticking beans up their nose.” In other words, sometimes your clients decide to do some very unwise thing and no amount of reasoning can discourage them from that (quite understandably, as already the way they got to this decision defies logic).  “The only thing I can do in a beans-and-noses situation is wait. Wait until the bean is in its final resting place.” Then you ask the person how it is working for him and “… if sticking a bean deep into their nostril doesn’t meet the very high expectations they’d had, I can now start talking alternative approaches to reaching those expectations.” Already before you can actually ask them about their expectations, in some cases (50:50) this discussion can lead them to realize they could achieve them with an alternative, less painful approach. Now, if you have read up to this point, you clearly have enough time, so go an read the article because it’s really orth it!

Free e-Books

Posted in Java, Testing, Tools, Top links of month | Tagged: , , , , , | Leave a Comment »

What I’ve Learned from (Nearly) Failing to Refactor Hudson

Posted by Jakub Holý on April 28, 2011

We’ve tried to refactor Hudson.java but without success; only later have I been able to refactor it successfully, thanks to the experience from the first attempt and more time. In any case it was a great learning opportunity.

Lessons Learned

The two most important things we’ve learned are:

  • Never underestimate legacy code. It’s for more complex and intertwined than you expect and it has more nasty surprises up in its sleeves than you can imagine.
  • Never underestimate legacy code.

And another important one: when you’re tired and depressed, have some fun reading the “best comments ever” at StackOverflow :-) . Seeing somebody else’ suffering makes one’s own seem to be smaller.

I’ve also started to think that the refactoring process must be more rigorous to protect you from wandering too far your original goal and from getting lost in the eternal cycle of fixing something <-> discovering new problems. People tend to do depth-first refactoring changes that can easily lead them astray, far from where they actually need to go; it is important to stop periodically and look at where we are, where we are trying to get and whether we aren’t getting lost and shouldn’t just prune the current “branch” of refactorings and return to some earlier point and try perhaps a completely different solution. I guess that one of the key benefits of the Mikado method is that it provides you with this global overview – which gets easily lost when it is only in your head – and with points to roll-back to.

Evils of Legacy Code

Use a dependency injection framework, for God’s sake! Singletons and their manual retrieval really complicate testing and affect the flexibility of the code.

Don’t use public fields. They make it really hard to replace a class with an interface.

Reflection and multithreading make it pretty difficult if not impossible to find out the dependencies of a particular piece of code and thus the impacts of its change. I’d hard time finding out all the places where Hudson.getInstance is invoked while its constructor is still running.

Our Way to Failure and Success

There is a lot of refactoring that could be done with Hudson.java, for it is a typical God Class which additionally spreads its tentacles through the whole code base via its evil singleton instance being used by just about anyone for many different purposes. Gojko describes some of the problems worth removing.

The Failure

We’ve tried to start small and “normalize” the singleton initialization, which isn’t done in a factory method, but in the constructor itself. I haven’t chosen the goal very well as it doesn’t bring much value. The idea was to make it possible to have potentially also other implementations of Hudson – e.g. a MockHudson – but with respect to the state of the code it wasn’t really feasible and even if it was, a simple Hudson.setInstance would perhaps suffice. Anyway we’ve tried to create a factory method and move the initialization of the singleton instance there but at the end we got lost in concurrency issues: there were either multiple instances of Hudson or the application deadlocked itself. We tried to move pieces of code around, but the dependencies wouldn’t have let us do that.

The Success

While reflecting on our failure I’ve come to the realization that the problem was that Hudson.getInstance() is called (many times) already during the execution of the Hudson’s constructor by the objects used there and threads started from there. It is of course a hideous practice to access a half-baked instance before it is fully initialized. The solution is then simple: to be able to initialize the singleton field outside of the constructor, we must remove all calls to getInstance from its context.

Mikado Graph: Hudson Refactoring (click for full size)

The steps can be seen very well from the corresponding GitHub commits. Summary:

  1. I used the “introduce factory” refactoring on the constructor
  2. I modified ProxyConfiguration not to use getInstance but to expect that the root directory will be set before its first use
  3. I moved the code that didn’t need to be run from the constructor out, to the new factory method – this resulted in some, hopefully insignificant, reordering of the code
  4. Finally, I also moved the instance initialization to the factory method

I can’t be 100% sure that the resulting code has the same semantic as far as it matters, for I had to do few changes outside of the safe automated refactorings and there are no useful tests except for trying to run the application (and, as is common with legacy applications, it wasn’t feasible to create them beforehand).

The refactored code doesn’t provide much added value yet but it is a good start for further refactorings (which I won’t have the time to try :-( ), it got rid of the offending use of an instance while it is being created and the constructor code is simpler and better. The exercise took me about four pomodoros, i.e. little less than two hours.

If I had the time, I’d continue with extracting an interface from Hudson, moving its unrelated responsibilities to classes of their own (perhaps keeping the methods in Hudson for backwards compatibility and delegating to those objects) and I might even  use some AOP magic to get a cleaner code while preserving binary compatibility  (as Hudson/Jenkins actually already does).

Try it for Yourself!

Setup

Get the code

Get the code as .zip or via git:

git@github.com:iterate/coding-dojo.git # 50MB => takes a while
cd coding-dojo
git checkout -b mybranch INITIAL

Compile the Code

as described in the dojo’s README.

Run Jenkins/Hudson

cd coding-dojo/2011-04-26-refactoring_hudson/
cd maven-plugin; mvn install; cd ..       # a necessary dependency
cd hudson/war; mvn hudson-dev:run

and browse to http://localhost:8080/ (Jetty should pick changes to class files automatically).

Further Refactorings

If you’re the adventurous type, you can try to improve the code more by splitting out the individual responsibilities of the god class. I’d proceed like this:

  1. Extract an interface from Hudson and use it wherever possible
  2. Move related methods and fields into (nested) classes of their own, the original Hudson’s methods just delegate to them (the move method refactoring should be useful); for example:
    • Management of extensions and descriptors
    • Authentication & authorization
    • Cluster management
    • Application-level functionality (control methods such as restart, updates of configurations, management of socket listeners)
    • UI controller (factoring this out would require re-configuration of Stapler)
  3. Convert the nested classes into top-level ones
  4. Provide a way to get instances of the classes without Hudson, e.g. as singletons
  5. Use the individual classes instead of Hudson wherever possible so that other classes depend only on the functionality they actually need instead of on the whole of Hudson

Learning about Jenkins/Hudson

If you want to understand mode about what Hudson does and how it works, you may check:

Sidenote: Hudson vs. Jenkins

Once upon time there was a continuous integration server called Hudson but after its patron Sun died, it ended up in the hands of a man called Oracle. He wasn’t very good at communication and nobody really knew what he is up to so when he started to behave little weird – or at least so the friends of Hudson perceived it – those worried about Hudson’s future (including most people originally working in the project) made its clone and named it Jenkins, which is another popular name for butlers. So now we have Hudson backed by Oracle and the maven guys from Sonatype and Jenkins, supported by a vivid community. This exercise is based on the source code of the Jenkins, but to keep the confusion level low I refer to it often as Hudson for that is how the package and main class are called.

Conclusion

Refactoring legacy code always turns out to be more complicated and time-consuming than you expect. It’s important to follow some method – e.g. the Mikado method – that helps you to keep a global overview of where you want to go and where you are and to regularly consider what and why you’re doing so that you don’t get lost in a series of fix a problem – new problems discovered steps. It’s important to realize when to give up and try a different approach. It’s also very hard or impossible to write tests for the changes so you must be very careful (using safe, automated refactorings as much as possible and proceeding in small steps) but fear shouldn’t stop you from trying to save the code from decay.

Posted in General, Java | Tagged: , | 2 Comments »

Refactoring the “Legacy” Hudson.java with the Mikado Method as a Coding Dojo

Posted by Jakub Holý on April 16, 2011

I’m preparing a coding dojo for my colleges at Iterate where we will try to collectively refactor the “legacy” Hudson/Jenkins, especially Hudson.java, to something more testable, using the Mikado Method. I’ve got the idea after reading Gojko Adzic’s blog on how terrible the code is and after discovering the Mikado Method by a chance. Since a long time I’m interested in code quality and since recently especially in improving the quality of legacy applications, where “legacy” means a terrible code base and likely insufficient tests. As consultants we often have to deal with such application and with improving their state into something easier and cheaper to maintain and evolve. Therefore such a collective practice is a good thing.

The Mikado Method

The Mikado Method, which the authors describe as “a tool for large-scale refactorings”, serves two purposes: Read the rest of this entry »

Posted in General, Java | Tagged: , , , | 1 Comment »