The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘performance’

Profiling Tomcat Webapp with VisualVM and NetBeans – Pitfalls

Posted by Jakub Holý on February 25, 2012

Profiling a webapp running on Tomcat with VisualVM or NetBeans wasn’t as easy as expected, so this is a brief record of what to avoid to succeed.

Environment: Mac OS X, Java JDK 1.6.0_29, Netbeans 7.1, VisualVM 1.3.3 (installed separately), Tomcat 6.

The Pitfalls:

VisualVM

  • VisualVM Sampler and Profiler: To be able to drill down to the slow methods you need to take a snapshot before you stop the sampling/profiling (there is a [Snapshot] button above the Hot Spots list). This is not very intuitive and the interface doesn’t communicate it.
  • VisualVM Profiler: Excludes Thread.sleep() & Object.wait() time (as opposed to the NetBeans profiler where you can choose to include/exclude them) => if you method is spending lot of time waiting for a lock, you won’t discover it
  • You might need to allow unsafe, passwordless JMX connections in your Tomcat config, see Resources

NetBeans Profiler

  • While VisualVM was able to dynamically connect to my Tomcat, NetBeans wasn’t able to do it and hasn’t provided any notification about the failure. The only visible manifestation was that it wasn’t showing and collecting any data. Solution: Use the “Direct” attach invocation, i.e. starting the target java application with the NetBeans profiling agentlib.

Tools Overview

VisualVM

Extremely useful tool, included in JDK since 6.0 (command line: jvisualvm) or on the VisualVM page.

  • Automatically discovers local Java processes and can connect to them (if they run Java 6+)
  • Monitoring – threads, heap, permgen, CPU, classes
  • Sampler – low-overhead profiling tool (takes thread snapshot at regular intervals and compares their stack traces to find out where most time is spent)
  • Plugins, such as MBeans, Tracers
  • Profiler – a simpler version of NetBeans profiler (comparison here); it can dynamically attach to running (Java 6+) processes and instrument classes on-the-fly. The not very visible checkbox Settings in the right-top corner can be used to set the classes to start profiling from (syntax: my.package.** to include subpackages or my.package.* or my.pkg.MyClass) and the packages not to / only to profile (syntax differs here: my.package.* to include subpackages or my.package. or my.pkg.).

Resources

Posted in Languages, Tools | Tagged: , , | Comments Off

Most interesting links of January ’12

Posted by Jakub Holý on January 31, 2012

Recommended Readings

  • Jeff Sutherland: Powerful Strategy for Defect Prevention: Improve the Quality of Your Product – “A classic paper from IBM shows how they systematically reduced defects by analyzing root cause. The cost of implementing this practice is less than the cost of fixing defects that you will have if you do not implement it so it should always be implemented.” – categorize defects by type, severity, component, when introduced; 80% of them will originate in 20% of the code; apply prioritized automated testing (solve always the largest problem first). “In three months, one of our venture companies cut a 4-6 week deployment cycle to 2 weeks with only 120 tests.”
  • Ebook draft: Beheading the Software Beast – Relentless restructurings with The Mikado Method (foreword by T. Poppendieck) – the book introduces the Mikado Method for organized, always-staying-green (large-scale) refactorings, especially useful for legacy systems, shows it on a real-world example (30 pages!), discusses various application restructuring techniques, provides practical guidelines for dealing with different sizes of refactorings and teams, discusses in depth technical debt and more. To sum it up in three words: Check it out!
  • Daily Routine of a 4 Hour Programmer (well, it’s actually about 4h of focused programming + some hours of the rest) – a very interesting reading with some inspiring ideas. We should all find some time to follow up the field, to reflect on our day and learn from it (kaizen)
  • The Agile Testing Quadrants – understanding the different types of tests, their purpose and relation by slicing them by the axis “business facing x technology facing” and the axis “supporting the team x critiquing the product” => unit tests x functional tests x exploratory testing x performance testing (and other). It helps to understand what should be automated, what needs to be manual and helps not to forget all the dimensions of testing.
  • Adam Bien: Can stateful Java EE apps scale? – What does “stateless” really mean? “Stateless only means, that the entire state is stored in the database and has to synchronized on every request.” “I start the development of non-trivial (>CRUD) applications with Gateway / PDOs [JH: stateful EJBs exposing JPA entities] and measure the performance and memory consumption continuously.” Some general tips: Don’t split your web server and servlet container, don’t use session replication.
  • Brian Tarbox: Just-In-Time Logging – How to remove 90% of worthless logs while still getting detailed logs for cases that matters – the solution is to (1) only add logs for a particular “transaction” with the system into a runtime structure and (2) flush it to the log only if the transaction fails or st. else significant happens with it. The blog also proposes a possible implementation in detail.
  • DZone’s Top 10 NoSQL Articles of 2011
  • DZone’s Top 5 DevOps Articles of 2011
  • Test Driven Infrastructure with Vagrant, Puppet and Guard – this is interesting for me for I’m using Vagrant and Puppet on my project to create and share development environments or their parts and applying test-first approach to it seems interesting as do also the tools, rspec-puppet, cucumber-puppet and Guard (events triggered by file changes) and referenced articels.
  • 5+1 Sonar Plugins you must not miss (2012 version) – Timeline Plugin (with Google Visualization Annotated TimeLine), Useless Code Plugin, SIG Maintainability Model Plugin (metrics Analysability, Changeability, Stability, Testability), Quality Index Plugin (1-number health indicator), Technical Debt Plugin

Links to Keep

Clojure Corner

  • ClojureScript One Guide – “ClojureScript One shows you how to use ClojureScript to build single-page, single-language applications in a productive, effective and fun way.”
  • Asynchronous workflows in Clojure - true asynchronous (non-blocking) network access in Clojure with Netty/the Lamina project.
  • Clojure 2011 Year in Review – a list with important events in the Clojure sphere with links to details – C. 1.3.0, ClojureScript, logic programming with core.logic, clojure-contrib restructuring, birth of 4Clojure and Avout.
  • Clojure Atlas – interesting project (alpha version) presenting Clojure documentation in the form of interactive graph of related concepts and functions; it’s far from perfection but I like the concept and consider paying those ~ $25 for the 1.3.0 version when its out (however, the demo is free and it might become open-sourced in 2012)

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , , , | Comments Off

Most interesting links of December

Posted by Jakub Holý on December 31, 2011

Recommended Readings

  • The Netflix Chaos Monkey – how to test your preparedness for dealing with a system failure so that you won’t experience nasty wakeup when something really fails in Sunday 3 am? Release a wild, armed monkey into your datacenter. Watch carefuly what happens as it randoly kills your instances. This is exactly what Netflix does with their with their cloud infrastructure – also a great inspiration for my recent project. Do you need to be always available? Than consider employing the chaos monkey – or a whole army of monkeys!
    (PS: There is also a post with a picture of the scary monky.)

Links to Keep

  • BDD: Write specifications, not scripts (from the Concordion site) – relatively brief yet very enriching practical description of how to do behavior-driven development a.k.a. Specification by Example right, the key point here being “Write specifications, not scripts.” It says why not (for scripts overspecify -> are brittle, specs should be stable) and how to do it (decouple the stable spec and the volatile system via fixture code, expose minimal stuff to the spec, perhaps evolve a DSL between the fixture and the system). It also lists common “smells” of BDD done wrong. If it still isn’t clear to you, read the Script to Specification Makeover example (or perhaps read it anyway). BTW, Concordion is a new tool for doing BDD based on JUnit and HTML, which was created as a response to the weaknesses of Fit[Nesse], i.e. exactly the tendency to do scripting instead of specifications. It looks very promissing to me!

SW Utilities

  • (Linux/Mac) Autojump – superfast navigation between favorite directories in the command line (via Jake McCrary) – it keeps track of how much time you spend in each directory and when you execute j <substring of directory path/name>, it jumps into the most frequently used one matching the substring. It awesome! (You can also run jumpstat to see the statistics.)
  • (Linux) Tcpkill – service/network outage testing (via Jake McCrary) – kill connections to or from a particular host, network, port, or combination of all – useful e.g. when you want to test that your software is resilient to the outage of a particular service or server – less brutal than actually killing the database etc. instances. We need to test that our application recoveres properly when one of our MongoDB nodes dies so this may be quite useful.
  • Manik Hot Deploy Plugin for Maven Projects (v1.0.2 in 5/2011; older version in the Marketplace) – plugin that can do hot and incremental deployment to any app server (simply by copying to its hotdeploy directory or the directory of an installed webapp) whenever you run mvn install or automatically whenever sources change, multi-module support

Clojure Corner

  • Jake McCrary: Continuous Testing With Clojure and Expectations – continuous test runner lein-autoexpect for Clojure tests written using the library expectations by Jay Field.
  • Jake McCrary: Quickly Starting a Powerful Clojure REPL – Clojure REPL only two steps away: 1) Run Emacs, 2) Execute M-x clojure-swank (no more need to open an existing Leinigen project) – the trick is to install the Leinigen plugin swank-clojure and use Jake’s elisp function clojure-swank that automatically starts the swank-clojure server. (I had to hack the function for the clojure-swank output contained “null” instead of “localhost”, likely due to incorrect DNS setup.)

Posted in eclipse, General, Testing, Top links of month | Tagged: , , , , , , , | Comments Off

Most interesting links of November

Posted by Jakub Holý on November 30, 2011

Recommended Readings

  • Recommended Reading by Poppendiecks – an excellent selection, starting with Lean from Trenches, Management 3.0, Specification by Example, The Lean Startup etc.
  • Eric Allman says that Programming Isn’t Fun Any More  because problem solving has been replaced with learning, configuring, and integrating tons of libraries, frameworks, and tools and many people agree with that (as discussion on reddit proves). In other words we tend to go for any benefit we can have without considering the costs and for “easy” solutions without considering the true enemy: complexity. Perhaps we should always listen to the Rich Hickey’s Simple Made Easy talk before we add a lib/tool/framework?
    • Dean Wampler claims that functional programming can bring the joy back – “[..] a functional language, Scala, Clojure, Haskell, etc. will greatly reduce the amount of code you create. That won’t solve the problem of trying to integrate with too many libraries, but you’ll be less tempted. I also believe those libraries will be less bulky, etc.
    • Few quotes from a related article by M. Taylor: To put it another way, libraries make excellent servants, but terrible masters. | [..] frameworks [..] do keep their promise of making things very quick and easy … so long as you do things in exactly the way the framework author intended | On libraries: [..] we all assume (I know I do) that “plug in solutions X1 and X3″ is going to be trivial. But it never is — it’s a tedious exercise in impedance-matching, requiring lots of time spent grubbing around in poorly-written manuals [..] | On the effect of language choice: [..] different languages, with their different expressive power and especially their different culture, yield very different experiences.
    • To sum it up: Choose your tools and libraries wisely and always mind the global complexity. More usually means worse.
  • Java Magazine – Adam Bien: Stress Testing Java EE 6 Applications (page 41+) – do developer stress testing! Using: JMeter, VisualVM to find out resource consumption and behavior in the application, VisualVM’s Sampler profiling tool [cca 20% overhead], a webapp to extract metrics from GF (STM)
  • Java Magazine – Polyglot Programming on the JVM (page 50; excerpt from The Well-Grounded Java Developer) – why you should consider polyglot programming and how to decide whether to use it and what languages to pick, f.ex.: “These [Java's] qualities make the language a great choice for implementing functionality in the stable layer [of the polyglot programming pyramid]. However, these same attributes become a burden in the middle and upper [lower, DSL, on the linked image] tiers of the pyramid; for example: Recompilation is laborious; Static typing can be inflexible and lead to long refactoring times; Deployment is a heavyweight process; Java’s syntax is not a natural fit for producing DSLs.” “There is a wide range of natural use cases for *alternative languages*. [after identifying such a UC] You *now need to evaluate* whether using an alternative language is appropriate.”
  • Intrusion Detection for Web Apps – Detection Points – If security is a concern of your web application then you should build intrusion detection into the application f.ex. leveraging the  OWASP AppSensor project. The key is to detect malicious/unexpected behavior and proactively do something such as locking the user out or alerting the admins. The page linked above lists some common suspicious behaviors such as the use of multiple usernames, unexpected HTTP command/method, additional/duplicated data in request. Worth checking out!
  • Yammer Moving From Scala to Java- Scala is a cool language but sometimes its cost is higher than the benefits. Snippets from the post: “…the friction and complexity that
    comes with using Scala instead of Java isn’t offset by enough productivity benefit or reduction of maintenance burden …”. “Scala, as a language, has some profoundly interesting ideas in it. [...] But it’s also a very complex language. The number of concepts I had to explain to new members of our team for even the simplest usage of a collection was surprising: implicit parameters, builder typeclasses, ‘operator overloading’, return type inference, etc. etc.” (It’s claimed that only library authors need to know some of that but if it’s a part of library APIs, the users need to understand it too.) Notice that the author isn’t saying “Scala is bad” but only that Scala isn’t the best balance of their needs at this time, as Alex Miller put it*.
    Important note
    : The text wasn’t intended for publication and it is a private opinion of a Yammer developer, not the company itself. You should read the official Yammer’s position where Coda puts it into the right context.

Refactoring

  • Opportunistic Refactoring by Martin Fowler – refactor on the go – how & why
  • Michael Feathers: Getting Empirical about Refactoring – gather information that helps us understand the impact of our refactoring decisions using data from a SCM, namely File Churn (frequency of changes, i.e. commits) vs. Complexity – files with both high really need refactoring. Summary: “If we refactor as we make changes to our code, we end up working in progressively better code. Sometimes, however, it’s nice to take a high-level view of a code base so that we can discover where the dragons are. I’ve been finding that this churn-vs.-complexity view helps me find good refactoring candidates and also gives me a good snapshot view of the design, commit, and refactoring styles of a team.

UIs and Web Frameworks

  • Devoxx 2011 - WWW: World Wide Wait? A Performance Comparison of Java Web Frameworks (slides) – the authors did extensive performance testing of some of the most popular web frameworks. Of course it’s always hard to guess how general their results are, if/how they apply to one’s particular situation, and if they aren’t distorted in some way but it’s worth for their approach alone (AWS with its CloudWatch monitoring, WebDriver, additional measurement of page load with HAR and a browser plugin). In their particular tests GWT scored best, followed by Spring MVC, with JSF and Wicket lagging far behind (especially the MyFaces implementation). Conclusion: A web framework may have strong impact on performance and scalability, if they are important for you then do test the performance early with as realistic code and load as possible.
  • JSF2 – Benchmark datatable by N. Labrot, 2/2011 – performance comparison of PrimeFaces 2.2.1, IceFaces 2.0, Richfaces 4.0.0M4 on a simple page with Ajax. I do not trust any benchmark that I don’t fake myself :-) (for there are always too many factors that influence the conclusions to be drawn) but it’s interesting anyway – and perhaps a good thing to do before you decide for a JSF component library.
  • Alex MacCaw: Asynchronous UIs – the future of web user interfaces and the Spine framework – users in 2011 shouldn’t anymore wait for pages to load and operations to complete, we should build asynchronous UIs where changes to the UI are performed immediately while a request to the server is sent in the background, similarly to sending e-mail in GMail, which returns at once displaying a non-intrusive “Sending…” notification. As a user I very much agree with Alex.
  • Matt Raible’s 20 criteria for evaluating web frameworks, 2010 (detailed description, here’s a brief list) – Matt’s results are disputable and as he himself says you should always do your own evaluation and spikes but the criteria are pretty useful: Developer Productivity, Developer Perception, Learning Curve, Project Health, Developer Availability, Job Trends, Templating, Components, Ajax, Plugins or Add-Ons, Scalability, Testing, i18n and l10n, Validation, Multi-language Support (Groovy / Scala), Quality of Documentation/Tutorials, Books Published, REST Support (client and server), Mobile / iPhone Support, Degree of Risk.

NoSql

  • Don’t use MongoDB via @nicolaiarocci – a (fake?!) bad experience with MongoDB – the text is not credible (the author is anonymous, s/he doesn’t explicitely state which version of MongoDB they used, the 10gen CTO can’t find a matching client and any evidence for some of the issues mentioned) but it  gives context for the read-worthy response from the 10gen CTO, and a post that nicely explains how to correctly design for MongoDB. A comment about MongoDB experience at Forsquare: “Currently we have dozens of MongoDB instances across several different data clusters storing over a TB of data and handling 10s of thousands of requests per second (mostly reads but the write load is reasonably high as well).Have we run into problems with MongoDB along the way? Yes, of course we have. It is a new technology and problems happen.Have they been problematic enough to seriously threaten our data? No they have not.
  • Martin Fowler on Polyglot Persistence – the are when will be choosing persistence solution with respect to our needs instead of mindlessly picking RDBMS is coming. Applications will combine multiple, specific solutions, f.ex. we could pick Redis (key-value) for caching, MongoDB (document DB) for product catalog, Neo4J (graph DB) for recommendations, RDBMS for financial data and reporting… (of course not all in one project!). Polyglot persistence will come at a cost (complexity, learning) – but it will come because the benefits are worth it – performance, data storage model and behavior more aligned with the business logic (NoSql databases ofer various models and tradeoffs and thus we can find a much better fit than with general-purpose RDBMs).

Talks & Video

  • Adam Bien’s JavaOne talk Java EE 6: The Cool Parts (1h) – absolutely worth the time – a very practical fly through the cool features of Java EE (eventing, ..), most of the time is spent actually coding. Don’t forget to check also the interesting discussion below the video (JEE and other frameworks, Java FX and JSF 2, …).
  • Jurgen Appelo’s keynote How to Change the World at Smidig 2011 is well done and highly useful. We all strive to change the world around us – as consultants we want to make our clients more agile, as team members we want to make our Scrum teams more self-organizing, as employees we want to help building knowledge-sharing and open culture, … . However it isn’t easy to influence or change people and culture and if we aren’t aware of all the dimensions of a change (system, individuals, interactions, environment) and how to work along each of them, we are much less likely to succeed. The knowledge and experience that Jurgen shares with us can help us a lot in having an impact. You can also download the slides and change management questions.
  • Project X: What is being a programmer like? (5min) If ever again a non-geek asks you what you as a developer are doing, just show him this short and extremely funny video (created by my ex-employer – perhaps they estimated how much time and energy developers loose trying to explain it to normal people and decided to prevent this great waste :-))
  • RSA Animate – Drive: The surprising truth about what motivates us (10 min) – entertaining and enlightening; once we’ve enough money to cover our needs, it’s autonomy (self-direction), mastery, and purpose what motivates us (money actually decrease our performance). Now this is a great evidence for lean/agile – for they’re based on making people self-directing and encourage mastery (as in continous integration and top quality to enable steady pace). Autonomy enables engagement as does a higher purpose (“make the world a better place”) – Steve Jobs with his visions was able to provide such a purpose. Atlassian’s FedEx Days are a good example of what engagement and benefits autonomy brings.
  • Simon Sinek: How great leaders inspire action (18 min, subtitles in 37 languages) – do you want to succeed, to change the world around you for the better, to start a new company? Then you must start by communicating “why” you do what you do, not “what” – like M. L. King, bro Wrights, and Apple. Very inspiring! (More in his Why book.)

Links to Keep

Favorite Quotes

Refactoring is like advertising: it doesn’t cost, it pays.
- Mary & Tom Poppendiecks, Implementing Lean Software Development, p.166

Clojure Corner

Posted in Databases, General, j2ee, Top links of month | Tagged: , , , , , , , , , , , | Comments Off

Most interesting links of February

Posted by Jakub Holý on February 28, 2011

Articles, links etc.

Git

  • The Git Parable (thx to Alexander) – a good and easy to understand explanation of the story behind Git and thus also its main goals and concepts. It really helps in understanding what makes Git different from Subversion and thus empowers you to use it to its full capabilities and in accordance with its philosophy and not (painfully) against it. Though I’d appreciate little more coverage of merging and cooperation in the Git world.

Performance

  • Some thoughts on stress testing web applications with JMeter (part 2) – what to measure, when to stop, what listeners to use, some practical tips e.g. for using remote slave JMeter instances, how to interpret the results (explanation of mean, std. deviation, confidence interval). By Nicolas Vahlas, 3/2010.
  • Scalability Factors of JMeter in Performance Testing projects (conference paper, 2008) – what factors determine what number of virtual users (VU) a load test tool can efficiently simulate and experiments to determine the impact of those factors on JMeter. Each VU has an associated cost in terms of CPU and memory (send, process, store request/response etc.) => number of VUs depends on the HW and the complexity of the messages and protocol and on the load test script complexity.
    • For example one commercial tool claims to support up to nearly 2k VUs on 1GHz PIII w/ 1GB but w/o stating the other factors.
    • JMeter scalability experiments results (Def.: Scalability limit reached when additional VUs aren’t increasing the server’s load (i.e. the test tool has become the bottleneck)):
      • Response time: “The optimal number of virtual users increases with increase in response time. It increases from around 180 virtual users to around 700 optimal virtual users when the response time changes from 500 ms to 1.5 seconds.”
      • Application response size: “Application response size has a massive effect on the load generation capabilities of JMeter. The optimal # of virtual users drops from around 500 virtual users to a measly 115 virtual users when the application response size increases form 20 kb to 100kb.”
      • Communication protocol (HTTP x HTTPS): The load generation capability decreases by 50% or more when the protocol is HTTPS
    • HW used for the test machine: P4 2.4GHz, 2GB RAM.
  • A. Bien: Can Stateful Java EE 6 Apps Scale? – Another Question Of The Week – the answer: Yes. A.B. always starts with Gateway+PDO and measures the performance/overhead with VisualVM, JMeter. Don’t forget that stateless applications usually just move the state and therefore scalability problem to the DB.

Other

  • Switching to Plan J – isn’t Scala good enough (yet)? – by Jonathan Edwards. Really an interesting read including the comments (well, the first half – esp. Martin, Stuart R., Vincent etc.) – though very impressive, is Scala too complex and do we need syntax subsets for normal people? The state of IDE support is bad but should be getting better.
  • Opinion: The unspoken truth about managing geeks – a great article on corporate culture and psychology, which turn people into stereotypical geeks – invaluable advices for managing IT professionals. And it’s fun to read! (Well, at least if you’re a geek observeing the very things he speaks about around you.)
  • ThoughtWorks Technology Radar, Januar 2011 – Scala up to Trial, DevOps, …. . Unchanged: ESB, GWT & RIA on hold, Apache Camel (integration library) on Trial. Other interesting: CSS supersets supporting variables etc. like LESS.

Quotes of the month

I’ve decided to add this occassional section to capture interesting or inspirational comments of famous as well as ordinary people.

  • if i learn, i end up going fast. if i just try to go fast, i don’t learn & only go fast briefly.” (Kent Beck’s Twitter, 2/11). Reflects nicely a discussion I recently had with a colleague :-) I also like the follow-ups:
    • Mike Hogan: any links/books you can suggest to help get more grounded in this mindset of valuing learning over trying to go fast?
    • Kent Beck: “zen in the art of archery” is one classic that i’ve found helpful. that and endlessly trying to do it wrong…
  • The computer industry is the only industry that is more fashion-driven than women’s fashion.” – by Richard Stallman in Cloud computing is a trap, 2008-09-29. I might not agree with everything he says but there certainly is a seed of truth in that!
  • Shipping is a feature. A really important feature. Your product must have it.by Joel Spolsky in The Duct Tape Programmer, 2009-09-23. This is something we should really keep in mind and explain to the product owners too :-)

Posted in General, Testing, Tools, Top links of month | Tagged: , , , , , , , | Comments Off

Joshua Bloch: Performance Anxiety – on Performance Unpredictability, Its Measurement and Benchmarking

Posted by Jakub Holý on December 10, 2010

Joshua Bloch had a great talk called Performance Anxiety (30min, via Parleys; slides also available ) at Devoxx 2010, the main message as I read it was

  1. Nowadays, performance is completely non-predictable. You have to measure it and employ proper statistics to get some meaningful results.
  2. Microbenchmarking is very, very hard to do correctly. No, you misunderstand me, I mean even harder than that! :-)
  3. From the resources: Profiles and result evaluation methods may be very misleading unless used correctly.

Read the rest of this entry »

Posted in Languages | Tagged: , , , | 6 Comments »

Most interesting links of November

Posted by Jakub Holý on November 30, 2010

This month has been quite interesting, among others I ‘ve picked up several blogs by Adam Bien. I really like his brief, practical and insightful posts :-)

Java, Jave EE, architecture etc.

  • EJB 3.1 + Hessian = (Almost) Perfect Binary Remoting – “With hessian it is very easy to expose existing Java-interfaces with almost no overhead. Hessian is also extremely (better than IIOP and far better than SOAP) fast and scalable.” See also the discussion regarding a “DynamicHessianServlet” (which would be more comfortable then creating a new servlet for each service).
  • Binary XML – Fast Infoset performance results – FI is the name of a standard for binary XML encoding and also of its OSS implementation; according to these measurements (with default settings, compared to Xerces 2.7.1), on average: The FI SAX parser  is about 5 times faster than the Xerces SAX parser, the FI DOM serializer (using default settings) is 25% to 30% faster than the Xerces DOM XMLSerializer, and FI documents are 40% to 60% smaller than XML documents (i.e. it can provide modest to good compression). And: “The use of external vocabularies can be a very effective way to increase the efficiency of parsing, serializing and size at the expense of the fast infoset documents no longer being self-describing … .”
    For the interested ones: FI is built on ASN.1, a standard for describing data structures in a way that is independent of machine architecture and implementation language, it also includes a selection on different binary encoding rules, e.g. DER (triplets tag id – length – value).
  • EJB 3.1 And REST – The Lightweight Hybrid – Why to use @Stateless – there have been recently discussion about Spring vs. EJB and whether @Stateless is of any use. According to this article, the added value of EJB (though some are provided also by Spring) include injection capabilities, transactions, single threading model (for why that is good see #2 on Why I like EJB 3.0/3.1), visibility in JMX, concurrency restriction via thread/bean pools.
  • Why Service Isn’t A ServiceFacade, But ServiceFacade Is Sometimes A Service…
  • Experiences from migrating Hyperic 4.5 from EJB/JBoss to Spring/Tomcat, what good has it brought – simplified unit and integration tests, simplified code thanks to Jdbc/JmsTemplate, … (My comment: EJB 3.1 would certainly bring at least some of the advantages too.)

Performance – large JVM heaps are very much feasible

  • Tuning the IBM JVM for large heaps – tuning 64b IBM JVM 6 with 100GB heap to have acceptable GC times (~ 1/2s as opposed to 25s with the default settings)
  • BigMemory: Heap Envy – there have been recently a lot of fuss about Terracotta’s new BigMemory, a  GC-resistent many GB cache space (using NIO byte buffer). This post discusses it’s disadvantages and compares it with a solution based on ConcurrentHashMap, comming to the conclusion that the old good ConcurrentHashMap outperforms BigMemory considerably.

Other

Posted in j2ee, Languages, Top links of month | Tagged: , , , , , , , | Comments Off

Most interesting links of September

Posted by Jakub Holý on September 30, 2010

The most interesting articles and other IT resources I’ve encountered the previous month, delayed a bit due to my holiday in Andalusia. In no particular order. Included some performance stuff, few tools and few general SW engineering things.

  • The Open-Closed Principle , i.e. open to extension, closed to modification – one of the basic principles in OOP underlying many best practices such as “make all member variables private” nicely explained, with code samples with and without O/CP applied.  Noteworthy: The principle is implemented with abstractions (abstract classes defining the constant part with subclasses being the extensions). You can’t “close” your design against all changes and must thus choose the ones that are more likely, i.e. create a “strategic closure”.
  • Java EE 6 xor Spring by A. Bien – the main difference is in the philosophy behind – Java EE is based on Convention over Configuration. The decision factor is usually the support policy, though. Also the tc Server is certainly better than a “custom” Tomcat with Spring.
  • ThoughtWorks Technology Radar – to help decision-makers from CIOs to developers understand emerging technologies and trends that affect the market today; though their assignment of GWT under hold is disputable (http://www.dzone.com/links/r/thoughtworks_radar_demystified_gwt.html); TW answers: “As it turned out the conciseness of the text didn’t allow us to adequately make our points so that they were not misunderstood. We are interested in a discussion but our opinion about the suitability and usability of GWT has still not changed.”
    • JavaScript as a 1st-class language w/ the same best practices (unit t., refact.,…); functional languages (Clojure > Scala)
    • WS-* beyond the basic profile, GWT, RIA on hold
  • “Agile Business” for a startup with “Virtual Assistants”: outsource (to an “Virtual Assistant”) what you can, only do what’s necessary at the time even if that means doing manually (outsourced) st. that could be automated; this helped to decrease time from 160 MH to 10 MH.
    “The lesson is that before you launch your product, think about the processes you can avoid automating. How about reminder emails? How about monthly billing? Could a human being run a report once a month and send emails or charge credit cards?” , “Every hour spent writing code is wasted time if that code could be replaced by a human being doing the same task until your product proves itself.”
  • InMemProfiler: Identifying Memory Allocators – tool to track memory allocation capable of attributing it to the classes (i.e. packages) of interest without blurring the results with char[] and all the java.lang.* classes
  • Easy Performance Analysis with AppDynamics Lite – I’m fond of performance troubleshooting tools and AppDynamics Lite looks really cool. The post includes two very short yet very informative and nice screencast showing its installation and usage. App. D. automatically discovers Struts actions, JDBC calls, webservices etc. and captures slow operations with the necessary details; this is quite similar to what the open-source Glassbox.com does. The monitoring web UI is very nice and user friendly. See the white box on the right side at http://www.appdynamics.com/lite.php to see what JVMs, ASs and frameworks it supports. Check also the comparison of the lite and standard versions (max 30 transactions, max 2 hours if diagnostics data, …).
  • Google Relaunches Instantiations Developer Tools – Now Available for Free – incl. the static code analysis tool (Eclipse plugin) CodePro AnalytiX
  • A blog about Terracota’s new BigMemory (commercial) claims that 64b JVM with heap over few GB may be a nightmare – “… not all people know about 64-bit JVMs and the nightmare these things can cause. Contrary to what other vendors are claiming, most shops such as Unibet, PartyPoker, Expedia, Sabre Holdings, Intercontinental Hotels Group, JP Morgan, Goldman, and more will tell you a 64-bit JVM pauses unpredictably and for minutes at a time. even when a 64 bit JVM is small (<2GB) it takes 30+% more RAM than a 32-bit equivalent JVM running under the same app.
  • Presentation Real Software Engineering by Glenn Vanderburg  – the talk is pretty interesting and I recommend it. Few, subjectively selected and interpreted points:
    The SW engineering as taught in universities doesn’t work, it’s actually a caricature of “engineering” (that is, established practices that work). The reason is that SwE is unreasonably fascinated by (ideally mathematical) modeling and precise, repeatable processes. This is not how real SW development can or does work. Given the complexity and uncertainty, an empirical process, based on frequent feedback and continual adjustment, is much more suitable. Also we don’t need complex models because prototyping and testing is nearly “free”, compared e.g. to spacecraft engineering. And with BDD and tools like RSpec and FitNesse we may have both readable and executable specifications – the code becomes the model.
  • Monte Carlo Analysis of the Zero Defect Mentality of TDD – conclusion: TDD pays off in the long run even though being slower [learning curve; 0-value bringing defect fixing] Fow short life time, TDD may be not worth it. Don’t argue, simulate :-)

Posted in General, Languages, Testing, Tools, Top links of month | Tagged: , , , , , , , , | Comments Off

The power of batching or speeding JDBC by 100

Posted by Jakub Holý on September 20, 2010

We all know that one coarse-grained operation is more efficient than a number of fine-grained ones when communicating over the network boundary but until recently I haven’t realized how big that difference may be. While performing a simple query individually for each input record proceeded with the speed of 11k records per hour, when I grouped each 100 queries together (with “… WHERE id IN (value1, .., value100)), all 200k records were processed in 13 minutes. In other words, using a batch of the size 100 led to the speed-up of nearly two orders of magnitude!

The moral: It really pays of to spend a little more time on writing the more complex batch-enabled JDBC code whenever dealing with larger amounts of data. (And it wasn’t that much more effort thanks to Groovy SQL.)

Posted in Databases, Languages | Tagged: , , , | Comments Off

Most interesting links of July

Posted by Jakub Holý on August 2, 2010

This month about performance, the Java language and patterns, the development process, and a few interesting news.

Read the rest of this entry »

Posted in Languages, Top links of month | Tagged: , , , , | Comments Off