The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘BI’

Review: Clojure for Machine Learning (Ch 1-3)

Posted by Jakub Holý on June 26, 2014

Packt Publishing has asked me to review their new book, Clojure for Machine Learning (4/2014) by Akhil Wali. Interested in both Clojure and M.L., I have taken up the challenge and want to share my impressions from the first chapters. Regarding my qualifications, I am a medium-experienced Clojure developer and briefly encountered some M.L. (regression etc. for quantitative sociological research, and neural networks) at university a decade ago, together with the related, now mostly forgotten, math such as matrices and derivatives.

In short, the book provides a good bird’s-eye view of the intersection of Clojure and Machine Learning, useful for people coming from either side. It introduces a number of important methods and shows how to implement/use them in Clojure but does not – and cannot – provide deep understanding. If you are new to M.L. and, like me, really want to understand things, you will want to get a proper textbook (or several) on the methods and the math behind them and read it in parallel. If you know M.L. but are relatively new to Clojure, you will want to skip the M.L. parts you already know and study the code examples and the tools used in them. To read it, you need only elementary knowledge of Clojure but need to be comfortable with math (if you haven’t worked with matrices, statistics, or derivatives, and equations scare you, you will have a hard time with some of the methods). You will learn how to implement some M.L. methods using Clojure – but without deep understanding, without knowledge of their limitations and issues, and without a good overview of the alternatives or the ability to pick the best one for a particular case.


Posted in Languages

Most interesting links of August ’12

Posted by Jakub Holý on August 31, 2012

Recommended Readings

  • How To Fail With Agile: Twenty Tips to Help You Avoid Success – a great overview of the ways people can make agile projects and initiatives fail – use them either to avoid failure or to make it certain, depending on your attitude towards agile
  • Learning Vim keys in an entertaining way by playing an on-line 2D game. A brilliant idea!
  • The Search for a Better BIG Data Analytics Pipeline – how to use big data, and analytics on it, in a company. Big data = lots of data, simple processing; deep analysis = a small, representative sample of the data (no need for all of it), advanced techniques. Big data can provide the input for deep analysis.

Links to Keep

  • Pat Kua’s Onboarding Strategies series – tips for getting new people onto your team as a tech lead and making them productive quickly. He also wrote the InfoQ article A Leaner Start: Reducing Team Setup Times based on the series. Some posts: Catalogue of patterns applied, Airing .. about feedback meetings, Pair programming, Preparation e-mail, Domain driven design and readable code, Tech huddles (what-we-learned session every 2nd day), Transparent technical debt, Visible architecture, Big vision business problem.

Useful Tools

  • OWASP Hatkit Proxy: TCP/HTTP proxy intended for developers that can intercept and modify requests and store parsed communication into MongoDB for later exploration. You can define what (not) to store/intercept with white- (black-)lists. Syntax highlighting, Swing UI etc.
  • Using Doxygen to understand a code base – Doxygen can generate a full cross reference of source code, class diagram, caller and call graphs for many languages including Java, PHP, C.

Interesting Quotes

Our standards by default exclude comments where possible replaced by representing as much intent as possible in the code itself. We focus on what it does and why. I’ve found “What” tends to be best represented by production code, whilst “Why” is better explained in tests because you can better represent different contexts there.
Pat Kua: Onboarding strategy: Domain driven design and readable code

Posted in General, Tools, Top links of month

Most interesting links of April (renewed)

Posted by Jakub Holý on April 30, 2011

Only two articles this month:

Computerworld: 22 free tools for data visualization and analysis

– a great review of different categories of data analysis and visualization tools. The tools I hadn’t known before (i.e. excluding R, Google Charts etc.) and found especially interesting:
Data web apps: Google Refine (data cleansing in a spreadsheet-like UI: clustering, data distribution overview, …), Google Fusion Tables (data => map etc., beta), Impure (rich & interactive data visualization via a drag-and-drop UI reminiscent of Yahoo Pipes; cons: lacking documentation, steep learning curve, check the teaser video).
JS libraries: Exhibit (JavaScript library by MIT for creating interactive visualizations e.g. for articles – incl. maps, timeplots, calendars etc., supporting filtering, sorting, searching), InfoVis Toolkit (JS lib for interactive data visualizations; pros: beautiful, cons: choice of visualization types is somewhat limited), Protovis (by Stanford University’s Visualization Group; one of the more popular JS libraries for turning data into visuals, great docs, robust), OpenLayers (example; customize & display a map, e.g. Open Street Map or Google), Polymaps (interactive maps with overlays)
GIS: OpenHeatMap (webapp, “astonishingly easy to create a color-coded map from many types of location data”)
Other: Timelines with TimeFlow (interesting desktop /java/ app, but alpha) or SIMILE Timeline widget (JS); Word clouds: IBM Word-Cloud Generator (free, desktop /java/); Gephi (graph/network visualization & exploration; desktop)

The Evolution of Test Driven Developers

– an entertaining and enlightening article with valuable links to resources that can help you get to the next evolutionary step. One of its benefits is that it helps you understand the true value of the different types of tests (some -> TDD -> Behaviour Driven Development (‘what’ rather than ‘how’) -> Acceptance Test Driven Development) and the shift from a technical to a business perspective along the way.

Posted in General, Testing, Tools, Top links of month