The Holy Java

Building the right thing, building it right, fast

Posts Tagged ‘clojure’

Notes From CodeMesh 2014

Posted by Jakub Holý on November 3, 2014

My concise highlights from CodeMesh 2014.

Philip Potter has very good CodeMesh notes as well, as usual.

TODO: Check out the papers mentioned in the NoSQL is Dead talk. (<- slides)

Tutorial: QuickCheck (John Hughes)

General
  • QC => Less code, more bugs found
  • QC tests are based on models of the system under test – with some kind of a simple/simplified state, legal commands, their preconditions, postconditions, and how they impact the state. The model is typically much smaller and simpler than the implementation code.
  • QuickCheck CI (quickcheck-ci.com) – free on-line service for running CI tests for a GitHub project. Pros: You don’t need QC/Erlang locally to play with it, it keeps a history of tests (so you never lose a failed test case), it shows test coverage also for failed tests so you see which code you can ignore when looking for the cause.
  • See John’s GitHub repo with examples – https://github.com/rjmh/
Shrinking (a.k.a. simplification)
  • Doesn’t just make the example shorter by leaving things out but tries a number of strategies to simplify the example, typically defined by the corresponding generators – f.ex. numbers are simplified to 0, lists to earlier elements (as in “(elements [3, 4, 5])”) etc.
  • You may implement your own shrinking strategies. Ex.: Replace a command with “sleep(some delay)” – so that we trigger errors due to timeouts. (A noop that just waits for a while is simpler than any op).
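
To make the idea of shrinking concrete, here is a minimal hand-rolled sketch in Python. It is not QuickCheck’s actual algorithm or API; the names shrink_int, shrink_list, minimize and the “sum below 10” property are all made up for illustration. Integers shrink toward 0, lists shrink by dropping or simplifying elements, and the failing input is greedily replaced by any simpler input that still fails.

```python
def shrink_int(n):
    """Yield simpler candidates for an integer, moving toward 0."""
    if n != 0:
        yield 0
        if abs(n) > 1:
            yield n // 2 if n > 0 else -(-n // 2)
        yield n - 1 if n > 0 else n + 1


def shrink_list(xs):
    """Yield simpler lists: drop one element, or shrink one element."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]
    for i, x in enumerate(xs):
        for smaller in shrink_int(x):
            yield xs[:i] + [smaller] + xs[i + 1:]


def minimize(failing, still_fails, shrink):
    """Greedily replace the failing input by a simpler one that still fails."""
    progress = True
    while progress:
        progress = False
        for candidate in shrink(failing):
            if still_fails(candidate):
                failing, progress = candidate, True
                break
    return failing


# Hypothetical broken property: "the sum of a list is always below 10".
def property_fails(xs):
    return sum(xs) >= 10


print(minimize([7, 12, 3, 40], property_fails, shrink_list))
```

The random failing case [7, 12, 3, 40] is reduced to a single-element list sitting right at the boundary of the bug, which is much easier to reason about than the original input.
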
Bug discovery
  1. Run QC; assuming a test failed:
  2. Instead of diving into the implementation, first use QC to check your hypothesis of what constitutes “bad input” by excluding the presumed bad cases – f.ex. “it fails when input has 8 characters” => exclude tests with 8 and rerun, if you find new failures you know the hypothesis doesn’t cover all problems – and you will perhaps refine it to “fails when it has a multiple of 8 chars” etc. We thus learn more about the wrong behavior and its bounds. Assumption we want to verify: No (other) tests will fail.
  3. Do the opposite – focus on the problem, i.e. modify the test case to produce only “bad cases”. Assumption we want to verify: all tests will fail.
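
The three steps above can be sketched roughly as follows. This is plain Python with random inputs rather than real QuickCheck, and buggy_encode, fails and failing_lengths are hypothetical names invented for the illustration; the “system under test” is deliberately broken for inputs whose length is a non-zero multiple of 8.

```python
import random
import string


def buggy_encode(text):
    """Hypothetical system under test: breaks when the length is a non-zero multiple of 8."""
    if text and len(text) % 8 == 0:
        raise ValueError("block boundary bug")
    return text.upper()


def fails(text):
    try:
        buggy_encode(text)
        return False
    except ValueError:
        return True


def failing_lengths(rng, keep, runs=2000, max_len=40):
    """Generate random inputs, keep those accepted by `keep`, return the lengths that fail."""
    lengths = set()
    for _ in range(runs):
        size = rng.randint(0, max_len)
        text = "".join(rng.choice(string.ascii_lowercase) for _ in range(size))
        if keep(text) and fails(text):
            lengths.add(len(text))
    return lengths


rng = random.Random(1)

# Step 2, hypothesis "it fails for 8 characters": exclude those and expect no failures.
remaining = failing_lengths(rng, keep=lambda t: len(t) != 8)
print(sorted(remaining))  # anything printed here means the hypothesis is too narrow

# Refined hypothesis "fails for a multiple of 8 characters": now nothing else should fail.
assert failing_lengths(rng, keep=lambda t: len(t) % 8 != 0) == set()

# Step 3, the opposite: generate only the presumed bad cases and expect all of them to fail.
assert all(fails("x" * n) for n in range(8, 41, 8))
```
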
QC vs. example-based testing

QC code tends to be 3-6× smaller than the implementation (partly thanks to the conciseness of Erlang) and fairly simple.

The case of Volvo: 3k pages of specs, 20 kLOC QC, 1M LOC C implementations, found 200 bugs and 100 problems (contradictions, unclarities) in the specs. It took 2-3 years of working on it on and off.

Erlang dets storage race conditions: 100 LOC QC, 6 kLOC Erlang impl.

Testing stateful stuff

Invoke different API calls (“commands”) until one of the presumably legal calls fails due to an accumulated corrupted state. This is an iterative process where we evolve our model of the system – commands, their preconditions (when they can be legally invoked), postconditions, and our representation of the state.

Ex.: Testing of a circular queue. Commands: push (legal on non-full queue), get (legal on non-empty), create new => generates sequences of news, pushes and gets.
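
A rough Python sketch of such a model-based test (not the Erlang QuickCheck state machine API; CircularQueue and run_random_commands are names I made up): the model is a plain list, preconditions decide which commands are legal in the current state, and the postcondition of get compares the real result with the model. Each run plays the role of the “create new” command by starting with a fresh queue.

```python
import random


class CircularQueue:
    """System under test: a fixed-capacity ring buffer."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0   # index of the oldest element
        self.size = 0

    def push(self, x):
        self.buf[(self.head + self.size) % self.capacity] = x
        self.size += 1

    def get(self):
        x = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        self.size -= 1
        return x


def run_random_commands(rng, capacity=3, steps=50):
    """Run one random command sequence, checking postconditions against the model."""
    queue, model = CircularQueue(capacity), []   # the model is a plain list
    for _ in range(steps):
        legal = []
        if len(model) < capacity:   # precondition of push: queue not full
            legal.append("push")
        if model:                   # precondition of get: queue not empty
            legal.append("get")
        command = rng.choice(legal)
        if command == "push":
            value = rng.randint(0, 99)
            queue.push(value)
            model.append(value)
        else:
            # postcondition: get returns what the model says is the oldest element
            assert queue.get() == model.pop(0)
    return True


rng = random.Random(2014)
assert all(run_random_commands(rng) for _ in range(200))
```

An off-by-one in the index arithmetic of push or get would typically only show up after the buffer wraps around, which is exactly the kind of accumulated state that random command sequences reach and hand-written examples tend to miss.
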

Testing race conditions

Precondition: Run on a multicore PC or control the process scheduler.

  • There are many possible correct results (valid interleavings) of parallel actions => impractical to enumerate and thus to test with example-based tests
  • Correct result is such that we can order (interleave) the concurrently executed actions such that we get a sequential execution yielding the same result. F.ex. an incorrect implementation of a sequence number generator could return the same number to two concurrent calls – which is not possible if the calls were done sequentially.
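
The “can it be explained by some sequential execution” check can be sketched in Python like this. It is a brute-force illustration for a handful of calls, not how PULSE or QuickCheck implement it, and SequentialCounter and explainable_sequentially are hypothetical names.

```python
from itertools import permutations


class SequentialCounter:
    """Model: what the generator does when calls run one at a time."""

    def __init__(self, start=0):
        self.value = start

    def next(self):
        self.value += 1
        return self.value


def explainable_sequentially(observed, start=0):
    """True if some ordering of the calls, run one by one, gives each call its observed result.

    `observed` maps a call id to the value that the concurrent call returned.
    """
    for order in permutations(observed):
        model = SequentialCounter(start)
        if all(model.next() == observed[call] for call in order):
            return True
    return False


# Three concurrent calls that got 2, 1 and 3: fine, the order b, a, c explains it.
print(explainable_sequentially({"a": 2, "b": 1, "c": 3}))
# Two concurrent calls that both got 1: no sequential order can produce that.
print(explainable_sequentially({"a": 1, "b": 1}))
```
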
Testing data structures

Map the DS to a simpler one and use that as the model – f.ex. a list for a tree (provided there is a to_list function for the tree).
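
A small Python sketch of the idea, assuming an unbalanced binary search tree of unique keys as the data structure under test (Node, insert, to_list and model_insert are illustrative names, not from the tutorial): every operation is applied both to the tree and to a sorted list, and to_list must make them agree after each step.

```python
import random


class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def insert(node, key):
    """Insert key into a binary search tree of unique keys; return the root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node


def to_list(node):
    """In-order traversal: maps the tree to the simpler model."""
    if node is None:
        return []
    return to_list(node.left) + [node.key] + to_list(node.right)


def model_insert(xs, key):
    """The same operation on the model: a sorted list without duplicates."""
    return sorted(set(xs) | {key})


rng = random.Random(42)
for _run in range(200):
    tree, model = None, []
    for _step in range(rng.randint(0, 30)):
        key = rng.randint(0, 20)
        tree = insert(tree, key)
        model = model_insert(model, key)
        assert to_list(tree) == model   # tree and model must agree after every step
```
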

Tutorial: Typed Clojure (Ambrose Bonnaire-Sergeant)

Note: The documentation (primarily introductory one) could be better

Resources
Defining types
  1. separately: (ann ..)
  2. around: wrap in (ann-form <defn> <type def.>)
  3. inside: use t/fn, t/defprotocol etc.
Gradual introduction of Typed Clojure
  • wrap everything in (t/tc-ignore …)
  • for unchecked fns you depend on, add (ann ^:no-check somefn […])
  • If you stare at a type error, consider using contracts (prismatic/Schema or pre/post conds etc.)
Other

Keynote: Complexity (Jessica Kerr, Dan North)

  • Always have tasks of all three types: research (=> surface complexity), kaizen (cont. improvement, improvement of the imprv. process), coding – these 3 interleave the whole time
  • A team needs skills in a number of areas, it isn’t just coding – evaluation of biz value delivered, monitoring, programming, testing, deployment, DB, FS, networks, … .

Keynote: Tiny (Chad Fowler)

Keep things tiny to be efficient (tiny changes, tiny teams, tiny projects, …).

  • Research by armies and in SW dev [TODO: find the two slides / qsm, Scrum] shows that teams of max 5-6 work best
    • Teams of 7+ take considerably more time and thus money (5× more according to one study) to complete the same thing
    • => small, autonomous teams with separate responsibilities (decomposition, SRP etc. FTW!)
  • Human capacity to deal with others is limited – one company creates a new department whenever size exceeds 100
  • Big projects fail; Standish CHAOS report – only about 10% of larger projects succeed compared to nearly 80% of small ones (summed together: 39% succeed)
  • Note: 1 month is not a short iteration

Distributed Programming (Reid Draper)

RPC is broken

- it tries to pretend a remote call is the same as a local one but:

  • what if the call never returns?
  • the connection breaks? (has the code been executed or not yet?)
  • what about serialization of arguments (file handles, DB conn.,…)

It ignores the special character of remote code and the 8 fallacies of distributed computing.

Message passing

Message passing is better than RPC. There is also less coupling as the receiver itself decides what code to call for a specific message.
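
A tiny in-process Python sketch of that decoupling (the Receiver class and the message shapes are invented for illustration; a real system would send the messages over a network or a broker): the sender only drops plain data into a mailbox, and the mapping from message type to code lives entirely on the receiving side.

```python
import queue


class Receiver:
    """Owns its mailbox and decides which handler runs for each message type."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.log = []
        # The mapping from message type to code is private to the receiver.
        self.handlers = {
            "greet": lambda body: self.log.append("hello " + body["name"]),
            "add": lambda body: self.log.append(body["x"] + body["y"]),
        }

    def process_pending(self):
        while not self.mailbox.empty():
            message = self.mailbox.get()
            handler = self.handlers.get(message["type"])
            if handler is None:
                # Unknown messages are not fatal for the receiver.
                self.log.append("ignored " + message["type"])
            else:
                handler(message["body"])


receiver = Receiver()
# The sender only puts plain data into the mailbox and does not wait for a result.
receiver.mailbox.put({"type": "greet", "body": {"name": "CodeMesh"}})
receiver.mailbox.put({"type": "add", "body": {"x": 2, "y": 3}})
receiver.mailbox.put({"type": "resize", "body": {}})
receiver.process_pending()
print(receiver.log)
```

The sender never names a function on the receiver, so the receiver can rename, split, or drop handlers without the sender noticing, and a message it does not understand is simply logged instead of breaking a call.
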

Bloom

The Bloom lang from the BOOM research project explores new, better ways of distributed programming. Currently implemented as a Ruby DSL.

From its page (highlight mine):

Traditional languages like Java and C are based on the von Neumann model, where a program counter steps through individual instructions in order. Distributed systems don’t work like that. Much of the pain in traditional distributed programming comes from this mismatch:  programmers are expected to bridge from an ordered programming model into a disordered reality that executes their code.  Bloom was designed to match–and exploit–the disorderly reality of distributed systems.  Bloom programmers write programs made up of unordered collections of statements, and are given constructs to impose order when needed.

Correctness testing of concurrent stuff
  • Unit testing unsuitable – there are just too many combinations of correct results and you can only test the cases the dev can think of
  • => generate the tests – property-based testing / QuickCheck
  • PULSE – an addon to property-based testing that tries to trigger concurrency problems by using a scheduler that tries different interleavings of actions (randomly but repeatedly) [Erlang]
  • Simulation testing – Simulant
Benchmarking

Beware the effects of GC, page cache, cronjobs (e.g. a concurrently running backup), SW updates => running a simple load test for a few minutes is not enough.

Cheats & Liars: The Martial Art of Protocol Design (Pieter Hintjens)

Pieter is the brain behind AMQP, ZeroMQ and EdgeNet (protocols for anonymous, secure, peer-to-peer internet). He has shared great insights into designing good protocols, the dirty business surrounding protocols and standardization, and troll-proof organization of communities (as a self-organizing, distributed team).

More: See Ch.6 The ØMQ Community in the online The ZeroMQ Guide – for C Developers. (He has also other interesting online or paid books.)

  • Protocol is a contract for working together
  • It should be minimalistic and specific, name the participants, …
  • Protocols and their standardization are prey to “psychopathic” organizations that want to hijack them for their own profit (by pushing changes that benefit them, taking over the standardization process, …) (Pieter has experienced it e.g. with AMQP; these trolls always show up). It’s advantageous to take control of a successful protocol so that you can make money off it or build stuff on it and sell that. Examples:
    • Microsoft MS Doc XML – this “open” spec f.ex. reportedly defines that one function works “as Word 95”
    • A company pushing changes that nobody else really understands, thus undermining compatibility of implementations
    • Pushing such changes that an implementor can claim compliance to the standard yet implement it so that his products only work with his products
    • Crazy/proprietary protocol extensions, patenting/trademarking/copyrighting the spec (e.g. TM on Bluetooth)
  • Hijacking-safe protocol creation process (beware “predatory maliciousness”):
    • The spec is GPL => nobody can capture it (e.g. ZeroMQ)
    • The community has clear rules and deals with trolls by kicking them out
    • There is a good process for evolving the spec
  • How to spec a protocol?
    • Start with a very, very small and simple contract – only add things that you desperately need – e.g. ZeroMQ v1 had no versioning, security, metadata (versioning added in v2, metadata in v3, security later). You don’t know what you really need until you try it. F.ex. even the original AMQP has 60-75% waste in it!!!!
    • Do very slow and gradual evolution
    • Layering is crucial – keep your protocol on one layer, only specify relevant things, leave the other layers for other specs so they can evolve and age at different speeds; the more there is in a spec, the sooner something in it will be outdated (a pizza contract says nothing about the kitchen, f.ex.)
  • Community and cooperation (See the Ch.6 The ØMQ Community mentioned above.)
    • community needs clear rules to keep trolls away (and they always pop up)
    • don’t just absorb the damage trolls do, ban them
    • self-org., decentralized team

PureScript (Bodil Stokke)

Working with larger JavaScript apps (> 100 LOC :-)) is painful, primarily due to the lack of type checking, which requires one to take a lot of care when changing code so that nothing breaks at runtime. TypeScript is a possible solution but it still feels limited.

PureScript is a very Haskell-like language compiled to JS. It is a pure functional lang, effects are performed only via the Effect Monad (Eff). It is pragmatic w.r.t. interoperability – it is possible to embed JS code and just add a signature to it, the compiler will trust it.

Moreover, you can use property-based testing with QuickCheck and Functional Reactive Programming with Bodil’s Signal library. Isn’t that wonderful?!

See the PureScript Is Magic code at GitHub, primarily the 150 LOC Main.purs.

<3

Category theory notes:

  • Semigroup is a domain with an associative operation (e.g. ints with +)
  • Monoid is a semigroup with a unit element, i.e. one where “element operation unit = element”, such as 0 for + or 1 for *.
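
To check my understanding, a minimal Python sketch of the two notions (the Monoid class and its concat method are my own illustrative names, unrelated to the PureScript type classes): the semigroup part is the associative combine operation, the monoid part adds the unit element, which is what makes folding an empty list well defined.

```python
from functools import reduce


class Monoid:
    """A binary operation together with its unit (identity) element."""

    def __init__(self, combine, unit):
        self.combine = combine
        self.unit = unit

    def concat(self, items):
        # Thanks to the unit element, folding an empty list is well defined.
        return reduce(self.combine, items, self.unit)


int_sum = Monoid(lambda a, b: a + b, 0)
int_product = Monoid(lambda a, b: a * b, 1)

for m in (int_sum, int_product):
    for a, b, c in [(2, 3, 4), (-1, 0, 7)]:
        # Semigroup law: the operation is associative.
        assert m.combine(m.combine(a, b), c) == m.combine(a, m.combine(b, c))
        # Monoid law: combining with the unit changes nothing.
        assert m.combine(a, m.unit) == a == m.combine(m.unit, a)

print(int_sum.concat([1, 2, 3]), int_product.concat([]))
```
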

Megacore, Megafast, Megacool (Kevin Hammond)

Interesting research project ParaPhrase for parallelization of code through automatic refactoring and application of any of the supported topologies (Farm, Pipeline, Map, …) – ParaPhrase-ict.eu and www.project-advance.eu (in general the promises of automation regarding fixing software development problems have hugely underdelivered but still something might be possible). In some cases their solution managed to do in hours what a developer did in days.

Quote Bob Harper:

The only thing that works for parallelism is functional programming

PS: C++17 is going to have support for parallel and concurrent programming.

Categories for the Working Programmer (Jeremy Gibbons)

An elementary intro into category theory in 10 points, yet I got immediately lost. It might be worth studying the slides and looking for more resources at the author’s uni page and not-so-active blog Patterns in Functional Programming.

NoSQL is Dead (Eric Redmond)

Main message: There are just too many differences between NoSQL DBs for this expression to be meaningful.

Lobby talks

Hacking people

I had an inspiring lunch chat with Chad and a Polish lady whose name I unfortunately don’t know. Their companies do fascinating stuff to leverage the potential of their humans – one has replaced top-down management wrt. projects with an environment where there are clear objectives (increase monthly active users) and the freedom to come up with ideas to contribute to them, recruit other people for the idea and, if successful, go implement it (while continually measuring against the objectives). Clearly enough it is not easy, some people have trouble trying to manage everything or doing what they believe in without checking the real impact on the objectives etc. but it already provides more value than before. This has been going on for just a few months so hopefully when it settles more we will hear about their experience.

The other company realized that people are different (wow! how could our industry ignore that?!) and started to do psychological profiling of employees to understand what type of team member they are – a driver, a worker, a critic who is always hunting for possible issues and problems etc. And they compose teams so that they have the right mix of different personalities to avoid both insurmountable conflicts and the risks of group-think.

I believe this is the future of our industry – really understand people and “hack” our organizations to leverage that for greater happiness and better results.

Non-talks

  • Jessica Kerr: Simulation of team work and the effect of (no) slack

    - what happens when you let your programmers crunch work without any slack time? And when you introduce slack? Jessica has made this Scala simulation to produce the results we would expect – much more even production in the slack case, a lot of rework after deploying features in the non-slack version. Not at all scientific but very nice when you want to *show* your higher-ups what happens when you do the former or the latter. Some people respond much better to visual stimuli (even though totally made to conform to the message you want to get across) than to tons of theory.

  • Aphyr – [313] Strong consistency models – “strong consistency” is a much broader term than I expected and not all consistency models are so consistent :-) Check out especially this consistency family tree image.

Posted in Languages, Testing, Tools

Most interesting links of July ’14

Posted by Jakub Holý on July 31, 2014

Recommended Readings

  • Video: The Unreasonable Effectiveness of Dynamic Typing for Practical Programs – a static-typing zealot turned friend of dynamic typing under the experience of real-world projects and problems shares thoughts about the limits of type systems (f.ex. both energy and torque are measured in N*m yet cannot be combined) and their cost: according to Hanenberg’s experiment about static and dynamic typing, the time required to handle the type checker > the time to debug the errors that it would have caught. According to a review of issues at GitHub, only 2% of reported issues for JS, Clojure, Python, and Ruby are type errors and for a large, closed-source Python project type/name/attribute errors were 1%. “I have come to believe that tests are a much better investment [than static typing].” Rigorous type system/model => limited applicability (due to different needs) <=> modelling some things with types doesn’t cut it. “Are the costs of static typing offset by a few percent fewer defects? Is agility more important than reliability?” “Static types are anti-modular” – too tight a coupling. “Static type checking comes at the expense of complexity, brittleness and a tendency to monoliths.”
    (Personally I miss static typing – but that is perhaps due to having relied on it for so long.)
  • ThoughtWorks Tech Radar July 2014 (pdf): f.ex. Ansible in Adopt, Masterless Chef/Puppet in Trial, Machine image as a build artifact: Trial, PostgreSQL for NoSQL: Trial, Adopt Dropwizard (Rest 4 Java), Go lang, Reactive Extensions across langs [JH: RxJava, RxJS, ..]; Assess Property-based (generative) testing, … . Other highlights: Mapbox (open-source mapping platform), OpenID Connect as a less complex and thus promising alternative to SAML/generic OAuth, Pacto/Pact for Consumer-Driven Contracts (contract => simulate consumers/stub producers => test your REST clients against the contract so that the rest of tests can assume it is correct and use a stubbed client), Swagger for REST documentation.
  • The madness of layered architecture – a nice critique of over-designed “enterprise” apps, why that is a problem (SRP, cost of code, unclear where to do a change, ….), why it is different from the successful layered network stack of Ethernet/IP/TCP/… (because in an app, all layers are on the same level of abstraction); bottom line: do not add a layer unless you have a really good reason (hint: the advice of a consultant/speaker does not count as one)
  • Key Takeaway Points and Lessons Learned from QCon New York 2014 (via @RiczWest) – “[..] deep insights into real-world architectures and state of the art software development practices, from a practitioner’s perspective.” – architectures of Fb, Foursquare etc., continuous delivery, creating culture, real world functional programming, … .
  • Questioning the Lambda Architecture (J. Kreps of LinkedIn) – maintaining the same processing in two very different systems (one batch, one stream & real-time) is a maintenance nightmare => improve the RT/stream processing to handle re-processing and thus both (using e.g. Kafka to store the data and thus be able to re-play them)
  • Google: Checklist for mobile website improvement
  • Google Dataflow and the transition from batch to stream processing – G. Dataflow might not be a Hadoop killer due to requiring that the data are in the Google Cloud but the trend is clear, going away from batch processing to more stream-oriented processing with tools like Spark, Flume etc. that are faster thanks to using memory better and more flexible thanks to not being limited to the rigid two-stage model of map-reduce. (Reportedly, Google – the one that made Map-Reduce popular – doesn’t use it anymore.)
  • OS X: Extract JDK to folder, without running installer

Society, economics, people

  • HBR: The Power of Meeting Your Employees’ Needs – people feel better, perform better, are more engaged and likely to stay longer (=> profitability) when 4 basic needs are met: physical [energy] renewal (=> give opportunity, encourage to take a nap or do whatever that helps), value – feeling of being valued by the company, ability to focus, purpose (i.e. serving something larger than ourselves). “What’s surprising about our survey’s results is how dramatically and positively getting these needs met is correlated with every variable that influences performance. It would be statistically significant if meeting a given need correlated with a rise of even one or two percentage points in a performance variable such as engagement, or retention. Instead, we found that meeting even one of the four core needs had a dramatic impact on every performance variable we studied. [..] when all four needs are met, the effect on engagement rises from 50% for one need, to 125%. Engagement, in turn, has been positively correlated with profitability. [..] employers with the most engaged employees were 22% more profitable than those with the least engaged employees.
    [..] those who were encouraged to take intermittent breaks reported they were 50% more engaged, more than twice as likely to stay with the company, and twice as healthy overall. Valuing and encouraging renewal requires no financial investment. What it does require is a willingness among leaders to test their longstanding assumption that performance is best measured by the number of hours employees put in – and the more continuous the better – rather than by the value they generate, however they choose to do their work.”
  • The Pitchforks Are Coming… For Us Plutocrats – increasing inequality will eventually lead to the collapse of the system (at least that is what history teaches). It is people – primarily the middle class – that are the source of the wealth of the society, they produce and also consume most. Thus it is necessary to support them …
  • Why the U.S. Corporate World Became ‘A Bull Market for Corruption’ – Enron, GM, Goldman Sachs, … – we hear more and more the names of large corporations in the context of negligence and misuse of their customers and investors. It seems that leadership (in the lead by example sense) has died out as well as the feeling of responsibility when one wields power over her customers/investors/markets. Instead, we have the me-first and money at any cost thinking. Organizations are designed to shield higher-ups from responsibility (meetings with no records…). High pay for short term gains, failure to punish high ranking people.
  • (US) This is what happened when I drove my Mercedes to pick up food stamps – the experience of life in poverty after dropping down from $125k to $25k/year in two months due to childbirth, real estate market crash, and loss of a job. “Using the coupons was even worse. The stares, the faux concern, the pity, the outrage – I hated it. [..] That’s the funny thing about being poor. Everyone has an opinion on it, and everyone feels entitled to share. [..] Poverty is a circumstance, not a value judgment. I still have to remind myself sometimes that I was my harshest critic. That the judgment of the disadvantaged comes not just from conservative politicians and Internet trolls. It came from me, even as I was living it.”

Clojure Corner

  • Isomorphic Clojure[Script], part I – enjoying all the benefits of Single-Page Apps while avoiding their drawbacks (SEO, slower page load, accessibility etc.) – a SPA that can be pre-rendered by the server. Using Om/React, JDK8 with the Nashorn JS engine, core.async, Sente (bi-directional HTTP/WS communication over core.async) and Clojure in the JVM, ClojureScript in Nashorn in the JVM, and ClojureScript in the browser. Example app: Omelette.
  • clj-crud: a relatively feature-complete example of a Clojure web app (4/2014; GitHub) – using Component, Liberator (REST), Datascript + Quiescent (=> React.js), Enlive, Friend etc. including a couple of unit-test and ui-test libraries
  • Conclujon: Acceptance testing tool (α), Clojure reimplementation of Concordion, a beautifully simple ADD tool
  • dynalint: human-friendly error messages during dev – Clojure typically provides unhelpful and sometimes confusing error messages thrown from somewhere deep in the implementation, such as “Don’t know how to create ISeq from: java.lang.Long at clojure.lang.RT.seqFrom” while we want something like “First argument to clojure.core/first must be seqable: 1 (instance of class java.lang.Long)” – and that’s what Dynalint does. In the tradition of defensive programming, it adds checks and good error messages to Vars at runtime. You typically run it only during dev, triggering it from the REPL.
  • Grimoire (Reid McKenzie) – a more up-to-date replacement for ClojureDocs
  • Adam Bard’s Top Clojure Articles for beginners and intermediate Clojure devs – f.ex. Five Mistakes Clojure Newbies Make, Acceptable Error Handling in Clojure, Clojure Reducers for Mortals
  • J. Wilk: Isolating External Dependencies in Clojure – a nice overview of the options and their pros and cons – with-redefs, alter-var-root, Midje (using alter-var-root in a more controlled manner), higher-order-functions (#1!) etc.
  • philandstuff’s detailed notes from Euroclojure 2014

Tools/Libs

  • NixOS (via @bodil) – a new interesting “purely functional” Linux distribution – system configuration is fully declarative (think of Puppet/Chef) and it is always trivial to roll back, you can have multiple versions of a package, users can install non-global SW
  • InfluxDB – time series, metrics, and events DB that scales; unlike Graphite with its single value, it can store richer data; additional highlights: authorization for individual data, roll-up/clean old data, https API. Written in Go.

Posted in General, Languages, Top links of month

Most interesting links of June ’14

Posted by Jakub Holý on June 30, 2014

Recommended Readings

  • The emperor’s new clothes were built with Node.js – I know sadly little about Node.js but this goes against the hype and is thus interesting. So what does Node.js give us? Performance 1-5x slower than Java [like Clojure] according to the Benchmarks Game (contrary to other benchmarks with the opposite result as mentioned in the comments), use of a single CPU/core on our multi-cpu, multi-core machines, callback hell. At the same time, there are good non-blocking servers available in other languages (Clojure’s http-kit, Vert.x, etc.) (Update: From the comments it seems that f.ex. the “callback hell” situation is getting better with 0.11, fibers and other things I do not know anything about. Also Sandro has a nice anti-comment (No. 36).)
    The Node.js Is Bad Ass Rock Star Tech 5 min video is a nice companion :)
  • The Expert (Short Comedy Sketch)  (7 min) – you’ve certainly seen this one but I had to put it here; a young engineer is hammered into being an “Of course I can do it, I am an expert” ‘expert/consultant’ during a business meeting. Maybe you too have experienced a dialog with the business where your true expert opinion was crushed by the business people’s insistence on their absurd requirements?
  • Reset The Net – Privacy Pack – privacy-enhancing apps for PC/mobile
  • The Dyslexic Programmer (via Kent Beck) – interesting read about quite a different way to perceive and think about code, the advantages of IDEs.
  • It’s Here: Docker 1.0 => more stable from now on
  • Kent Beck: Learning About TDD: The Purpose of #isTDDDead – what is the purpose and value of TDD? Where are the limits of its value? “I recognize that TDD loses value as tests take longer to run, as the number of possible faults per test failure increases, as tests become coupled to the implementation, and as tests lose fidelity with the production environment.”
  • Failure & Cake: A Guide to Spotify’s Psychology of Success – want to be innovative and successful? Learn to embrace failure, nurture the “growth mindset” (failure as opportunity to improve) rather than the “fixed mindset” (I do not learn and every failure shows I have no value). Read this if you want your org to be a better place to work!

Non-tech

  • LSD – The Problem-Solving Psychedelic – I never knew that drugs could be used for something positive, with an incredible effect. Are you stuck with a tech/design/art problem? Try LSD! :-)
  • The French are right: tear up public debt – most of it is illegitimate anyway – “Debt audits show that austerity is politically motivated to favour social elites. [..] 60% of French public debt is illegitimate” – not improving the lives of people but of those in power/the rich. Time to reconsider this debt business and ways to make our system better?
  • Forbes: Why Financialization Has Run Amok – Wall Street is the king and companies do everything to look better in its eyes – including giving up on opportunities. The might of the finance sector is destructive to our economy and distorts it, away from producing more value to making financial institutions richer, away from (value) creative activities to distributive ones. The article describes the problem and proposes a solution including limiting the size and leverage of banks, taxing financial transactions etc. Example of the effects: “[..] a cabal of senior IBM executives and the managers of some big investment firms got together and devised a five-year scheme – IBM’s Roadmap 2015 – for increasing IBM’s earnings per share – and their own compensation – through measures that are not only increasing earnings per share but also steadily crippling IBM’s ability to innovate and compete [..]”
  • Why Can’t We All Just Get Along? The Uncertain Biological Basis of Morality – very interesting criticism of “morality” that is mostly based on emotions and thus contradictory, a good argument for utilitarian morality [not that it hasn’t its own challenges]. According to the author, many conflicts are not primarily due to divergent values but due to different interpretation of the reality and history (such as “who has right to this land?”). People suffer “[..] from a deep bias – a tendency to overestimate their team’s virtue, magnify their grievances, and do the reverse with their rivals.” “This is the way the brain works: you forget your sins (or never recognize them in the first place) and remember your grievances. [..] As a result, the antagonisms confronting you may seem mysterious, and you may be tempted to attribute them to an alien value system.” This leads to partial judgements that play very badly with another psychological feature – “Namely: the sense of justice – the intuition that good deeds should be rewarded and bad deeds should be punished.” “When you combine judgment that’s naturally biased with the belief that wrongdoers deserve to suffer, you wind up with situations like two people sharing the conviction that the other one deserves to suffer. Or two groups sharing that conviction. And the rest is history.” And “The most common explosive additive is the perception that relations between the groups are zero-sum – that one group’s win is the other group’s loss.” => “So maybe the first step toward salvation is to become more self-aware.”
    When you’re in zero-sum mode and derogating your rival group, any of its values that seem different from yours may share in the derogation. Meanwhile, you’ll point to your own tribe’s distinctive, and clearly superior, values as a way of shoring up its solidarity. So outsiders may assume there’s a big argument over values. But that doesn’t mean values are the root of the problem.
    Those who choose not to act in the trolley dilemma[..] are just choosing to cause five deaths they won’t be blamed for rather than one death they would be blamed for. Not a profile in moral courage!

Clojure Corner

  • The Case for Clojure (video, 5 min) – a short video arguing for Clojure as a good solution language based on its simplicity, power, and fun factor. There are many claims and few facts (as dictated by the short length) but it might be interesting for somebody.
  • CrossClj.info – cross-reference of many OSS Clojure projects – find all uses of a fn across the projects, all fns with a given name, all projects using ring, … . Search by fn, macro, var, ns, prj.
  • The Weird and Wonderful Characters of Clojure – ‘A reference collection of characters used in Clojure that are difficult to “google”.’

Tools/Libs

Posted in General, Languages, Tools, Top links of month

Review: Clojure for Machine Learning (Ch 1-3)

Posted by Jakub Holý on June 26, 2014

Packt Publishing has asked me to review their new book, Clojure for Machine Learning (4/2014) by Akhil Wali. Interested both in Clojure and M.L., I have taken the challenge and want to share my impressions from the first chapters. Regarding my qualification, I am a medium-experienced Clojure developer and have briefly encountered some M.L. (regression etc. for quantitative sociological research and neural networks) at the university a decade ago, together with the related, now mostly forgotten, math such as matrices and derivatives.

In short, the book provides a good bird’s-eye view of the intersection of Clojure and Machine Learning, useful for people coming from both sides. It introduces a number of important methods and shows how to implement/use them in Clojure but does not – and cannot – provide deep understanding. If you are new to M.L. and, like me, really like to understand things, you want to get a proper textbook or two to learn more about the methods and the math behind them and read them in parallel. If you know M.L. but are relatively new to Clojure, you want to skip all the M.L. parts you know and study the code examples and the tools used in them. To read it, you need only elementary knowledge of Clojure and need to be comfortable with math (if you haven’t worked with matrices, statistics, or derivatives and equations scare you, you will have a hard time with some of the methods). You will learn how to implement some M.L. methods using Clojure – but without deep understanding, without knowledge of their limitations and issues, and without a good overview of alternatives and the ability to pick the best one for a particular case.

Read the rest of this entry »

Posted in Languages | Comments Off

Most interesting links of May ’14

Posted by Jakub Holý on May 31, 2014

Recommended Readings

  • Monolith – from The Codeless Code – fables and koans for the SW engineer – the Monad monolith #Haskell #fun
  • http2 explained (pdf, 27 pages) – cons of http 1 (huge spec / no full impl., wasteful use of TCP <=> latency [hence spriting, inlining, concatenation, sharding]) => make it less latency sensitive, fix pipelining (issue a req before the previous one has finished), stop the need for an ever increasing # of connections, remove/reduce optional parts of http. Http2 is binary; multiple “streams” over 1 connection => much fewer conns, faster data delivery; header/data compression; [predictive] resource pushing. Inspired by SPDY. Chrome and Mozilla will only support it over TLS, yay! (see also Is TLS Fast Yet? [yes, it is]) Promise: faster, more responsive web pages & deprecation of http/1 workarounds => simplified web dev.

Special

  • exercism.io – crowd-sourced good code mentorship – get an exercise, implement it in any of the supported languages, submit and get feedback, repeat; when finished, you too can comment on the same exercise submitted by others while working on your next assignment. Languages include Clojure, JS, Scala, Python, Haskell, Go, Elixir, Java, and more.

Podcasts (FP & related)

  • Cognicast (also @ iTunes) – Clojure, FP, etc.
  • Functional Geekery (@ iTunes) – A podcast on Functional Programming, covering topics across multiple languages.
  • Mostly λazy…a Clojure podcast by Chas Emerick
  • Giant Robots Smashing into other Giant Robots – “a weekly technical podcast discussing development, design, and the business of software development”
  • Software Engineering Radio (@ iTunes) – “The goal is to be a lasting educational resource, not a newscast. Every two to four weeks, a new episode is published that covers all topics software engineering. Episodes are either tutorials on a specific topic, or an interview with a well-known expert from the software engineering world.”
  • EngineerVsDesigner – design insight (@ iTunes) – product design podcast – the latest digital design news, tips & tricks, Q&A, and an industry special guest

Other

Clojure Corner

Tools/Libs

  • ownCloud – your own Dropbox/Google Drive, run on your server – sharing files between devices / PCs / web, syncing calendar and contacts, collaborative editing of documents (ODF)
  • Mailpile – “A modern, fast web-mail client with user-friendly encryption and privacy features.”, to be self-hosted on a PC, Raspberry Pi, USB stick
  • Blackhole – role-based ssh proxy – an app that enables you to manage which users can ssh to which server as a particular user; from the users’ point of view this is an ssh proxy; useful if many people need access to many servers but you do not want to add them all as users on those servers.
  • Wuala – Secure Cloud Storage – Backup. Sync. Share. Access Everywhere. – Dropbox alternative, secure by default
  • fb-flo – Facebook’s live-coding tool
  • owncloud.org – self-hosted Dropbox-like service with calendar and contact sync and more

Favourite Quotes

Posted in General, Languages, Tools, Top links of month | Comments Off

Clojure/Java: Prevent Exceptions With “trace missing”

Posted by Jakub Holý on May 19, 2014

The other day I got this not very helpful exception from Clojure:

(cond (>= nil 1) :unreachable)
;=> NullPointerException [trace missing]

- no line number or anything to troubleshoot it.

It turns out it is not Clojure’s failure but a HotSpot optimization that can apply to NullPointerException, ArithmeticException, ArrayIndexOutOfBoundsException, ArrayStoreException, and ClassCastException. The remedy is to run the JVM with

-XX:-OmitStackTraceInFastThrow

From the Oracle JDK release notes:

The compiler in the server VM now provides correct stack backtraces for all “cold” built-in exceptions. For performance purposes, when such an exception is thrown a few times, the method may be recompiled. After recompilation, the compiler may choose a faster tactic using preallocated exceptions that do not provide a stack trace. To disable completely the use of preallocated exceptions, use this new flag: -XX:-OmitStackTraceInFastThrow.
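
The effect is easy to observe from plain Java as well. Below is a minimal sketch, not a definitive reproduction: the class and method names (FastThrowDemo, lengthOf, firstTraceLessThrow) are made up for illustration, and whether and when the stack trace disappears depends on the JVM, its version, and the JIT thresholds. It throws the same NullPointerException in a hot loop and reports the first iteration at which the caught exception had an empty stack trace.

```java
class FastThrowDemo {
    // Dereferences a null reference so that the JVM itself raises the
    // NullPointerException (hypothetical helper, for illustration only).
    static int lengthOf(String s) {
        return s.length();
    }

    // Returns the 1-based iteration at which a caught NPE first had an empty
    // stack trace, or -1 if every exception within maxIterations had a trace.
    static int firstTraceLessThrow(int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            try {
                lengthOf(null);
            } catch (NullPointerException e) {
                if (e.getStackTrace().length == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int n = firstTraceLessThrow(1_000_000);
        System.out.println(n < 0
                ? "every NPE carried a stack trace"
                : "stack trace first missing at iteration " + n);
    }
}
```

On a HotSpot server VM with default settings I would expect the trace to go missing after some thousands of iterations; started with -XX:-OmitStackTraceInFastThrow, the same program should report that every exception carried a trace.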

Many thanks to Ivan Kozik for the info!

Posted in Languages | Comments Off

Most interesting links of April ’14

Posted by Jakub Holý on April 30, 2014

Recommended Readings

  • The economics of reuse – developing code for reuse costs much more than developing it for a single need – it might cost 300% more to develop and save you 75% of the work when (re)using it instead of developing from scratch (if one of the factors goes down, the other one typically goes down too). Summary: “That means that to get any value from your reused component, you better have five or more reusers or you have to find a way to substantially improve the [reuse value factor] or [reusability cost factor]. Very smart people have failed to do this.”
  • Book in making: Reactive Design Patterns (1st ch free)
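
The “five or more reusers” figure in the first link above can be reconstructed with a little arithmetic. This is my reading of the numbers, not necessarily the article’s exact model: assume that building a component for reuse costs 3 extra units of work (300% more than the 1 unit a one-off would cost) and that each reuser saves 0.75 units. The class and method names are hypothetical.

```java
class ReuseEconomics {
    // Net gain, in units of "one from-scratch development", of building a
    // reusable component: what the reusers save minus the extra cost of
    // making it reusable. Both factors are assumptions taken from the summary.
    static double netBenefit(int reusers, double extraCost, double savingPerReuse) {
        return reusers * savingPerReuse - extraCost;
    }

    // Smallest number of reusers for which the net gain is strictly positive.
    static int minReusersForGain(double extraCost, double savingPerReuse) {
        if (savingPerReuse <= 0) {
            throw new IllegalArgumentException("reuse must save something");
        }
        int n = 1;
        while (netBenefit(n, extraCost, savingPerReuse) <= 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // 300% extra cost, 75% saved per reuse
        System.out.println(minReusersForGain(3.0, 0.75)); // prints 5
    }
}
```

With these assumptions four reusers merely break even (4 × 0.75 = 3), so the fifth is the first one that brings any value.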

Sharing data on the web

Clojure Corner

  • 8th Light: Combining Clojure and ClojureScript Libraries (3/2014) – really good and detailed article / tutorial using CLJX and platform-specific platform.clj[s] files to share code between Clojure and ClojureScript. It also recommends a file structure (src/(clj|cljs)/), demonstrates testing, discusses macro development, shows how to pack both into one jar.

Tools/Libs

Favourite Quotes

Posted in General, Languages, Top links of month | Comments Off

Clojure: How To Prevent “Expected Map, Got Vector” And Similar Errors

Posted by Jakub Holý on April 30, 2014

What my Clojure code is doing most of the time is transforming data. Yet I cannot see the shape of the data being transformed – I have to know what the data looks like on the input and hold a mental model of how it changes at each step. But I make mistakes. I make mistakes in my code so that the data does not correspond anymore to the model it should follow. And I make mistakes in my mental model of what the data currently looks like, likely leading to a code error later on. The end result is the same – a not very helpful exception at some later step regarding the wrong shape of data. There are two problems here: The error typically provides too little useful information and it usually manifests later than where the code/model mistake actually is. I therefore easily spend an hour or more troubleshooting these mistakes. In addition to that, it is also hard to read such code because a reader lacks the writer’s mental model of the data and has to derive it herself – which is quite hard especially if the shape of the input data is not clear in the first place.

I should mention that I of course write tests and experiment in the REPL but I still hit these problems so it is not enough for me. Tests cannot protect me from having a wrong model of the input data (since I write the [unit] tests based on the same assumptions as the code and only discover the error when I integrate all the bits) and even if they help to discover an error, it is still time-consuming to find the root cause.

Can I do better? I believe I can.

Read the rest of this entry »

Posted in Languages | 8 Comments

Most interesting links of March ’14

Posted by Jakub Holý on March 31, 2014

Recommended Readings

Clojure Corner

  • Timo Mihaljov’s Pimp My REPL (3/2014) – really great tips – user.clj, :dev profile, user-wide config in .lein/profiles.clj, tools.namespace, making funs available everywhere & more via Vinyasa, form println with Spyscope, debug-repl, difform, clj-ns-browser

Tools/Libs

  • clj-ds – Clojure immutable datastructures extracted from Clojure and made easier for use directly in Java

Favourite Quotes

Posted in General, Top links of month | Comments Off

Most interesting links of February ’14

Posted by Jakub Holý on February 28, 2014

Recommended Readings

Development

  • Nathan Marz: Principles of Software Engineering, Part 1 – Nathan has worked with Big Data at Twitter and other places and really knows the perils of large, distributed, real-time systems and this post contains plenty of valuable advice for making robust, reliable SW. Main message: “there’s a lot of uncertainty in software engineering”; every SW operates correctly only for a certain range of inputs (including volume, HW it runs on, …) and you never control all of them so there always is an opportunity for failure; you can’t predict what inputs you will encounter in the wild. “[..] while software is deterministic, you can’t treat it as deterministic in any sort of practical sense if you want to build robust software.” “Making software robust is an iterative process: you build and test it as best you can, but inevitably in production you’ll discover new areas of the input space that lead to failure. Like rockets, it’s crucial to have excellent monitoring in place so that these issues can be diagnosed.” From the content: Sources of uncertainty (bugs, humans, requirements, inputs, ..), Engineering for uncertainty (minimize dependencies, lessen % of cascading failure [JH: -> Hystrix], measure and monitor)
    • Suffering-oriented programming is certainly also worth reading (summary: do not start with great designs; only start generalizing and creating libs when you have suffered enough from doing things more manually and thus learned the domain; make it possible > make it beautiful > make it fast, repeat)
  • ThoughtWorks open-sources Go, continuous delivery platform – good bye, Jenkins! – better support for pipelines etc., see features and elementary concepts
  • Cloud Design Patterns: Prescriptive Architecture Guidance for Cloud Applications (recommended by @markusbk so it must be good); Patterns: Cache-aside, Circuit Breaker, Compensating Transaction, Competing Consumers, Compute Resource Consolidation, Command and Query Responsibility Segregation (CQRS), Event Sourcing, External Configuration Store, Federated Identity, Gatekeeper, Health Endpoint Monitoring, Index Table, Leader Election, Materialized View, Pipes and Filters, Priority Queue, Queue-based Load Leveling, Retry, Runtime Reconfiguration, Scheduler Agent Supervisor, (data) Sharding, Static Content Hosting (-> CDN), Throttling, Valet Key.
    Guidance topics: Asynchronous Messaging Primer, Autoscaling, Caching, Compute Partitioning, Data Consistency Primer, Data Partitioning, Data Replication and Synchronization, Instrumentation and Telemetry, Multiple Datacenter Deployment, Service Metering
  • MOOC Functional programming with Clojure at Uni of Helsinki – to get started you need, I suppose, to follow the “Material and course content” – essentially read the text for each chapter, clone its repo, submit pull requests to get your work graded
  • Jez Humble: The Case for Continuous Delivery – read to persuade managers about CD: “Still, many managers and executives remain unconvinced as to the benefits [of CD], and would like to know more about the economic drivers behind CD.” CD reduces waste: “[..]online controlled experiments (A/B tests) at Amazon. This approach created hundreds of millions of dollars of value[..],” reduces risks: “[..] Etsy, has a great presentation which describes how deploying more frequently improves the stability of web services.” CD makes development cheaper by reducing the cost of non-value-adding activities such as integration and testing. F.ex. HP got dev. costs down by 40%, dev cost/program by 78%

Web

  • Client-side messaging in JavaScript – Part 3 (anti-patterns) (via @ruudud so it must be worth reading)
  • Request Quest (via @jraregris) – entertaining and educational interactive quiz regarding what does (not) trigger a request in browsers and differences between them (and deviations from the standard) – img, script, css, etc.
  • The REST Statelessness Constraint – a nice post about statelessness in REST if you, like me, don’t know REST so much in depth; highlights: Statelessness (and thus the requirement for clients to send their state with every request) is a trade-off crucial for web-scale and partially balanced by caching – while typical enterprise apps have different needs (more state, less scale) so REST isn’t a perfect match. Distinguish application (client-side) and server (resources) state. Using a DB to hold the state still violates the requirement. Use links to transfer some state (e.g. contain a link to fetch the next page of records in the response).
  • Functional Programming in Javascript – an interactive tutorial teaching map, filter, mergeAll, reduce, zip

Other

  • CodeMesh 2013 presentations – good stuff! F.ex. Refactoring Functional Programs: Past and Future, Distribution, Scale and Flexibility with ZeroMQ, Deepak Giridharagopal on Puppet, Immutable Deployments, Analyzing Systems with PuppetDB, Francesco Cesarini and Viktor Klang on the Reactive Manifesto and more
  • Cognitive Biases in Times of Uncertainty – people under pressure/stress start to focus on risks over gains and on the (very) short term rather than the long term and thus also adopt a zero-sum mindset (i.e. if sb. else wins, I lose) => polarization into us vs. them and a focus on getting as big a piece of the cake as possible at any price, now; dismissal of collaboration. With the accelerating rate of change in society due to technology, this is exactly what is happening. How to counter it? Create more positive narratives (views of the world) than the threat-based ones, support them via short-term gains. Bottom line: each of us must work on spreading a more positive attitude to save us from a bleak future.
  • Book – Nathan Marz: Big Data – I dislike the big data hype (and, with passion, Hadoop) but would love to read this book; it presents a fresh look at big data processing, heavily inspired by functional programming. Nathan has plenty of experience from Twitter and from creating Storm and Cascalog (both in Clojure, btw.). Read ch 1: A new paradigm for big data.
  • Facebook Engineering: The Mature Optimization Handbook (or go directly to the pdf, ePub, Mobi). If you get bored, jump directly to ch 5. Instrumentation.

Clojure Corner

  • Schmetterling – Debug running clojure processes from the browser! – upon an exception, the process will pause and S. will show the stack, which you can navigate and see locals and run code in the context of any stack frame; you can also trigger it from your code
  • Gorilla REPL (screenshot, 11min video) – interactive web-based notebook where you can mix text (with Markdown formatting), mathematical formulas via LaTeX, graphs, tables, Clojure code. Great for exploring and, at the same time, describing data. <3
  • Local state is harmful – how can we answer questions such as when/why did state X change, or how did output Y get where it is? Make state explicit, f.ex. one global map holding all of it, and perhaps not just the current state but also its history – thus we can easily query it. Prismatic’s Graph can be used to make the state map, watches to keep the history. Inspired by databases (Datomic is an excellent example of SW where answering such questions is trivial)
  • S. Corfield: Insanely Useful Leiningen Plugins – lein-ancient (find updated deps), lein-exec (execute Clj from cmd.line / scripts in Clj), lein-try (try a lib in REPL), Eastwood – a lint tool for Clojure
  • Sente – Clojure(Script) + core.async + WebSockets/Ajax – a tiny 600 LoC library for websockets (with fall-back to long-polling) communication between a ClojureScript frontend and a Clojure backend, using EDN, support for request-reply and multiple user windows/tabs (comparison with Chord (no non-WS fallback or req/resp))
  • Nicholas Kariniemi: Why is Clojure bootstrapping so slow? – don’t blame the JVM, most time is spent in clojure.core according to this analysis on JVM and Android (create and set vars, load other namespaces); some proposals for improving it – lazy loading, excluding functionality not used, …
  • Cheat your way to running CLJS on Node – (ab)use D. Nolen’s mies template intended for client-side cljs development to create a Node project; the trick: compile everything into 1 file so that Node does not fail to find dependencies, disable source maps etc. Update: the nodecljs template now does this
  • lt-clojure-tutorial – A Clojure tutorial optimized for Light Table, ported from Nolen’s cljs tutorial
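
To make the “Local state is harmful” idea from the list above more tangible: a rough sketch of one explicit state map plus a queryable history of changes. It is written in Java purely for illustration and is not the post’s implementation (the post suggests a Clojure map with watches); all names here (ExplicitState, Change, set, changesTo) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExplicitState {
    // One recorded change: which key changed, from what to what, and why.
    static final class Change {
        final String key;
        final Object oldValue;
        final Object newValue;
        final String reason;

        Change(String key, Object oldValue, Object newValue, String reason) {
            this.key = key;
            this.oldValue = oldValue;
            this.newValue = newValue;
            this.reason = reason;
        }
    }

    // The single, explicit state map; replaced by a new read-only copy on every change.
    private Map<String, Object> current = Collections.emptyMap();
    // Every change ever made, oldest first (plays the role of a watch keeping history).
    private final List<Change> history = new ArrayList<>();

    synchronized void set(String key, Object value, String reason) {
        Map<String, Object> next = new HashMap<>(current);
        Object old = next.put(key, value);
        current = Collections.unmodifiableMap(next);
        history.add(new Change(key, old, value, reason));
    }

    synchronized Map<String, Object> snapshot() {
        return current;
    }

    // Answers "when/why did X change?" by filtering the recorded history.
    synchronized List<Change> changesTo(String key) {
        List<Change> result = new ArrayList<>();
        for (Change c : history) {
            if (c.key.equals(key)) {
                result.add(c);
            }
        }
        return result;
    }
}
```

Because all state lives in one place and every change is recorded together with its reason, a question like “why is the user bob now?” becomes a simple query over changesTo("user") instead of a debugging session.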

Tools/Libs

Posted in General, Languages, Tools, Top links of month | Comments Off