hpincket - Thoughts on Spec-ulation (Rich Hickey)

In his 2016 talk Spec-ulation, Rich Hickey argues that library authors break backwards compatibility too often. Instead, they should aim to maintain backwards compatibility, potentially indefinitely.

Much of the talk is spent building a framework for understanding backwards compatibility in software (and specifically in Clojure). This is a little difficult to follow, since it uses Clojure-specific terminology. In short, it breaks down into growth and breakage. Growth is adding a new function, class, package, etc. Breakage is either removing one of the above or modifying it in such a way that its meaning has changed.

Well, not quite: there are other ways you can modify functions. You can break a function by adding a new required argument. You can break it by changing the return values for a set of inputs. Depending on how strongly you subscribe to Hyrum's Law, you can even break it if you stop throwing a NotImplementedException.
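To make the distinction concrete, here is a minimal sketch in Python (my choice of language, not the talk's). The `area_*` functions and the `accepts_call` helper are hypothetical names invented for illustration. It checks only whether an old call still fits the new signature, which is just one of the ways a function can break.

```python
import inspect


def area_v1(width, height):
    """The original function that existing callers depend on."""
    return width * height


def area_growth(width, height, scale=1):
    """Growth: a new optional parameter. Old calls work and mean the same thing."""
    return width * height * scale


def area_breakage(width, height, unit):
    """Breakage: a new required parameter. Old two-argument calls now fail."""
    return f"{width * height} {unit}"


def accepts_call(func, *args, **kwargs):
    """Return True if func's signature accepts the given arguments."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False


old_call = (3, 4)
print(accepts_call(area_growth, *old_call))    # True
print(accepts_call(area_breakage, *old_call))  # False
```

A signature check like this says nothing about changed return values or Hyrum's Law style dependencies on incidental behavior. Those need tests or human judgment.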

But this is likely common, if not well respected, knowledge. What else does the talk offer?

The talk has two other components:

A critique of Semantic Versioning (SemVer)
An analysis of why our library-level dependency tree isn't so useful

Semantic Versioning

The critique is scattered and not very thorough. At its core, it seems Rich dislikes how unpredictable breaking changes can be. While breaking changes must come with a MAJOR number change, they aren't the only reason that number can change. This leads to some uncertainty, and to a reluctance to bump the major version.

He also critiques the fact that, though there are three values present in the version, consumers only care about two states: is this a breaking change or not? The rest is just to entertain the library author. Rich offers some alternatives for version numbers, though I can't say they're that much more compelling. For instance, versioning by date conveys more information than arbitrary numbers.
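The "two states" point can be sketched in a few lines of Python. The `parse` and `may_break` helpers are hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH strings, ignoring pre-release tags and the special rules SemVer has for 0.x versions.

```python
def parse(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def may_break(old, new):
    """From the consumer's side, three numbers collapse into one yes/no answer.

    Under SemVer only a MAJOR bump is allowed to carry a breaking change,
    so MINOR and PATCH do not affect the result.
    """
    return parse(new)[0] > parse(old)[0]


print(may_break("1.4.2", "1.5.0"))  # False
print(may_break("1.4.2", "2.0.0"))  # True
```

Note that a `True` here only means the upgrade might break you. It says nothing about whether the part you use actually changed, which is where the uncertainty comes from.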

One critique leaves me puzzled. At the 39 minute mark he discusses the main response to his proposal that breaking changes necessitate a new library version. He says that the MAJOR number in SemVer does not let him designate changes as removals only. The thing is, I don't understand the need for this. Why separate breaking changes into different types?

If you understand this, please explain by email or on Mastodon. I'm all ears.

Real Dependencies

This is found at the beginning of the talk. It discusses ways we might import more code and libraries than we really need. Rich talks about how this ties into future ideal versioning schemes. They should be source-code aware so that breaking changes are only indicated when our dependent source has actually broken.

This sort of static analysis seems more realistic in the world of compiled languages, and in fact you can get some semblance of it via static compilation.
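As a rough illustration of what "source-code aware" could mean, here is a small Python sketch using the standard `ast` module. The `shapes` library, the consumer snippet, and both helper functions are hypothetical. It only catches direct `module.name` accesses, so treat it as a toy and not a real compatibility checker.

```python
import ast


def used_names(source, module):
    """Collect attribute names accessed directly on `module` in the source."""
    tree = ast.parse(source)
    return {
        node.attr
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id == module
    }


def breaks_consumer(source, module, removed):
    """A release breaks this consumer only if it removes a name the consumer uses."""
    return bool(used_names(source, module) & set(removed))


# Hypothetical consumer code; it is parsed, never imported or executed.
consumer = """
import shapes
print(shapes.area(3, 4))
"""

# Suppose the next release of the hypothetical library removes `perimeter`.
print(breaks_consumer(consumer, "shapes", {"perimeter"}))  # False
print(breaks_consumer(consumer, "shapes", {"area"}))       # True
```

The first release is "breaking" for the library as a whole but not for this consumer, which is the distinction a library-level version number cannot express.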

SO Post

Others' Writing

I could write more, but I think this has all been covered. I refer you to this similarly named blog post. I think it makes a lot of great points:

A large portion of breaking changes are a result of negligence, and could be improved with better tooling.
Removing a function could be beneficial if it is dangerous to use. There are real-world consequences to leaving some functions around. This is a great point, though I'm still not sure that justifies removal.
The discussion around mixing multiple versions of the same library applies only to languages that don't believe in Data Hiding.
Tooling often doesn't let you have multiple versions of the same library in the project, so there's no option to slowly migrate to a newer (breaking) version of a library. This is possible if you place the breaking changes in a separate library.
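The last point, placing breaking changes in a separately named library, can be sketched like this. The `shapes` and `shapes2` names are hypothetical, and I simulate the two libraries with `SimpleNamespace` objects so the example stays self-contained. In a real project they would be two separately installed packages.

```python
from types import SimpleNamespace

# Hypothetical original library: area takes width and height separately.
shapes = SimpleNamespace(area=lambda width, height: width * height)

# Hypothetical successor, published under a new name instead of a new
# major version: area now takes a single (width, height) tuple.
shapes2 = SimpleNamespace(area=lambda size: size[0] * size[1])


def legacy_report():
    """Code that has not been migrated yet keeps calling the old library."""
    return shapes.area(3, 4)


def migrated_report():
    """Code that has been migrated calls the new library."""
    return shapes2.area((3, 4))


# Both names coexist in one project, so migration can happen call site by call site.
print(legacy_report())    # 12
print(migrated_report())  # 12
```

Because the names differ, the tooling never has to resolve two versions of "the same" library.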