Something's wrong with the world today
Introduction
The test of the machine is the satisfaction it gives you. There isn't any other test. If the machine produces tranquillity it's right. If it disturbs you it's wrong until either the machine or your mind is changed. The test of the machine's always your own mind. There isn't any other test. — Robert M. Pirsig, Zen And The Art Of Motorcycle Maintenance
Note: I originally wrote this article in July 2015. Almost 4 years later I looked at it and thought it's worth publishing, so here you go.
I've been doing Ruby programming at my last three jobs, and I'm getting increasingly frustrated with the language, the libraries, the tools, the mindset and the direction in which the language is heading (read: enterprise). In this article I'd like to talk about simplicity.
The problem
Apparently "This Is The House That Jack Built" is a thing. Where most people see a story for children, we programmers see a problem that needs a program written to solve it. And of course reproducing the original is no fun, so we want to spice it up with the ability to generate our own stories in a similar fashion.
If we limit ourselves to the first lines (or "iterations"), the story we want to generate would look like this:
This is the house that Jack built.
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
This is the cat that killed the rat that ate the malt that lay in the house that Jack built.
Sketch of a solution
The first step would be to look at the example input and output, and figure out the steps needed to transform the input into the output. But since we only have the output we need to figure out the "algorithm" (or procedure) used to generate that output.
One thing is immediately apparent: each successive sentence is longer than the previous. And if we look closer we may notice that all sentences share the same tail, and new things are added at the beginning of each sentence. If we align common parts of the first two sentences we get the following:
This is                      the house that Jack built.
This is the malt that lay in the house that Jack built.
======= ++++++++++++++++++++ ==========================
same    new                  same
Comparing the second line to the third makes me think I see a pattern:
This is                                       the house that Jack built.
This is                  the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
======= ++++++++++++++++ ===============================================
same    new              same
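The shared-tail observation can also be checked mechanically. What follows is a minimal Ruby sketch, assuming the four sentences from above are held in an array; the names SENTENCES, BLOCKS and strip_frame are made up for this illustration.

```ruby
# The first four sentences of the story, copied from above.
SENTENCES = [
  "This is the house that Jack built.",
  "This is the malt that lay in the house that Jack built.",
  "This is the rat that ate the malt that lay in the house that Jack built.",
  "This is the cat that killed the rat that ate the malt that lay in the house that Jack built."
]

# Hypothetical helper: remove the constant "This is " prefix and the final period.
def strip_frame(sentence)
  sentence.delete_prefix("This is ").delete_suffix(".")
end

bodies = SENTENCES.map { |sentence| strip_frame(sentence) }

# Every body must end with the previous body; whatever is left at the
# front is the new building block for that sentence.
BLOCKS = bodies.each_cons(2).map do |previous, current|
  raise "tail mismatch" unless current.end_with?(previous)
  current.delete_suffix(previous).strip
end

puts bodies.first
puts BLOCKS
```

If the pattern did not hold for some pair of sentences, the sketch would stop with a "tail mismatch" error instead of printing the blocks.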
It's obvious (to me) that the "This is" prefix is the same on each line (read: does not depend on input data). Then, ignoring this prefix, we identify the following "building blocks", and put them into a file:
the house that Jack built.
the malt that lay in
the rat that ate
the cat that killed
Now that we have the input data, and an understanding of how that data is transformed into output, it should not be hard to write a command which takes this file and outputs the sentences we expect. Running it would look like this:
cat lines.txt | recite
The recite command would read successive lines from input and prepend each new line to the current line (initially empty), then print the constructed line with "This is" as a prefix.
Writing such a command seems easy enough. It might be more interesting if the story changed every time. Could our approach accommodate this functionality? In shell syntax this would look as follows:
cat lines.txt | shuf | recite
At this point the programmers among us might realize (without even writing a single line of code) that the final period in the first line of the input file should not be part of the input. So we update our input file so that it looks like this:
the house that Jack built
the malt that lay in
the rat that ate
the cat that killed
At this point we decide that our plan is good enough, and write our recite script:
ACC=""
while read LINE; do
    # Treat the first line specially: no need for
    # separator space or newline.
    if [ -z "$ACC" ]; then
        ACC=$LINE
    else
        # Separate the lines so we can clearly see them
        # (gets messy if the lines are wrapped).
        echo
        ACC="$LINE $ACC"
    fi
    echo "This is $ACC."
done
We check that it works:
cat lines.txt | recite
This is the house that Jack built.
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
This is the cat that killed the rat that ate the malt that lay in the house that Jack built.
And with the shuffling:
cat lines.txt | shuf | recite
This is the house that Jack built.
This is the cat that killed the house that Jack built.
This is the malt that lay in the cat that killed the house that Jack built.
This is the rat that ate the malt that lay in the cat that killed the house that Jack built.
At this point we can share our script with our friends. Or use it as a party game where everybody contributes to the story by putting the lines in one big file. Or, to make it even more interesting, the lines can be put in separate files so that the participants don't see what others have added, and then, instead of cat lines.txt we could just do cat *.txt. Or sort the files chronologically. If we want shuffling, but want to keep the lines in each file together (applies only to files with multiple lines), instead of shuffling the lines we can shuffle the file list:
ls *.txt | shuf | xargs cat | recite
I could go on like this for a while, so now would be a good time to tell you why I'm actually writing this.
When things go wrong
You're still here? OK then, let's get back to our problem. And since we're real programmers here, let's solve it like real professionals would. Oh but wait, somebody already did! There was a talk at RailsConf 2015 titled "Nothing is Something" by Sandi Metz. I liked the presentation itself a lot: it was very fluid and entertaining. I'd like to invite you to stop reading this article and watch the talk now.
At around 17:00 the speaker arrives at the following solution:
class House
  def recite
    (1..data.length).map {|i| line(i)}.join("\n")
  end

  def line(number)
    "This is #{phrase(number)}.\n"
  end

  def phrase(number)
    data.last(number).join(" ")
  end

  def data
    ["the horse and the hound and the horn that belonged to",
     "the farmer sowing his corn that kept",
     "the rooster that crowed in the morn that woke",
     "the priest all shaven and shorn that married",
     "the man all tattered and torn that kissed",
     "the maiden all forlorn that milked",
     "the cow with the crumpled horn that tossed",
     "the dog that worried",
     "the cat that killed",
     "the rat that ate",
     "the malt that lay in",
     "the house that Jack built"]
  end
end
Sure the code is short and cute, but there are still things I do not quite like:
- There is no reason for the phrase method to have an implicit dependency on the data method. data should be an explicit parameter: implicit parameters can only be "injected", while explicit ones can just be passed in. If you watched the talk you should have noticed the speaker implying that supplying things as parameters is a good thing.
- The line method takes the number parameter. In what way is that related to what it does? Implicit dependencies all over the place. Ugh.
- The name of the class: House. Later in the talk the speaker talks about how RandomHouse is not actually a different kind of House; it supposedly is the original House with a random dependency injected into it. Know what? Let's question another assumption! Namely the House being a "house". It's not! The name somehow sneaked in from the last line (or the first, depending on how one looks at the problem). But as we see later, that choice is random, because any other line might be the first one. So why not call the class Horse? Or Potato? I'm pretty sure somebody at some point will add a line containing "potato" to the input.
- Why did we end up with a class, anyway? I'm pretty sure the task was not to write a class.
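To make the first two complaints concrete, here is a rough sketch of what explicit parameters could look like. The standalone methods below are only one possible shape, assumed for illustration; they are not code from the talk.

```ruby
# phrase receives the data it works on instead of reaching for a data method.
def phrase(data, number)
  data.last(number).join(" ")
end

# line only decorates a string; it neither knows nor cares where the string came from.
def line(text)
  "This is #{text}."
end

data = ["the rat that ate", "the malt that lay in", "the house that Jack built"]
puts line(phrase(data, 2))
```

Both methods can now be called with any array or any string, which is exactly what the implicit versions make hard.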
After "solving" the original problem additional requirements start rolling in.
The first one (shuffling the lines) is solved by the RandomHouse class. The speaker has added a few extra restrictions on the requirement: write the new class (writing a class was never really a requirement, but let's play along) without breaking the original class and without using conditionals (19:50).
I'd also like to note here that in the previous section we implemented this very requirement while adhering to three, not two, restrictions:
- We did not change our script.
- We did not use conditionals. The conditional in our recite script does not count because the restriction here is about adding the new functionality without using conditionals.
- We did this by re-using code, which is the holy grail of programming (and the promised-but-never-delivered feature of OOP).
Let's continue with the talk. The next requested feature is the ability to duplicate each input line, with the same restrictions as before, resulting in an EchoHouse class (20:40). Along the way (multiple) inheritance is dismissed as a non-solution to the problem (23:30, 25:20), which is a topic for another article.
This is the point in the talk where I actually get very annoyed with the whole mental masturbation thing where jumping through hoops is considered a major accomplishment and the hoops are set up in increasing numbers as the talk progresses. Supposedly the right thing to do is to write collaborator classes, and inject instances of those into the House class. My guess is that passing parameters is too old-school and boring, so the process must be called "injection".
Here are the collaborator classes which the House class now accepts in the constructor (I'll have a few things to say about naming these later):
class DefaultOrder
  def order(data)
    data
  end
end

class RandomOrder
  def order(data)
    data.shuffle
  end
end

class DefaultFormatter
  def format(parts)
    parts
  end
end

class EchoFormatter
  def format(parts)
    parts.zip(parts).flatten
  end
end
The presentation did not have the full source of the class (or I have missed it), so I've worked the part left as "an exercise for the reader" into the original class, and this is how it looks now (using only 3 input lines to conserve electrons):
class House
  DATA = ["the rat that ate",
          "the malt that lay in",
          "the house that Jack built"]

  attr_reader :formatter, :data

  def initialize(orderer: DefaultOrder.new, formatter: DefaultFormatter.new)
    @formatter = formatter
    @data = orderer.order(DATA)
  end

  def recite
    (1..data.length).map {|i| line(i)}.join("\n")
  end

  def line(number)
    "This is #{phrase(number)}.\n"
  end

  def phrase(number)
    parts(number).join(" ")
  end

  def parts(number)
    formatter.format(data.last(number))
  end
end
With the changes in place, this is how the classes can be combined to solve the different variations of the problem. First, using the defaults:
puts House.new.recite
This is the house that Jack built.
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
With shuffling:
puts House.new(orderer: RandomOrder.new).recite
This is the malt that lay in.
This is the house that Jack built the malt that lay in.
This is the rat that ate the house that Jack built the malt that lay in.
With line duplication:
puts House.new(formatter: EchoFormatter.new).recite
This is the house that Jack built the house that Jack built.
This is the malt that lay in the malt that lay in the house that Jack built the house that Jack built.
This is the rat that ate the rat that ate the malt that lay in the malt that lay in the house that Jack built the house that Jack built.
With shuffling and duplication:
puts House.new(orderer: RandomOrder.new, formatter: EchoFormatter.new).recite
This is the malt that lay in the malt that lay in.
This is the rat that ate the rat that ate the malt that lay in the malt that lay in.
This is the house that Jack built the house that Jack built the rat that ate the rat that ate the malt that lay in the malt that lay in.
Searching for the light
Now, we can see the code works (for some values of "works", see below). But at what cost? My personal opinion is: if your solution looks like this, you did something wrong. Something very, very wrong[1]. How wrong exactly? I'm glad you asked!
Collaborators
If you look at the four collaborator classes defined above for more than two seconds, you might notice that they all look very similar:
class BlahBlah
  def execute(parameter)
    # Do the thing.
  end
end
Reminds you of anything? What in your programming career have you seen that's executable, accepts parameters and does not maintain state? That's right: functions! Don't worry if you had trouble recognizing this — functions in the Kingdom of Nouns are F-words, and people are brainwashed to see things where there aren't any.
How about we skip the ceremony, and drop the whole class proliferation business?
module Functions
  def self.identity(data)
    data
  end

  # If this looks fishy it's probably because of the
  # smell.
  def self.shuffle(data)
    data.shuffle
  end

  # This is how the EchoFormatter#format should have
  # been implemented.
  def self.echo(data)
    data.map do |line|
      line + " " + line
    end
  end
end
Now we only have three functions (instead of four) because DefaultOrder and DefaultFormatter are literally doing the same thing. Why have two different classes do the exact same thing, anyway[2]? Let's start using our "functions":
class House
  DATA = ["the rat that ate",
          "the malt that lay in",
          "the house that Jack built"]

  attr_reader :formatter, :data

  def initialize(orderer: Functions.method(:identity),
                 formatter: Functions.method(:identity))
    @formatter = formatter
    @data = orderer.call(DATA)
  end

  def recite
    (1..data.length).map {|i| line(i)}.join("\n")
  end

  def line(number)
    "This is #{phrase(number)}.\n"
  end

  def phrase(number)
    parts(number).join(" ")
  end

  def parts(number)
    formatter.call(data.last(number))
  end
end
Notice how unwieldy it is to refer to the functions we wrote: we must look up a function as a method on a module using ModuleName.method(...) calls. Also notice the change we had to make in the initialize and parts methods to use the functions instead of classes: all it took was to use the same call method instead of order and format. Even though it does not matter if the function is named call or execute[3], consistency is good, and I consider this an improvement!
Does it still work, though? Sure does:
puts House.new.recite
This is the house that Jack built.
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
puts House.new(orderer: Functions.method(:shuffle)).recite
This is the malt that lay in.
This is the house that Jack built the malt that lay in.
This is the rat that ate the house that Jack built the malt that lay in.
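As for the unwieldy Functions.method(...) lookups: one possible way around them, sketched here as an assumption rather than something taken from the talk, is to keep lambdas in constants. A lambda responds to call just like a Method object does, so the House class accepts it without further changes. IDENTITY, SHUFFLE and ECHO are names invented for this sketch, and the House below is a copy of the one above with only the defaults changed.

```ruby
# Lambdas respond to call, just like Functions.method(:identity) does.
IDENTITY = ->(data) { data }
SHUFFLE  = ->(data) { data.shuffle }
ECHO     = ->(data) { data.map { |line| "#{line} #{line}" } }

class House
  DATA = ["the rat that ate",
          "the malt that lay in",
          "the house that Jack built"]

  attr_reader :formatter, :data

  def initialize(orderer: IDENTITY, formatter: IDENTITY)
    @formatter = formatter
    @data = orderer.call(DATA)
  end

  def recite
    (1..data.length).map { |i| line(i) }.join("\n")
  end

  def line(number)
    "This is #{phrase(number)}.\n"
  end

  def phrase(number)
    parts(number).join(" ")
  end

  def parts(number)
    formatter.call(data.last(number))
  end
end

puts House.new.recite
puts House.new(orderer: SHUFFLE, formatter: ECHO).recite
```

Whether constants holding lambdas are nicer than module methods is a matter of taste; the point is only that anything responding to call will do.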
Unnecessary dependencies
Now if you recall from my initial list of issues with the House class, there's the item about methods having implied dependencies. In programming circles this is called coupling, and unnecessary coupling is bad for code maintainability.
Let's start with the line method. All it should be doing is prepending "This is" and appending a period to a string. But instead of passing a string, the calling code passes in a number, which is passed to another method to obtain the string. Bad style, much coupling, very smell. One of the things that movies have taught us is that knowing too much is harmful for one's well-being (or indeed, survival). So let's get this poor method out of trouble:
module Functions
  def self.line(input)
    # Notice that we don't add a newline character
    # because newlines are part of output, not data.
    # And we don't know what the caller of this function
    # will do with the result of this function.
    "This is #{input}."
  end
end
There, line is independent and strong now, and does not know anything it should not, so it is also safe.
Next up is the recite method. There is absolutely no need for this method to know how many lines there are in the story. Which makes me realize that the method is ill-conceived from the beginning. The "function" should generate the output lines by adding new lines in front of the previous lines:
module Functions
  def self.recite(lines)
    seen = []
    lines.map do |line|
      seen.prepend(line)
      seen.join(" ")
    end
  end
end
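That this recite really does not need to know the number of lines can be illustrated by feeding it something that does not report a size. The sketch below is an assumption of mine, not part of the original code: it uses String#each_line with chomp: true (available in reasonably recent Ruby versions) as a stand-in for reading lines from standard input.

```ruby
module Functions
  # Same definition as above, repeated so the sketch is self-contained.
  def self.recite(lines)
    seen = []
    lines.map do |line|
      seen.prepend(line)
      seen.join(" ")
    end
  end
end

# A stand-in for the contents of lines.txt.
text = "the house that Jack built\nthe malt that lay in\nthe rat that ate\n"

# each_line without a block returns an Enumerator: it can be mapped over,
# but it does not know up front how many lines it will produce.
stream = text.each_line(chomp: true)
p stream.size

puts Functions.recite(stream)
```

The result is a plain array of strings, one per input line, which is what lets the other functions be chained before it.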
And since this "function" now does one thing only (as all functions and methods should), we need another function to actually do the output:
module Functions
  def self.this_is(lines)
    lines.each do |str|
      # We add an extra newline for the purpose of this
      # article so that it's easy to see where the lines
      # start and end when they are wrapped.
      puts line(str), "\n"
    end
  end
end
Using these functions is a bit cumbersome, and looks like this:
module Functions
  DATA = ["the house that Jack built",
          "the malt that lay in",
          "the rat that ate"]

  this_is(recite(DATA))
end
This is the house that Jack built.
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the house that Jack built.
Just look at that, it works! So what about the shuffling? Easy to add:
module Functions
  this_is(recite(shuffle(DATA)))
end
This is the rat that ate.
This is the malt that lay in the rat that ate.
This is the house that Jack built the malt that lay in the rat that ate.
Makes no sense, but that's what the customer asked for. How about the echo thing?
module Functions
  this_is(recite(echo(DATA)))
end
This is the house that Jack built the house that Jack built.
This is the malt that lay in the malt that lay in the house that Jack built the house that Jack built.
This is the rat that ate the rat that ate the malt that lay in the malt that lay in the house that Jack built the house that Jack built.
At this point we can chain shuffle, echo and recite as we want, and it will work as advertised. Even better, because we're not limited to just one instance of each.
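One way to picture this chaining, assuming Ruby 2.6 or newer where Proc#>> is available, is to build pipelines out of lambdas. The lambdas below mirror the functions above; the constant names and the two pipelines are purely illustrative.

```ruby
SHUFFLE = ->(data) { data.shuffle }
ECHO    = ->(data) { data.map { |line| "#{line} #{line}" } }
RECITE  = lambda do |lines|
  seen = []
  lines.map do |line|
    seen.prepend(line)
    seen.join(" ")
  end
end

LINES = ["the house that Jack built",
         "the malt that lay in",
         "the rat that ate"]

# f >> g builds a new callable that runs f first and then g on its result.
# Nothing stops us from using the same step more than once.
double_echo = ECHO >> ECHO >> RECITE
puts double_echo.call(LINES)

shuffled_echo = SHUFFLE >> ECHO >> RECITE
puts shuffled_echo.call(LINES)
```

The same pipelines could be written as nested calls; the operator only makes the order of the steps read left to right, like the shell version.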
If you followed the code carefully you might have noticed that my "functional" implementation of echo is very different from the one in the presentation. We could use the following definition (perfectly valid):
module Functions
  def self.echo1(data)
    data.zip(data).flatten
  end
end
Can you guess what will happen if we use this function?
module Functions
  this_is(recite(shuffle(echo1(DATA))))
end
This is the malt that lay in.
This is the rat that ate the malt that lay in.
This is the house that Jack built the rat that ate the malt that lay in.
This is the malt that lay in the house that Jack built the rat that ate the malt that lay in.
This is the house that Jack built the malt that lay in the house that Jack built the rat that ate the malt that lay in.
This is the rat that ate the house that Jack built the malt that lay in the house that Jack built the rat that ate the malt that lay in.
May I invite you to figure out from the code why the EchoFormatter works in the version of the code from the talk, but not in our functional version? If you don't want to figure it out yourself I can tell you: because the supposedly proper object-oriented code in the talk is so silly, coupled and non-composable that it somehow manages to work… by accident! More details below.
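A hint for the impatient: the two echo variants differ in how many elements they return, and that only starts to matter once something else runs after them. A small comparison sketch, using standalone copies of the two functions:

```ruby
# Doubles the text inside each element; the number of elements stays the same.
def echo(data)
  data.map { |line| line + " " + line }
end

# Adds a copy of each element next to the original; the number of elements doubles.
def echo1(data)
  data.zip(data).flatten
end

lines = ["the house that Jack built", "the malt that lay in"]

p echo(lines).length
p echo1(lines).length

# Joined right away, as in the talk, the two are indistinguishable...
p echo(lines).join(" ") == echo1(lines).join(" ")

# ...but any step that runs afterwards (shuffling, reciting) treats every
# element separately, so the copies made by echo1 can drift apart.
```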
Idiomatic code
The "OO-inclined" readers reading this might be talking to their screens and saying I'm full of shit and how is this crap code I've come up with even supposed to be better than the original nice and clean and OO-done-right-according-to-best-practices-using-design-patterns-and-whatnot? Well, let's see what we can do about that.
We might try and invent a DSL by writing a new class and overriding the pipe (|) operator, but that's a whole new can of worms we're not going to touch here.
There is room for improvement, though. What I want to achieve is something that looks like what I had at the beginning of the article. Since we're still using Ruby, the following code looks close enough:
DATA.shuffle.echo.recite
In order to achieve that, we will have to modify (read: monkey-patch) the Array class. Even though everybody knows that monkey-patching is bad, it is beside the point in this article. Just look at me moving all the functions defined above into the Array class[4]:
class Array
  # shuffle is already there.

  def echo
    self.map do |line|
      line + " " + line
    end
  end

  def recite
    seen = []
    self.map do |line|
      seen.prepend(line)
      seen.join(" ")
    end
  end

  def this_is
    self.each do |str|
      puts Functions::line(str), "\n"
    end
  end

  # While we're here let's add a convenience method
  # since we're always ending our calls with
  # .recite.this_is (but we don't have to).
  def verse
    self.recite.this_is
  end
end

# Put this here so we have something to work with.
DATA = ["the house that Jack built",
        "the malt that lay in",
        "the rat that ate"]
Did not take much effort, now did it? Let's try it out:
DATA.shuffle.verse
This is the rat that ate.
This is the house that Jack built the rat that ate.
This is the malt that lay in the house that Jack built the rat that ate.
Works as requested! Now, we can go wild and implement a couple more features before the customer asks for them:
- Ability to duplicate input lines.
- Ability to squash neighbouring lines together.
In the House class it should be a matter of writing a collaborator class or two, right? Go on, try it! Continue reading when you're done. I'll be extending my version in the meantime.
class Array
  def duplicate
    # Reminds you of something? Right, this is how the
    # EchoHouse was implemented in the talk.
    self.zip(self).flatten
  end

  def squash
    self.each_slice(2).map do |*lines|
      lines.join(" ")
    end
  end
end
There, I'm done! So, how big of a change was it for you? Had to rewrite the whole House class? Oh, that's too bad. I wish there were some re-usable pieces of code for you to use!
I'd also like to point out that there is not a single conditional in my code. I did not even have to jump through hoops to achieve that. Not that I'm scared of conditionals, but the code is so straightforward that I have not even needed them.
Looks too good to be true? I can prove it works:
DATA.duplicate.shuffle.squash.verse
This is the malt that lay in the house that Jack built.
This is the rat that ate the malt that lay in the malt that lay in the house that Jack built.
This is the house that Jack built the rat that ate the rat that ate the malt that lay in the malt that lay in the house that Jack built.
How about we keep the feature requests coming?
- an ability to reverse the story,
- or elide first N lines,
- or last N lines,
- or random N lines.
Would we need an "orderer"? A "formatter"? How many? How would they combine? How many collaborator classes would be needed?
In my case all of the newly introduced functions are one-liners:
class Array
  # OK, I lied: drop and reverse are zero-liners. Lucky
  # coincidence? I think not (remember shuffle). If
  # you think this is cheating I'd like to point out
  # that this is what code reuse actually looks like.

  # Theoretically we don't need this, because the same
  # can be achieved by:
  #
  #   .reverse.drop(n).reverse
  #
  def drop_last(n)
    self.first(self.length - n)
  end

  # This is also not needed, because it's semantically
  # equivalent to:
  #
  #   .shuffle.drop(n)
  #
  def drop_random(n)
    self.sample(self.length - n)
  end
end
So in the end our methods turned out to be just convenience shortcuts. So much for charging the customer for our time…
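The equivalences claimed in the comments can be spot-checked. The sketch below repeats the two methods so it can run on its own; the sample array is made up. Note the hedge: the check only covers values of n between zero and the array length, because for a larger n the first call is handed a negative size and raises an ArgumentError, while drop simply returns an empty array.

```ruby
class Array
  # Copies of the two convenience methods from above.
  def drop_last(n)
    first(length - n)
  end

  def drop_random(n)
    sample(length - n)
  end
end

lines = %w[a b c d e]

# drop_last agrees with the reverse/drop/reverse spelling.
p lines.drop_last(2) == lines.reverse.drop(2).reverse

# drop_random cannot be compared element by element (it is random), but the
# survivors must be the right number of elements, all taken from the input.
survivors = lines.drop_random(2)
p survivors.length
p((survivors - lines).empty?)
```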
Naming things
This might be a good time to revisit the "finding concepts and naming them" (27:00) part.
Back when the collaborator classes were introduced I could see how shuffling things around is a kind of ordering, but what the hell is EchoFormatter? How is it formatting anything? The only thing resembling formatting in the original House class is the line method… EchoFormatter as a name does not make much sense, if any at all. And the implementation is only one line, and it has nothing to do with the name. That should have been a sign for the author of the code that something is amiss.
But now that we have shuffle and echo instead of RandomOrder and EchoFormatter, we see they are not different concepts: they are both transformations! We could try and categorize the transformations we've written so far:
- Changing the sequence of items (shuffle),
- Modifying items (echo),
- Generating items (duplicate),
- Combining items (squash),
- Eliding items (drop, drop_last, drop_random).
But in the end they are all just functions, working on a sequence of strings. There are no "orderers". No "formatters". Just functions, which, when done properly (i.e., not coupled), can be combined. And still no conditionals.
Testing
The presentation did not touch the topic of testing, which would probably be another talk like this. Except with more injection and mocking and stubbing and whatnot. Even thinking about testing this supposedly-proper-OO code makes me cringe. This article is already long enough, so I will not be talking about testing, but I'd like to invite you to think about how you would approach testing both versions of the code.
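As a starting point for that thought experiment, here is a hedged sketch of how the functional version might be checked with nothing but plain assertions. The check helper and the properties I picked are assumptions made for this illustration, not something from the talk.

```ruby
module Functions
  # Copies of the functions from above, so the sketch runs on its own.
  def self.shuffle(data)
    data.shuffle
  end

  def self.echo(data)
    data.map { |line| line + " " + line }
  end

  def self.recite(lines)
    seen = []
    lines.map do |line|
      seen.prepend(line)
      seen.join(" ")
    end
  end
end

# Hypothetical helper: fail loudly when an expectation does not hold.
def check(description, condition)
  raise "FAILED: #{description}" unless condition
  puts "ok: #{description}"
end

# Freezing the input makes any accidental in-place modification raise an error.
lines = ["the house that Jack built", "the malt that lay in", "the rat that ate"].freeze

check "recite returns one entry per line", Functions.recite(lines).length == lines.length
check "recite puts the newest line first", Functions.recite(lines).last.start_with?("the rat that ate")
check "echo doubles each line in place", Functions.echo(["a"]) == ["a a"]
check "shuffle keeps exactly the same lines", Functions.shuffle(lines).sort == lines.sort
```

No mocks, stubs or injected collaborators are involved: every function takes an array and returns an array, so that is all a test needs to provide and inspect.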
Summary
If I were to teach this stuff to a beginner programmer and I used the interface I came up with originally (using the shell syntax), I imagine they would have no trouble taking this example:
# Using "duplicate" because "echo" is already taken.
cat lines.txt | shuffle | duplicate | recite
which would correspond to the following code in the context of the talk:
puts House.new(orderer: RandomOrder.new, formatter: EchoFormatter.new).recite
and turn it into something that first duplicates the lines, then shuffles them:
cat lines.txt | duplicate | shuffle | recite
In the context of the talk, would the following work?
puts House.new(orderer: EchoFormatter.new, formatter: RandomOrder.new).recite
Of course it would not, you silly! Formatter is not an orderer, and orderer is not a formatter (even though they do not differ in that they take an array of strings as a parameter, and return a new array of strings). How would anybody in their right mind call whatever this is composition?
This is the problem I referred to earlier in the article (end of the unnecessary dependencies section): the EchoFormatter has an implied dependency on (i.e., is coupled to) the order of execution: it only works after the items have been ordered. This is not how composition works at all! Composition is when smaller things (i.e., building blocks) can be composed together to make bigger things. If the smaller things can only be put together in one way, they're not really smaller things: they're one big thing cut into pieces. Functions compose well. Objects usually don't[5].
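Whether swapping the collaborators really fails can be checked with a trimmed-down copy of the classes from the talk. The trimming is mine (only two input lines, and no default collaborators), so treat this as a sketch of the experiment rather than the exact code from the presentation.

```ruby
class RandomOrder
  def order(data)
    data.shuffle
  end
end

class EchoFormatter
  def format(parts)
    parts.zip(parts).flatten
  end
end

class House
  DATA = ["the malt that lay in", "the house that Jack built"]

  attr_reader :formatter, :data

  def initialize(orderer:, formatter:)
    @formatter = formatter
    @data = orderer.order(DATA)
  end

  def recite
    (1..data.length).map { |i| line(i) }.join("\n")
  end

  def line(number)
    "This is #{phrase(number)}.\n"
  end

  def phrase(number)
    formatter.format(data.last(number)).join(" ")
  end
end

# The intended arrangement works.
puts House.new(orderer: RandomOrder.new, formatter: EchoFormatter.new).recite

# The swapped arrangement does not: EchoFormatter has no order method.
begin
  House.new(orderer: EchoFormatter.new, formatter: RandomOrder.new).recite
rescue NoMethodError => e
  puts "swapped collaborators fail: #{e.class}"
end
```

Both classes take an array of strings and return an array of strings, yet the different method names alone are enough to make them impossible to swap.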
The EchoFormatter could be fixed by implementing it like we did in the echo function. But the problem of composability would still be there: the House class still dictates that there can be only one "orderer" and one "formatter", and that the "formatter" goes after the "orderer". And if we want something else? We're screwed[6]!
I've said it before, and I'll say it again: bad abstractions are worse than no abstractions. Just drop your OO hammer and see the world for what it is! All the fun is in doing stuff, not having things.
Footnotes:
[1] A not-so-subtle reference to "Turn your hamster into a fighting machine" by Jared Purrington.
[2] One might argue that the different classes, even if their effect is the same, are still semantically different. But that's beside the point here, because if we want semantically different behaviour we just use a different function (with the semantics we need).
[3] What, you did not read Kingdom of Nouns? This is your reminder to do so now.
[4] In a non-throwaway code scenario I'd use Ruby's refinements, but it so happens that programming is hard, and tools change under one's feet, and irb cannot deal with using M statements, and org-mode can barely work with irb, and there is no support for pry. Basically everything is broken and I just want to be done with this article.
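For the curious, this is roughly what the refinement-based variant might look like when run from a file rather than from irb. It is a minimal sketch under my own assumptions: the module names StoryRefinements and Demo are invented, and only echo and recite are carried over.

```ruby
module StoryRefinements
  # Unlike reopening Array, a refinement changes nothing until it is activated.
  refine Array do
    def echo
      map { |line| "#{line} #{line}" }
    end

    def recite
      seen = []
      map do |line|
        seen.prepend(line)
        seen.join(" ")
      end
    end
  end
end

module Demo
  # From this call to the end of the module body, arrays have echo and recite.
  using StoryRefinements

  LINES = ["the house that Jack built", "the malt that lay in"]
  RESULT = LINES.echo.recite
end

puts Demo::RESULT

# Outside of Demo the Array class is untouched.
begin
  [].echo
rescue NoMethodError
  puts "no echo here"
end
```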
[5] Especially when multiple inheritance, the approach that encourages creating re-usable classes (i.e., mixins), is dismissed outright.
[6] As mentioned in the talk, inheritance is not a solution. Not because inheritance is somehow bad or wrong, but because the methods of the House class are too coupled.