The semantics of map
We will start by looking at one of the most widely used operations in these abstractions: map.
We've been using map for a long time to transform sequences. Thus, instead of inventing a new function name for each new abstraction, library designers simply define the map operation for their own container type.
Imagine the mess that we would end up in if we had functions such as transform-observable, transform-channel, combine-futures, and so on.
Thankfully, this is not the case. The semantics of map are well understood, to the point that even developers who haven't used a specific library before will almost always assume that map applies a function to the value(s) contained within whatever abstraction the library provides.
Let's look at three examples that we have encountered in this book. We will create a new Leiningen project, in which to experiment with the contents of this appendix:
$ lein new library-design
Next, let's add a few dependencies to our project.clj file, as follows:
...
:dependencies [[org.clojure/clojure "1.9.0"]
               [com.leonardoborges/imminent "0.1.1"]
               [com.netflix.rxjava/rxjava-clojure "0.20.7"]
               [org.clojure/core.async "0.4.474"]
               [uncomplicate/fluokitten "0.3.0"]]
...
Don't worry about the last dependency; we'll get to that later on.
Now, start a REPL session so that you can follow along. Run the following command:
$ lein repl
Then, enter the following into your REPL:
(require '[imminent.core :as i]
         '[rx.lang.clojure.core :as rx]
         '[clojure.core.async :as async])

(def repl-out *out*)

(defn prn-to-repl [& args]
  (binding [*out* repl-out]
    (apply prn args)))

(-> (i/const-future 31)
    (i/map #(* % 2))
    (i/on-success #(prn-to-repl (str "Value: " %))))

(as-> (rx/return 31) obs
  (rx/map #(* % 2) obs)
  (rx/subscribe obs #(prn-to-repl (str "Value: " %))))

(def c (async/chan))
(def mapped-c (async/map< #(* % 2) c))
(async/go (async/>! c 31))
(async/go (prn-to-repl (str "Value: " (async/<! mapped-c))))

;; "Value: 62"
;; "Value: 62"
;; "Value: 62"
The three examples (using imminent, RxClojure, and core.async, respectively) look remarkably similar. They all follow a simple recipe:
- Put the number 31 inside of their respective abstractions
- Double that number by mapping a function over the abstraction
- Print its result to the REPL
As expected, this prints the value 62 to the screen three times.
It would seem that map performs the same abstract steps in all three cases: it applies the provided function, puts the resulting value in a fresh new container, and returns it. We could continue to generalize, but we would just be rediscovering an abstraction that already exists: functors.
Functors
Functors are the first abstraction that we will look at, and they are rather simple: they define a single operation, called fmap. In Clojure, functors can be represented by using protocols, and they are used for containers that can be mapped over. Such containers include (but are not limited to) lists, futures, observables, and channels.
Note
The Algebra in the title of this appendix refers to abstract algebra, a branch of mathematics that studies algebraic structures. An algebraic structure is, to put it simply, a set with one or more operations defined on it. As an example, consider the semigroup, one such algebraic structure. A semigroup is defined as a set of elements together with an associative operation that combines any two elements of the set into another element of the same set. Therefore, the set of positive integers, along with the addition operation, forms a semigroup. Another tool that is used to study algebraic structures is category theory, of which functors are a part[1]. We won't delve too deeply into the theory behind all of this, as there are plenty of books[1][2] available on the subject. It was, however, a necessary detour to explain the title used in this appendix.
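To make the semigroup example a little more concrete, here is a small sketch in plain Clojure. A semigroup's operation has to be associative, and the code below spot-checks that property for a handful of sample values (a sanity check, not a proof). The names associative?, samples, addition-ok?, and subtraction-ok? are made up for this illustration.

```clojure
;; Hypothetical helper: checks associativity of `op` for a single triple.
(defn associative? [op a b c]
  (= (op (op a b) c)
     (op a (op b c))))

;; A few positive integers to test with.
(def samples [1 2 3 10 25])

;; Addition passes the check for every combination of samples.
(def addition-ok?
  (every? true?
          (for [a samples, b samples, c samples]
            (associative? + a b c))))

;; Subtraction fails it, for example (1 - 2) - 3 differs from 1 - (2 - 3).
(def subtraction-ok?
  (every? true?
          (for [a samples, b samples, c samples]
            (associative? - a b c))))
```

Here addition-ok? evaluates to true and subtraction-ok? to false, which is why positive integers form a semigroup under addition but not under subtraction.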
Does this mean that all of these abstractions implement a functor protocol? Unfortunately, that is not the case. As Clojure is a dynamic language and did not originally have protocols built in (they were added in version 1.2 of the language), these frameworks tend to implement their own versions of the map function, which doesn't belong to any protocol in particular.
The only exception is imminent, which implements the protocols included in fluokitten, a Clojure library providing concepts from category theory, such as functors[2].
The following is a simplified version of the functor protocol found in fluokitten:
(defprotocol Functor
  (fmap [fv g]))
As we mentioned previously, functors define a single operation. fmap applies the function g to whatever value is inside of the container (functor), fv.
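To see what implementing such a protocol could look like, the following standalone sketch defines a local protocol with the same shape as the simplified one and extends it to two containers. Note the assumptions: this Functor is a stand-in defined on the spot, not fluokitten's actual protocol, and Box is a hypothetical single-value container invented for the example.

```clojure
;; Local stand-in with the same shape as the simplified protocol;
;; this is NOT fluokitten's definition.
(defprotocol Functor
  (fmap [fv g]))

;; Hypothetical container that holds exactly one value.
(defrecord Box [v])

(extend-protocol Functor
  Box
  ;; Apply g to the wrapped value and put the result in a new Box.
  (fmap [fv g] (Box. (g (:v fv))))

  clojure.lang.IPersistentVector
  ;; Apply g to every element and return a new vector.
  (fmap [fv g] (mapv g fv)))

(fmap (Box. 31) #(* % 2)) ;; a Box holding 62
(fmap [1 2 3] inc)        ;; [2 3 4]
```

In both cases, the caller only needs to know about fmap; the container decides how the function reaches its contents.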
However, implementing this protocol does not guarantee that we have actually implemented a functor. This is because, in addition to implementing the protocol, functors are also required to obey a couple of laws, which we will examine briefly.
The identity law is as follows:
(= (fmap a-functor identity) (identity a-functor))
The preceding code is all that we need to verify this law. It simply says that mapping the identity function over a-functor is the same as simply applying the identity function to the functor itself.
The composition law is as follows:
(= (fmap a-functor (comp f g)) (fmap (fmap a-functor g) f))
The composition law, in turn, says that if we compose two arbitrary functions, f and g, and map the resulting function over a-functor, the result is the same as mapping g over the functor and then mapping f over the resulting functor.
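Both laws can be tried out right away with something that we already have: vectors, with mapv playing the role of fmap (keep in mind that mapv takes the function first). This is only a sketch for one sample vector and one pair of functions; a-functor, f, and g are arbitrary choices made for the illustration.

```clojure
;; Vectors with mapv make a convenient test bed for the two laws.
(def a-functor [1 2 3])
(def f #(* 2 %))
(def g inc)

;; Identity law: mapping identity changes nothing.
(def identity-law-holds?
  (= (mapv identity a-functor)
     (identity a-functor)))

;; Composition law: mapping (comp f g) equals mapping g and then f.
(def composition-law-holds?
  (= (mapv (comp f g) a-functor)
     (mapv f (mapv g a-functor))))
```

Both identity-law-holds? and composition-law-holds? evaluate to true; each side of the composition check produces [4 6 8].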
No amount of text can replace practical examples, so we will implement our own functor, which we will call option. We will then revisit the laws, to ensure that we have respected them.
The option functor
As Tony Hoare once put it, null references were his billion-dollar mistake[3]. Regardless of your background, you will no doubt have encountered versions of the dreadful NullPointerException. This usually happens when we try to call a method on an object reference that is null.
Because of its interoperability with Java, Clojure has to deal with null values, too. In this section, we will see how Clojure provides improved support for dealing with them.
The core library is packed with functions that do the right thing if passed a nil value (Clojure's version of Java's null). For instance, how many elements are there in a nil sequence? Consider the following code snippet:
(count nil) ;; 0
Thanks to conscientious design decisions regarding nil, we can, for the most part, afford not to worry about it. For all other cases, the option functor might be of some help.
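A few more core functions behave in the same forgiving way. The following snippet is just a sampler; the comments show what each call returns when given nil:

```clojure
;; nil is treated as an empty collection or as an absent value.
(first nil)      ;; nil
(rest nil)       ;; ()
(get nil :name)  ;; nil
(:name nil)      ;; nil
(conj nil 1)     ;; (1)
(map inc nil)    ;; ()
```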
The remaining examples in this appendix should be in a file called option.clj, under library-design/src/library_design/. You're welcome to try this in the REPL, as well.
Let's start our next example by adding the namespace declaration, as well as the data that we will be working with:
(ns library-design.option
  (:require [uncomplicate.fluokitten.protocols :as fkp]
            [uncomplicate.fluokitten.core :as fkc]
            [uncomplicate.fluokitten.jvm :as fkj]
            [imminent.core :as i]))

(def pirates [{:name "Jack Sparrow"    :born 1700 :died 1740 :ship "Black Pearl"}
              {:name "Blackbeard"      :born 1680 :died 1750 :ship "Queen Anne's Revenge"}
              {:name "Hector Barbossa" :born 1680 :died 1740 :ship nil}])

(defn pirate-by-name [name]
  (->> pirates
       (filter #(= name (:name %)))
       first))

(defn age [{:keys [born died]}]
  (- died born))
As a Pirates of the Caribbean fan, I thought it would be interesting to play with pirates in this example. Let's suppose that we would like to calculate Jack Sparrow's age. Given the data and functions that we just covered, this is a simple task:
(-> (pirate-by-name "Jack Sparrow") age) ;; 40
However, what if we would like to know Davy Jones' age? We don't actually have any data for this pirate, so if we run our program again, the following is what we'll get:
(-> (pirate-by-name "Davy Jones") age) ;; NullPointerException clojure.lang.Numbers.ops (Numbers.java:961)
There it is. The dreadful NullPointerException. This happens because, in the implementation of the age function, we end up trying to subtract two nil values, which is not a valid operation. As you might have guessed, we will attempt to fix this by using the option functor.
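It can help to see exactly where the failure comes from. The standalone sketch below repeats the age function and splits the problem into its two steps: destructuring nil quietly binds both keys to nil, and only the subtraction throws. The names destructured and outcome exist only for this illustration.

```clojure
(defn age [{:keys [born died]}]
  (- died born))

;; Destructuring nil does not fail; every key is simply bound to nil.
(def destructured
  (let [{:keys [born died]} nil]
    [born died]))
;; destructured is [nil nil]

;; The subtraction is what throws, not the destructuring.
(def outcome
  (try
    (age nil)
    (catch NullPointerException _
      :npe)))
;; outcome is :npe
```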
Traditionally, option is implemented as an algebraic data type, more specifically, a sum type with two variants: Some and None. These variants are used to identify whether a value is present, without using nils. You can think of both Some and None as subtypes of option.
In Clojure, we will represent them by using records, as follows:
(defrecord Some [v])
(defrecord None [])

(defn option [v]
  (if (nil? v)
    (None.)
    (Some. v)))
As you can see, Some can contain a single value, whereas None contains nothing. It's simply a marker indicating the absence of content. We have also created a helper function, called option, which creates the appropriate record, depending on whether its argument is nil.
The next step is to extend the Functor protocol to both records, as follows:
(extend-protocol fkp/Functor
  Some
  (fmap [f g]
    (Some. (g (:v f))))

  None
  (fmap [_ _]
    (None.)))
Here's where the semantic meaning of the option functor becomes apparent: as Some contains a value, its implementation of fmap simply applies the function g to the value inside of the functor, f, which is of the type Some. Finally, we put the result inside of a new Some record.
Now, what does it mean to map a function over None? You have probably guessed that it doesn't really make sense; the None record holds no values. The only thing that we can do is return another None. As you will see shortly, this gives the option functor a short-circuiting semantic.
Note
In the fmap implementation of None, we could have returned a reference to this, instead of a new record instance. I haven't done so, simply to make it clear that we need to return an instance of None.
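For the curious, the alternative mentioned in the note could look like the following standalone sketch. It uses a local Functor protocol as a stand-in (an assumption, so that the snippet runs without fluokitten). In extend-protocol, the first parameter of each method is the instance being dispatched on, so naming it this and returning it is all that is needed.

```clojure
;; Local stand-in for the functor protocol; not fluokitten's definition.
(defprotocol Functor
  (fmap [fv g]))

(defrecord Some [v])
(defrecord None [])

(extend-protocol Functor
  Some
  (fmap [fv g] (Some. (g (:v fv))))

  None
  ;; The first argument is the None instance itself, so return it as is.
  (fmap [this _] this))

(def nothing (None.))
(identical? nothing (fmap nothing inc)) ;; true
```

Since two None records compare as equal anyway, the difference is only in allocation, not in behavior.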
Now that we've implemented the functor protocol, we can try it out, as follows:
(->> (option (pirate-by-name "Jack Sparrow"))
     (fkc/fmap age))
;; #library_design.option.Some{:v 40}

(->> (option (pirate-by-name "Davy Jones"))
     (fkc/fmap age))
;; #library_design.option.None{}
The first example shouldn't hold any surprises. We convert the pirate map that we get by calling pirate-by-name into an option, and then we fmap the age function over it.
The second example is an interesting one. As we stated previously, we have no data about Davy Jones. However, mapping age over it does not throw an exception any longer; instead, it returns None.
This might seem like a small benefit, but the bottom line is that the option functor makes it safe to chain operations together:
(->> (option (pirate-by-name "Jack Sparrow"))
     (fkc/fmap age)
     (fkc/fmap inc)
     (fkc/fmap #(* 2 %)))
;; #library_design.option.Some{:v 82}

(->> (option (pirate-by-name "Davy Jones"))
     (fkc/fmap age)
     (fkc/fmap inc)
     (fkc/fmap #(* 2 %)))
;; #library_design.option.None{}
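One detail of this implementation is worth keeping in mind: the short-circuit is triggered by None, not by nil. The option function checks for nil only once, at the start of the chain, so a function that returns nil midway leaves us with a Some that holds nil. The standalone sketch below shows this with Hector Barbossa, whose :ship is nil. It uses a local Functor protocol as a stand-in for fluokitten's (an assumption, so that it runs on its own, with the container as the first argument).

```clojure
;; Local stand-in for the functor protocol; not fluokitten's definition.
(defprotocol Functor
  (fmap [fv g]))

(defrecord Some [v])
(defrecord None [])

(defn option [v]
  (if (nil? v) (None.) (Some. v)))

(extend-protocol Functor
  Some (fmap [fv g] (Some. (g (:v fv))))
  None (fmap [_ _] (None.)))

(def barbossa
  {:name "Hector Barbossa" :born 1680 :died 1740 :ship nil})

;; :ship returns nil, and this fmap wraps it without checking.
(def ship (fmap (option barbossa) :ship))
;; ship is a Some holding nil, not a None
```

Any function mapped over ship after this point will therefore receive nil as its argument.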
At this point, some readers might be thinking about the some-> macro (introduced in Clojure 1.5) and how it effectively achieves the same result as the option functor. This intuition is correct, as demonstrated in the following code snippet:
(some-> (pirate-by-name "Davy Jones") age inc (* 2)) ;; nil
The some-> macro threads the result of the first expression through the first form if it is not nil. Then, if the result of that expression isn't nil, it threads it through the next form, and so on. As soon as any of the expressions evaluates to nil, we see that some-> short-circuits and returns nil immediately.
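The following standalone sketch shows both outcomes, along with a hand-written equivalent of the short-circuiting chain. The hand-written version is only an approximation of what the macro does for us, not its literal expansion, and the pirate map is sample data for the illustration.

```clojure
(require '[clojure.string :as str])

(def pirate {:name "Jack Sparrow" :born 1700 :died 1740})

;; Every step returns a non-nil value, so the whole chain runs.
(some-> pirate :born inc)            ;; 1701

;; :ship is missing, so the chain stops before str/upper-case is called.
;; Calling (str/upper-case nil) directly would throw.
(some-> pirate :ship str/upper-case) ;; nil

;; Roughly what the second chain does, written out by hand.
(let [p pirate]
  (if (nil? p)
    nil
    (let [ship (:ship p)]
      (if (nil? ship)
        nil
        (str/upper-case ship)))))    ;; nil
```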
That being said, a functor is a much more general concept; as long as we are working with this concept, our code doesn't need to change, as we are operating at a higher level of abstraction:
(->> (i/future (pirate-by-name "Jack Sparrow"))
     (fkc/fmap age)
     (fkc/fmap inc)
     (fkc/fmap #(* 2 %)))
;; #<Future@30518bfc: #<Success@39bd662c: 82>>
In the preceding example, even though we are working with a fundamentally different tool (futures), the chain of fmap calls that operates on the pirate did not have to change. This is only possible because both options and futures are functors, and they implement the same protocol provided by fluokitten. We have gained composability and simplicity, as we can use the same API to work with various different abstractions.
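The same idea can be sketched without any external library. In the standalone snippet below, a local Functor protocol (a stand-in for fluokitten's, with the container as the first argument) is extended to our two records and to vectors, and a single function, the hypothetical inc-then-double, is written purely in terms of fmap:

```clojure
;; Local stand-in for the functor protocol; not fluokitten's definition.
(defprotocol Functor
  (fmap [fv g]))

(defrecord Some [v])
(defrecord None [])

(extend-protocol Functor
  Some (fmap [fv g] (Some. (g (:v fv))))
  None (fmap [_ _] (None.))
  clojure.lang.IPersistentVector
  (fmap [fv g] (mapv g fv)))

;; Hypothetical pipeline that knows nothing about the container it receives.
(defn inc-then-double [fv]
  (-> fv
      (fmap inc)
      (fmap #(* 2 %))))

(inc-then-double (Some. 40)) ;; a Some holding 82
(inc-then-double (None.))    ;; a None
(inc-then-double [40 10])    ;; [82 22]
```

Adding another functor later only requires extending the protocol; inc-then-double stays untouched.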
Speaking of composability, this property is guaranteed by the second law of functors. Let's see whether our option functor respects both this law and the first (identity) law:
;; Identity
(= (fkc/fmap identity (option 1))
   (identity (option 1)))
;; true

;; Composition
(= (fkc/fmap (comp identity inc) (option 1))
   (fkc/fmap identity (fkc/fmap inc (option 1))))
;; true
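If you would like a slightly broader check, the following standalone sketch wraps both laws in helper functions and runs them against a few options, including None, with two functions that are not identity. It relies on a local Functor protocol as a stand-in for fluokitten's (so the container comes first), and the helper names identity-law?, composition-law?, and candidates are made up for the illustration. As before, passing for a few samples is evidence, not proof.

```clojure
;; Local stand-in for the functor protocol; not fluokitten's definition.
(defprotocol Functor
  (fmap [fv g]))

(defrecord Some [v])
(defrecord None [])

(defn option [v]
  (if (nil? v) (None.) (Some. v)))

(extend-protocol Functor
  Some (fmap [fv g] (Some. (g (:v fv))))
  None (fmap [_ _] (None.)))

;; Hypothetical helpers, one per law.
(defn identity-law? [fv]
  (= (fmap fv identity) fv))

(defn composition-law? [fv f g]
  (= (fmap fv (comp f g))
     (fmap (fmap fv g) f)))

(def candidates [(option 1) (option 41) (option nil)])

(every? identity-law? candidates)                           ;; true
(every? #(composition-law? % (partial * 2) inc) candidates) ;; true
```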
We're done; our option functor is a lawful citizen. The two remaining abstractions also come paired with their own laws. We will not cover the laws in this section, but I encourage you to read about them[4].