Date

The Kotlin standard library has a peculiarly long method called firstNotNullOfOrNull. The method maps a function to a collection and returns the first that evaluates to a non-null value. If there are no so elements, the method returns null. As an exercise I reproduced the method in Clojure.

A note on Runtime

A characteristic of any implementation is that the method should return early if an early element in the list evaluates to a non-null value. A more technical notion might be:

  • let the element e at index n be the first element where f(e) != null
  • the runtime of our method should not exceed O(f) * n

Solution 1: Sequence Functions

The first implementation is straightforward. Take the first element of the sequence mapped and filtered.

(defn first-not-nil-of-or-nil-simple [s fn]
  (first (->> s               ; Take the first element after...
              (map fn)        ; - Applying fn to each element
              (filter some?)  ; - Filtering out nil elements
              ))
  )

Examining Runtime

Does this implementation satisfy the above runtime? It looks like it might! Clojure's lazy sequences protect us from mapping the entire sequence before filtering and taking the first. The process is done lazily st. the following returns instantly:

(first-not-nil-of-or-nil-simple (range 100000000000000) identity)

But wait a minute, what happens when we feed it a non-lazy sequence? To do that we can use into to convert the range sequence into a vector.

(first-not-nil-of-or-nil-simple (into [] (range 100000000000000)) identity)

Now this command crashes the IDE. Great, so that's that... or is it?

Why are we crashing? It could be that constructing a vector of that length is the issue. To find this out we can introduce an expensive mapping function s.t. the mapping function dominates the runtime. We could do this with a quadratic algorithm, a network call, or a sleep, but in this case we can add a print.

Here's a helper method we can use to identify when our mapping method is being called:

(defn pident
  "Print and return the given value"
  [x]
  (do (println x) x))

And now let's use it:

; Prints x each time fn is evaluated
(first-not-nil-of-or-nil-simple (into [] (range 100000)) pident)

We can run the above and check the output. I'm expecting either a single print, or 100,000 prints. Our program has the desired output if it prints a single line.

(first-not-nil-of-or-nil-simple (into [] (range 100000)) (fn [x] (do (println x) x)))
0
1
...
30
31
=> 0

HUH! Did our program print a single line? No. Did our program print 100,000 lines? Also no. Our program printed 32 lines.

Surprise Laziness

It turns out library functions like map and filter convert a sequence into a lazy sequence. And these lazy sequences are 'chunked' into sub-groups. This is something I'll need to do more reading on.

For the purpose of this method, the runtime maintains the same big-O complexity (in terms of 'f'). That's because in the worst-case we perform a bounded number of additional evaluations of f.

Well enough of that, let's move on to another way to implement firstNotNullOfOrNull.

Solution 2: Transducers

My second solution relies on transducers. This is brand new territory for me. I still don't fully grok them, but I was able to jury-rig a transducer for this use-case. There are plenty of other reading materials on how to construct them, but my understanding is this:

  • They can be used to construct a stack of sequence computations
  • The stack is applied all at once to each element (this gives you good performance)
  • The stack may decide to drop or introduce new elements, like a reactive stream.
  • If necessary, the computations can be stateful, so that their behavior changes as they consume elements.
  • There's a reducing function at the end which combines the results however you wish.

Transducing firstNotNullOfNull

To build our sequence we know we want to map and filter like we did above. We also want to take the first element, but let's come back to that. Here's how you compose multiple transducing functions.

(comp
  (map fn)
  (filter some?))

To use this transducer we can do the following:

(def my-reducer conj)
(def my-sequence (range 100000))
(transduce
  (comp
    (map fn)
    (filter some?))
    my-reducer
    my-sequence)

We use conj as a reducing function to construct a vector.

Now we have a transducing function which returns a vector containing elements mapped and filtered to non-nil. How exactly do we get the first one here?

Using first in our transducer

Well, we can use the `first

(defn first-not-nil-of-or-nil-transducer-first-only [s fn]
  (first (transduce
           (comp
             (map fn)
             (filter some?)
             )
           conj
           s))
  )

...but it has non-optimal runtime (Even with lazy sequences):

(first-not-nil-of-or-nil-transducer-first-only (into [] (range 10000)) pident)
0
1
2
...
99999
=> 0

Using the take transducer

We now change our transducer to include a third operator, (take 1). This transducing function takes one element from the sequence. Let's try it now:

(defn first-not-nil-of-or-nil-transducer-take-only [s fn]
  (transduce
           (comp
             (map fn)
             (filter some?)
             (take 1)
             )
           conj
           s)
  )

This works, but it returns a vector containing the element we want.

(first-not-nil-of-or-nil-transducer-take-only (into [] (range 10)) pident)
0
=> [0]

So naturally we can combine the two approaches:

Combining the Above

(defn first-not-nil-of-or-nil-transducer-better [s fn]
  (first (transduce
    (comp
      (map fn)
      (filter some?)
      (take 1)
      )
    conj
    s))
  )

And it works as expected.

(first-not-nil-of-or-nil-transducer-better (into [] (range 10)) pident)
0
=> 0

Dénouement

Here's my code in full, I hope you find it useful. I've even included another approach to the transducer implementation.

(ns main)

(defn pident
  "Prints and return the given value"
  [x]
  (do (println x) x))

(defn- r-identity
  ([] nil)
  ([x] x)
  ([x y] (do
           (if (some? x) x y)))
  )

(defn first-not-nil-of-or-nil-transducer [s fn]
  (transduce
    (comp
      (map fn)
      (filter some?)
      (take 1)
      )
    r-identity
    s)
  )

(defn first-not-nil-of-or-nil-transducer-better [s fn]
  (first (transduce
    (comp
      (map fn)
      (filter some?)
      (take 1)
      )
    conj
    s))
  )

(defn first-not-nil-of-or-nil-transducer-take-only [s fn]
  (transduce
    (comp
      (map fn)
      (filter some?)
      (take 1)
      )
    conj
    s)
  )
(defn first-not-nil-of-or-nil-transducer-first-only [s fn]
  (first (transduce
           (comp
             (map fn)
             (filter some?)
             )
           conj
           s))
  )

(defn first-not-nil-of-or-nil-slow [s fn]
  (first (->> s
       (into [] (map fn))
       (filter some?)
       ))
  )

(defn first-not-nil-of-or-nil-lazy [s fn]
  (first (->> s
              (map fn)
              (filter some?)
              ))
  )

(defn first-not-nil-of-or-nil-lazy-with-first [s fn]
  (first (->> s
              (map fn)
              (filter some?)
              (take 1)
              ))
  )