Robot Has No Heart

Xavier Shay blogs here

A robot that does not have a heart

SICP Lisp interpreter in Clojure

On a lazy Sunday morning I can oft be found meandering through the classics of computer science literature. This weekend was no exception, as I put together a LISP interpreter in Clojure based off chapter 4 of The Structure and Interpretation of Computer Progams.

The code is on github, rather than including it inline here, since at 90 lines plus tests it’s getting a tad long for a snippet.

It differs from the SICP version in that the environment variable is immutable, so new versions have to be passed through to each function. This resulted in the “context” concept that encapsulates both the current expression and the environment that does with. It causes a small amount of clunky code (see map-reducer), but also allows easier managing of scoping for lambdas (see do-apply and env-extend). It matches the functional paradigm much better anyway. I also used some higher level primitives such as map and reduce that SICP doesn’t - SICP is demonstrating that they aren’t necessary, but that’s a point I’ve already conceeded and don’t feel I need to replicate.

Critique of my style warmly encouraged, I’m still new to Clojure.

OCR with Clojure and ImageMagick

Let’s write some Clojure to recognize hand-written digits. It will be fun. But first, some notes.

NOTE THE FIRST: If you actually want proper OCR with Clojure that is actually useful, perhaps try this blog post on using OpenCV and Tesseract. If you want to have some fun from first principles, come with me.

NOTE THE SECOND: This post was heavily inspired by Chapter 2 in Machine Learning in Action, which details the K nearest neighbour algorithm and pointed me to the dataset. If you dig this post, you should buy that book.

OK let’s go! Here’s what we’re going to do:

  • Take a snapshot of your handwriting.
  • Use ImageMagick to post-process it.
  • Convert the snapshot to a text format matching our training data.
  • Download and parse a training set of data.
  • Identify the digit written in the snapshot using the training data.

It’s going to be great.

Take a snapshot

Draw a single numeric digit on a piece of paper. Take a photo of it and get it on your computer. I used Photo Booth and the built-in camera on my Mac. Tight crop the picture around the number, so it looks something like:

Don’t worry if it’s a bit grainy or blurry, our classifier is going to be pretty smart.

Use ImageMagick to post-process it

The ImageMagick command line utility convert is one of those magic tools that once you learn you can never imagine how you did without it. It can do anything you need to an image. Anything. For instance, resize our image to 32×32 pixels and convert it into black and white.

1
2
3
4
5
6
7
(ns ocr.main
  (:use [clojure.contrib.shell-out    :only (sh)]))

(defn convert-image
  [in out]
  (sh "convert" in "-colorspace" "gray" "+dither" "-colors" "2"
      "-normalize" "-resize" "32x32!" out))

It took me a while to figure out this incantation. The user manual for quantize is probably the best reference you’ll find. Note that the exclamation mark in “32×32!” will stretch the dimensions of the image to be square. This is desirable since most people write too skinny, and maybe some write too fat, but we need the digits to be square otherwise everything will look like a “1”. Converting the above “5” will look like this:

I am shelling out from Clojure to transform the file. There are two other options: JMagick, which uses the C API directly using JNI, and im4java which still shells out but gives you a nice interface over the top of it. I couldn’t get the first one working (it looks like a pretty dead project, no updates for a few years), and the latter wouldn’t give me anything helpful in this case.

Convert the image into a text format

The convert program automatically formats the output file based on the file extension, you can easily convert between any graphic format you choose. For instance, convert JPG to PNG:

1
convert myfile.jpg myfile.png

As well as graphic formats though, it also supports the txt format, which looks like this:

1
2
3
4
# ImageMagick pixel enumeration: 32,32,255,rgb
0,0: (255,255,255)  #FFFFFF  white
1,0: (  0,  0,  0)  #000000  black
# etc...

That’s handy, because it can be easily translated into a bitmap with “1” representing black and “0” representing white. The “5” from above will look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
10000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000001111111111
00000000000000111111111111111111
00000000000011111111111111111111
00000000000011111111111111111110
00000000000111111111100000000000
00000000000111100000000000000000
00000000001111100000000000000000
00000000001111000000000000000000
00000000011110000000000000000000
00000000111110000000000000000000
00000000111110000000000000000000
00000000111110000000000000000000
00000000111111111000000000000000
00000000111111111000000000000000
00000000001111111100000000000000
00000000000111111110000000000000
00000000000001111111000000000000
00000000000000111111000000000000
00000000000000011111000000000000
00000000000000001111000000000000
00000000000000000111100000000000
00000000000000000111100000000000
00000000000000011111000000000000
00011111111111111111000000000000
00011111111111111110000000000000
00011111111111111100000000000000
00000111111111111000000000000000
00000000001110000000000000000000
00000000000000000000000000000000

I used the duck-streams library found in clojure.contrib to read and write the file from disk, and applied some light processing to get the data into the required format. I also used a temporary file on disk to store the data - I’m pretty sure there would be a way to get convert to write to STDOUT then process that in memory, but I didn’t figure it out. It’s handy for debugging to have the file there anyways.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
(ns ocr.main
  (:use [clojure.contrib.shell-out    :only (sh)]))
  (:use [clojure.contrib.duck-streams :only (read-lines write-lines)]))

(defn read-text-image-line [line]
  (if (= "white" (last (split line #"[,:\s]+"))) "0" "1"))

(defn load-text-image
  [filename]
  (let [lines (vec (drop 1 (read-lines filename)))
        converted (map read-text-image-line lines) ]
    (map #(apply str %) (partition 32 converted))))

(defn convert-image
  [in out]
  (sh "convert" in "-colorspace" "gray" "+dither" "-colors" "2"
      "-normalize" "-resize" "32x32!" out)
  (write-lines out (load-text-image out)))

(def temp-outfile "/tmp/clj-converted.txt")

One more function is needed to be able to load that file up again into memory. This one doesn’t need to use read-lines, since the desired format for the classification below is actually just a vector of ones and zeros, so slurp is a quick alternative which is in the core libraries.

1
2
3
4
5
6
(defn load-char-file [file]
  (let [filename (.getName file)
        tokens   (split filename #"[_\.]")
        label    (first tokens)
        contents (parse-char-row (slurp file))]
    [label contents]))

Fetch some training data

The University of California Irving provides some sweet datasets if you’re getting into machine learning. In particular, the Optical Recognition of Handwritten Digits Data Set contains nearly 2000 labeled digits provided in the 32×32 text format the snapshot is now in. All digits are in one file, with a few header rows that can be dropped and ignored.

1
2
wget http://archive.ics.uci.edu/ml/machine-learning-databases/optdigits/optdigits-orig.tra.Z
gunzip optdigits-orig.tra.Z
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
(defn parse-char-row [row]
  (map #(Integer/parseInt %) (filter #(or (= % "1") (= % "0")) (split row #""))))

(defn parse-char-data [element]
  (let [label (trim (last element))
        rows  (take 32 element)]
    [label (vec (flatten (map parse-char-row rows)))]))

(defn load-training-data
  [filename]
  (let [lines (drop 21 (read-lines filename))
        elements (partition 33 lines)]
    (map parse-char-data elements)
  ))

(def training-set (load-training-data "optdigits-orig.tra"))

This code returns an array of all the training data, each element being an array itself with the first element a label (“0”, “1”, “2”, etc…) and the second element a vector of all the data (new lines ignored, they’re not important).

Note that I’m using vec throughout. This is to force lazy sequences to be evaluated, which is a required performance optimization for this program otherwise it won’t finish calculating.

Classify our digit

This is the exciting part! I won’t go into the algorithm here (buy the Machine Learning book!), but it’s called K Nearest Neighbour and it’s not particularly fancy but works surprisingly well. If you read my last blog post, you’ll note I’ve dropped the Incanter library. It was too much mucking about and didn’t provide any value for this project. Reading datasets is pretty easy with Clojure anyways.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
(defn minus-vector [& args]
  (map #(apply - %) (apply map vector args)))

(defn sum-of-squares [coll]
  (reduce (fn [a v] (+ a (* v v))) coll))

(defn calculate-distances [in]
  (fn [row]
    (let [vector-diff (minus-vector (last in) (last row))
          label       (first row)
          distance    (sqrt (sum-of-squares vector-diff))]
    [label distance])))

(defn classify [in]
  (let [k                  10
        diffs              (map (calculate-distances in) training-set)
        nearest-neighbours (frequencies (map first (take k (sort-by last diffs))))
        classification     (first (last (sort-by second nearest-neighbours)))]
    classification))

Now to tie it all together with a main function that converts all the snapshots you pass in as arguments.

1
2
3
4
5
6
7
(defn classify-image [filename]
  (convert-image filename temp-outfile)
  (classify (load-char-file (java.io.File. temp-outfile))))

(defn -main [& args]
  (doseq [filename args]
    (println "I think that is the number" (classify-image filename))))

That’s the lot. Use it like so:

1
2
> lein run myDigits/5_0.jpg
I think that is the number 5

Hooray! Here is the full script as a gist. Let me know if you do anything fun with it.

Profiling Clojure

Tonight I was so impressed by how easy it was to profile some Clojure code using built-in JVM tools that I had to share:


Profiling Clojure.

Today I also learned more about the Incanter API, and wrote some good code to transform columns, among other things.

Exploring data with Clojure, Incanter, and Leiningen

I’m working through Machine Learning in Action at the moment, and it’s done in Python. I don’t really know Python, but I’d prefer to learn Clojure, so I’m redoing the code samples.

This blog posts show how to read a CSV file, manipulate it, then graph it. Turns out Clojure is pretty good for this, in combination with the Incanter library (think R for the JVM). It took me a while to get an environment set up since I’m unfamiliar with basically everything.

Install Clojure

I already had it installed so can’t remember if there were any crazy steps to get it working. Hopefully this is all you need:

1
sudo brew install clojure

Install Leiningen

Leiningen is a build tool which does many things, but most importantly for me is it manages the classpath. I was jumping through all sorts of hoops trying to get Incanter running without it.

There are easy to follow instructions in the README

*UPDATE: * As suggested in the comments, you can probably just `brew install lein` here and that will get you Leiningen and Clojure in one command.

Create a new project

1
lein new hooray-data && cd hooray-data

Add Incanter as a dependency to the project.clj file, and also a main target:

1
2
3
4
5
6
(defproject clj "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :dependencies [[org.clojure/clojure "1.2.0"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [incanter "1.2.3-SNAPSHOT"]]
  :main hooray_data.core)

Add some Incanter code to src/hooray_data/core.clj

1
2
3
4
5
6
(ns hooray_data.core
  (:gen-class)
  (:use (incanter core stats charts io datasets)))

(defn -main [& args]
  (view (histogram (sample-normal 1000)))

Then fire it up:

1
2
lein deps
lein run

If everything runs to plan you’ll see a pretty graph.

Code

First, a simple categorized scatter plot. read-dataset works with both URLs and files, which is pretty handy.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
(ns hooray_data.core
  (:use (incanter core stats charts io)))

; Sample data set provided by Incanter
(def plotData (read-dataset 
            "https://raw.github.com/liebke/incanter/master/data/iris.dat" 
            :delim \space 
            :header true))

(def plot (scatter-plot
            (sel plotData :cols 0)
            (sel plotData :cols 1)
            :x-label "Sepal Length"
            :y-label "Sepal Width"
            :group-by (sel plotData :cols 4)))

(defn -main [& args]
  (view plot))

Second, the same data but normalized. The graph will look the same, but the underlying data is now ready for some more math.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
(ns hooray_data.core
  (:use (incanter core stats charts io)))

; Sample data set provided by Incanter
(def data (read-dataset 
            "https://raw.github.com/liebke/incanter/master/data/iris.dat" 
            :delim \space 
            :header true))

(defn extract [f]
  (fn [data]
     (map #(apply f (sel data :cols %)) (range 0 (ncol data)))))

(defn fill [n row] (map (fn [x] row) (range 0 n)))

(defn matrix-row-operation [operand row matrix] 
  (operand matrix 
    (fill (nrow matrix) row)))

; Probably could be much nicer using `reduce`
(defn normalize [matrix]
  (let [shifted (matrix-row-operation minus ((extract min) matrix) matrix)]
   (matrix-row-operation div ((extract max) shifted) shifted)))

(def normalized-data
  (normalize (to-matrix (sel data :cols [0 1]))))

(def normalized-plot (scatter-plot
            (sel normalized-data :cols 0)
            (sel normalized-data :cols 1)
            :x-label "Sepal Length"
            :y-label "Sepal Width"
            :group-by (sel data :cols 4)))

(defn -main [& args]
  (view normalized-plot))

I was kind of hoping the normalize function would have already been written for me in a standard library, but I couldn’t find it.

I’ll report back if anything else of interest comes up as I’m working through the book.

A pretty flower Another pretty flower