Tuesday, January 10, 2012

How to turn a Python loop into a Clojure 'reduce'

This is my answer to this Stack Overflow question.

I wrote a simple stack-based virtual machine in Python, and now I'm trying to rewrite it in Clojure, which is proving difficult as I don't have much experience with Lisp. This Python snippet processes the bytecode, which is represented as a list of tuples like so:


[("label", "entry"),  ("load", 0),  ("load", 1),  ("add",),  ("store", 0)]

pc = 0
for inst in code:
  if inst[0] == "label":
    label = inst[1]
    self.labels[label] = pc
  else:
    self.code.insert(pc, inst)
    pc += 1


I'm going to give you a general solution for these kind of problems.
Most loops can be done effortlessly with a strait forward mapfilter or reduce, and if your data structure is recursive, naturally the loop will be a recursion.
Your loop, however, is a different kind of loop. Your loop accumulates a result -- which would suggests using reduce -- but the loop also carries a local variable along (pc), so it's not a strait reduce.
It's a reasonably common kind of loop. If this was Racket, I would use for/fold1, but since it's not, we will have to shoehorn your loop onto reduce.
Let's define a function called load which returns two things, the processed code and the processed labels. I will also use a helper function called is-label?.
(defn load [asm]
  (defn is-label? [x] (= (first x) :label))
  {:code <<< CODE GOES HERE >>>

   :labels
   <<< CODE GOES HERE >>>
  })
Right now, your loop does two things, it processes the code, and it processes the labels. As much as possible, I try to keep loops to a single task. It makes them easier to understand, and it often reveals opportunities for using the simpler loop constructs.
To get the code, we simply need to remove the labels. That's a call to filter.
  {:code (filter (complement is-label?) asm)

   :labels
   <<< CODE GOES HERE >>>
  }
Reduce normally has only one accumulator, but your loop needs two: the result, and the local variablepc. I will package these two into a vector which will be immediately deconstructed by the body of the loop. The two slots of the vector will be my two local variables.
The initial values for these two variables appear as the 2nd argument to reduce.
   (first
    (reduce
     (fn [[result, pc] inst]

        << MORE CODE >>

     [{} 0] asm))
(Note how the initial values for the variables are placed far from their declaration. If the body is long this can be hard to read. That's the problem Racket's for/fold1 solves.)
Once reduce returns, I call first to discard to the local variable pc and keep just the result.
Filling the body of the loop is straight forward. If the instruction is a label, assoc into the result, otherwise increase pc by one. In either case, I construct a vector containing new values for all the local variables.
     (fn [[result, pc] [_ arg :as inst]]
       (if (is-label? inst)
         [(assoc result arg pc) pc]
         [result (inc pc)]))
This technique can be used to convert any accumulator-with-locals loop into a reduce. Here's the full code.
(defn load [asm]
  (defn is-label? [x] (= (first x) :label))
  {:code (filter (complement is-label?) asm)

   :labels
   (first
    (reduce
     (fn [[result, pc] [_ arg :as inst]]
       (if (is-label? inst)
         [(assoc result arg pc) pc]
         [result (inc pc)]))
     [{} 0] asm))})
(println
  (load
   [[:label :entry]
    [:load 0]
    [:load 1]
    [:label :exit]
    [:add]
    [:store 0]]))

No comments: