Loops in Functional Languages
Adventures in Programming
In this article, we are going to look at how loops are used, first in an imperative language, like Python, then in a pure functional language, like Elm. While the particular languages used are not as important as the ideas, these are languages that I have used a lot, and which I like a lot, though for different purposes. The first few examples will be mathematical, but involve nothing more exotic that adding numbers. The very last example is about text processing.
The article is organized as follows:
- Imperative Loops recalls how we do loops in an imperative language like Python, with a side note on the notion of infinity in the fourteenth century.
- A Primer on Functional Languages reviews some basic notions, that functions have types
a -> b
ora -> b -> c
. Skip if you know this already - Reduce and fold presents a standard way of doing loops in functional languages. Can be skipped if you know about
reduce
andfold
. - Functional Loops is the fun part. We will talk about the type
Step state a
and all the goodness that flows from it.
1 Imperative Loops
Let’s go back to the days when we first learned to program, computing things like the sum 1 + 2 + 3 + … + 9. This is a task perfectly suited to the for loop
, as in the Python example below.
>>> sum = 0
>>> for i in range(1,10):
... sum = sum + i
>>> sum
45
For loops are good when we know the number of times the loop body will be executed. However, that is not always the case. Consider the so-called harmonic sum,
h(n) = 1 + 1/2 + 1/3 + ... + 1/n
It was known to the mathematician|bishop Nicole Oresme (ca. 1323–1382) that this sum diverges, so that if you add all the terms, you get infinity. Since neither you nor a computer can do infinitely many additions, what this really means is the following. Choose a number L. Then there is always an n such that h(n) > L. We can write a program to find n given L:
>>> sum = 0
>>> n = 0
>>> L = 10
>>> while sum < L:
... n = n + 1
... sum = sum + 1.0/n
...
>>> n
12367
2 A Primer on Functional languages
Let’s do a quick review of some notions common in typed functional languages like Haskell and Elm. (Skip if you already know this.) Consider first this little fragment.
add : Int -> Int -> Int
add x y = x + y
It defines a function which adds two numbers:
> add 2 3
5
But what does the funny lookingInt -> Int -> Int
mean? Two points to explain this. First, every piece of data has a type. Thus 2 has type Int
and “hello” has type String
. Second, functions are data, so they must also have a type. Consider this snippet:
> String.length "hello"
5
The function String.length
takes a String
as input and produces an Int
as output. Thefore it has type String -> Int
. Upping our game, the function add
takes two Int’s
as input and produces an Int
as output, so it has type Int -> Int -> Int
. This notation suggests something. What is the meaning of the expression add 1
? Well, it is a function of type Int -> Int
, namely the function that adds 1 to its input. By applying add
to the integer one, we obtain a function of type Int -> Int
from a function of type Int -> Int -> Int
. This sort of things, called “currying” by some and “partial application” by others, is quite useful, as we see next.
Mapping a Function over a List
Consider the task of incrementing all the integers in a list. We can do it this way:
> List.map (add 1) [1, 3, 5, 7]
[2,4,6,8]
We used List.map
to apply the function add 1
to each element of the list [1, 3, 5 7]
, obtaining the new list [2, 4, 6, 8]
. Quiz time! Since List.map
is a function, it must have a type. What is it? Well, it is something more exotic than we have seen so far:
(a -> b) -> List a -> List b
This means that List.map
takes two arguments. The first argument is a function of type a -> b
. This function (let’s call it f
) takes things of type a
and returns things of type b
. The second argument is a list of things of type a
. Thus List.map
applies f
to every element a list of things of type a
to produce a list of things of type b
.
3 Reduce and Fold
One way to do loop computations in a language of pure functions is to use reduce or a fold. We will use List.foldl
(“fold left”) in Elm. This is a function which takes three arguments:
- a reducer, which is a function type
a -> b -> b
- an initial value of type
b
- a list of things of type
a
The result is a thing of type b
. As an example, we find the total number of characters in a list of strings. First, make the definition
> reducer str acc = acc + String.length str
Thus reducer "hello" 2
has the value 5 + 2 = 7. Now do this:
> List.foldl reducer 0 ["the", "little", "frog", "jumped", "high"]
23
The general form is List.foldl reducer accumulator inputList
. List.foldl
operates by repeatedly “eating” elements of inputList
, using reduce
to updateaccumulator
each time. When the list is empty, it returns the accumulator. Done!
The sum 1 + 2 + … + n
Let’s redo some of the computations of the first section using pure functions. To add the numbers 1, 2, … 9, we do this:
> List.foldl (\i acc -> i + acc) 0 (List.range 1 9)
45
Here \i acc -> i + acc
is the “anonymous” function which adds its arguments i
and acc
. Naturally enough, List.range
is used to create lists of integers:
> List.range 1 9
[1,2,3,4,5,6,7,8,9]
Harmonic sums
Let’s compute the harmonic sum
h(n) = 1 + 1/2 + 1/3 + ... + 1/n
To this end, define a function that makes a list of integers [1, 2, ..., n]
:
> seq n = List.range 1 n
<function> : Int -> List Int> seq 4
[1,2,3,4]
Then set
> h n = List.foldl (\i acc -> 1/(toFloat i) + acc) 0 (seq n)
<function> : Int -> Float
A brief test:
> h 1
1 : Float> h 2
1.5 : Float
Looks, good, so now we can do a real computation:
> h 1000
7.485470860550343
4 Functional Loops
The last example showed how we can compute the harmonic sum h n
using List.foldl
. But how do we do computations where the number of loop iterations is not known in advance? This is where a new type definition comes in:
type Step state a
= Loop state
| Done a
The state
will be used to carry some, well, state through the computation, with the final value being of type a
. Both state
and a
are type variables, which makes this a very general definition, applicable to many situations. A value of this type can be either Loop state
or Done a
. In the first case, Loop
, there is more computation to be done using the value state
. In the second case, Done
, the computation is complete, and is a value of type a
.
The loop
function works with Step state a
and a function
nextState : state -> Step state a
to drive the computation:
loop : state -> (state -> Step state a) -> a
loop s nextState =
case nextState s of
Loop s_ ->
loop s_ nextState
Done b ->
b
The function nextState : state -> Step state
a computes the next state (or final value) from the current state. Let’s illustrate this with the problem of computing the sum 1 + 2 + … + n. Define a type alias ST
to hold two values: a counter
, which keeps track of how much computation remains to be done, and a value
, in which the final result is built up, step by step:
type alias ST =
{ counter : Int, value : Int }
The next-state function is defined below:
f : ST -> Step ST Int
f st =
case st.counter of
0 ->
Done st.value _ ->
Loop { st | counter = st.counter - 1
, value = st.value + st.counter }
To see how it works, consider these two examples:
> f { counter = 0, value = 4}
Done 4
and
> f {counter = 2, value = 4}
Loop { counter = 1, value = 6 }
The effect of loop
is much like that of List.fold
. At each stage of the computation, the counter
is added to the current value
and then the counter
is decreased by 1. This continues until the counter reaches 0, at which point value
is returned. Note that the loop
construct requires some care. What would happen if we said counter = st.counter + 1
?
Harmonic sums again
The same construct applies with almost no change to compute harmonic sums. First, a slight redefinition
type alias ST2 =
{ counter : Int, value : Float }
Then a new next-state function:
f2 : ST2 -> Step ST2 Float
f2 st =
case st.counter of
0 ->
Done st.value _ ->
Loop { st | counter = st.counter - 1
, value = st.value + 1 / toFloat st.counter }
Finally, the computation:
> h n = loop { counter = n, value = 0} f2
<function> : Int -> Float> h 2
1.5 : Float> h 100
5.1873775176396215
Functional While Loops
At long last we give the functional version of the computation
>>> sum = 0
>>> n = 0
>>> L = 10
>>> while sum < L:
... n = n + 1
... sum = sum + 1.0/n
...
>>> n
12367
To begin, define a type alias ST3
which holds the necessary information: the limit value L, the current value of n, the current sum:
type alias ST3 =
{ limit : Float, n : Int, sum : Float }
Then define the next-state function that either terminates the computation or advances it:
f3 : ST3 -> Step ST3 Int
f3 st =
if st.value >= st.limit then
Done st.index else
Loop { st | index = st.index + 1
, value = st.value + 1 / toFloat (st.index + 1) }
Finally use loop
to run the next-state function on some initial data:
> loop { limit = 10.0, n = 0, sum = 0} f3
12367
We can roll this work into the following definition:
numberOfTerms : Float -> Int
numberOfTerms l = loop { limit = l, n = 0, sum = 0} f3
Now comes the interesting question: is numberOfTerms
really a function? That is, given a number l: Float
, will int always return an Int
? The answer cannot be found by inspecting the code. Rather, one needs another piece of information, Oresme’s result that the harmonic series diverges. Given this, one can say that numberOfTerms l
will return a result given a enough time. The catch, however, is that the time may be longer than the lifetime of the computer on which the program runs. This make the loop
construct fundamentally different from fold
and reduce
, which are guaranteed to terminate.
Text Processing
We end with an optional, somewhat long example of text processing. Imagine an array of strings (“lines”) in which lines have no newline character and in which paragraphs are separated by empty strings. The goal is to find the paragraph boundaries corresponding to a line with given array index. To this end, set
type ParagraphBoundary
= BeginParagraph
| EndParagraph
The function we seek looks like this:
> boundary : ParagraphBoundary -> Int -> Array String -> Int
boundary boundary lineNumber lines = ...
Thus the function function call boundary BeginParagraph 5 lines
will find the beginning of the paragraph containing the line with array index 5. For example, if
someLines = ["a", "b", "", "c", "d", "e", "f", "", "g", "h"]
the boundary BeginParagraph 5 someLines = 3
, while boundary EndParagraph 5 = 6
. We present the implementation of the boundary
function below with little comment. They main point is that by using the loop
construct, one achieves an efficient implementation that generally requires scanning only a small part of the array.
Implementation
Since we need to be able to search both forwards and backwards, we make this definition:
type Direction
= Forward
| Backward
The state is defined by
type alias ST4 =
{ lastIndex : Int, lines : Array String, currentLine : Int }
with boundary
defined like this:
boundary : Boundary -> Position -> Array String -> Int
boundary boundary position lines =
let
initialState =
{ lastIndex = Array.length lines - 1
, lines = lines
, currentLine = position.line }
in
case boundary of
BeginParagraph ->
loop initialState (next Backward)
EndParagraph ->
loop initialState (next Forward)
Finally, here is the next-state function:
next : Direction -> ST4 -> Step ST4 Int
next direction st =
case Array.get st.currentLine st.lines == Just "" of
True ->
Done st.currentLine
False ->
case direction of
Forward ->
case st.currentLine < st.lastIndex of
True ->
Loop { st | currentLine = st.currentLine + 1 } False ->
Done st.lastIndex
Backward ->
case st.currentLine == 0 of
True ->
Done 0
False ->
Loop { st | currentLine = st.currentLine - 1 }