Purely functional programming according to wikipedia
designates a programming paradigm that treats all computation as the evaluation of mathematical functions. Purely functional programing may also be defined by forbidding changing state and mutable data.
The importance of the purely functional approach cannot be overstated: it eliminates entire classes of mutation-related bugs; encourages composable abstraction; pure code can be reasoned about equationally; the explicit segregation of side-effects acts as a forcing function for better abstraction. At the same time, immutable persistent data structures can be cheaply incremented while shared safely across multiple threads. Pure programs also admit compile-time optimizations that would be tricky to achieve in the presence of state.
Benefits aside, purity is an attribute of expressions and their formal semantics. A pure program would still end up executing on regular computer hardware, where it could demonstrate the following effects: high CPU utilization, excessive heap allocations, stack overflows, divizion by zero errors, etc. Thus the connection of pure code to functions of the mathematical variety should not be considered a literal one, otherwise all correct Fibonacci implementations would be equal by virtue of extensionality. Being able to reason how code —pure or otherwise— executes on the underlying hardware is critically important for the working programmer.
An ML Tradition
Transforming purely functional declarations into equivalent, optimal imperative code is a holy grail, still very much a topic of active research. In the general case though, performance can only be attained by forsaking purity. This is very much compatible with the ML philosophy, which has traditionally embraced imperative features in its language core. Quoting from Real World OCaml
Pure code is the default in OCaml, and for good reason—it’s generally easier to reason about, less error prone and more composable. But imperative code is of fundamental importance to any practical programming language, because real-world tasks require that you interact with the outside world, which is by its nature imperative. Imperative programming can also be important for performance. While pure code is quite efficient in OCaml, there are many algorithms that can only be implemented efficiently using imperative techniques.
OCaml offers a happy compromise here, making it easy and natural to program in a pure style, but also providing great support for imperative programming.
This also applies particularly well to F#, an OCaml dialect for the .NET framework. In my personal experience, performance gains when switching from purely functional to imperative code in F# can be particularly dramatic. Strategically adopting imperative implementations in performance-critical components is often vital to ensuring sustainability of a system.
Imperative Programming done right
So what does this all mean? Should we just give up on purely functional code in the interest of performance? Is F# really just a way to write C# programs in ML-style syntax?
Well, not really. I do claim though that there is a sweet spot where impure features can be utilized to combine the benefits of pure FP with a lot of the efficiency of procedural code. Defining that sweet spot is hard, however I cannot think of a more concise example than Nessos Streams, which provides a highly efficient functional streams implementation in just over 40 lines of code. Other examples could include FParsec and Hopac, whose core implementations have been written in C# out of performance considerations.
If one had to nominate a key guiding priciple for using imperative features in F# code, then surely we would single out referential transparency. Very roughly, a function is referentially transparent if it behaves like a pure function, even if its implementation might use imperative constructs. For example, the function
let incr (cell : int ref) : unit = cell := !cell + 1
is not referentially transparent because substituting a call to
incr with the the result of the computation is not a behaviour-preserving transformation, in general.
We can still use imperative features to define functions that are in fact referentially transparent. A good example is implementing
groupBy for lists. A purely functional
groupBy could be written as follows:
let groupBy (proj : 'T -> 'Key) (ts : 'T list) : ('Key * 'T list) list = let folder (map : Map<'Key, 'T list>) (t : 'T) = let key = proj t let grouped = defaultArg (Map.tryFind key map)  Map.add key (t :: grouped) map ts |> List.fold folder Map.empty |> Map.toList |> List.map (fun (k,v) -> k, List.rev v)
This implementation may be pure, however performance-wise it is not ideal. We can still produce a more efficient, imperative equivalent of the same function:
let groupBy' (proj : 'T -> 'Key) (ts : 'T list) : ('Key * 'T list) list = let dict = new Dictionary<'Key, ResizeArray<'T>>() for t in ts do let key = proj t let mutable array = null if not <| dict.TryGetValue(key, &array) then array <- new ResizeArray<'T>() dict.Add(key, array) array.Add t dict |> Seq.map (function KeyValue(k,v) -> k, Seq.toList v) |> Seq.toList
The second function provides order-of-magnitude performance gains, but is also referentially transparent: it can be substituted with the pure implementation or the resulting value without affecting behaviour of the overall program. In fact, this approach very much reflects how the core library implements its own functions.
Purely functional programming in F# provides a very effective means of reasoning about code, however in the interest of efficiency it is often preferable to leverage the imperative aspects of the language. When done carefully, it can result in better performing code that sacrifices few if any of the benefits of the purely functional approach.
In fact, my claim is that this is very much the idiomatic way of using ML in general and F# in particular. Pretending that F# is a purely functional language à la Haskell is highly counterproductive, particularly when authoring production-grade projects.