A declarative argument parser for F#

When it comes to command line argument parsing in the F# projects I work on, my library of choice so far has been the argument parser available with the F# powerpack. While ArgParser is a simple implementation that works well, I always felt that it didn’t offer as declarative an experience as I would like it to have: extracting the parsed results can only be done through the use of side-effects.

It becomes even uglier when you need to combine this with configuration provided from App.Config, resulting in arduous code that (in most cases) adheres to the following pattern:

  1. Parse configuration file for parameter “foo”.
  2. Parse command line arguments for parameter “foo”.
  3. Configuration file is overriden if declared otherwise in command line.

I felt that this configuration parsing scheme is a pattern that can and should be handled transparently. After a bit of experimentation, I ended up with a library of my own, which you can find uploaded in github. What follows is an informal walkthrough of what it does.

The Basic Idea

The library is based on the simple observation that configuration parameters can be naturally described using discriminated unions. For instance:

UnionArgParser takes such discriminated unions and generates a corresponding argument parsing scheme. For example, the parser generated from the above template recognizes the syntax

--working-directory /var/run --listener localhost 8080 --detach

yielding outputs of type Argument list. The syntax is infered from the union type without the need to specify any additional metadata.

The parser will also look for the following keys in the AppSettings section of the application config file:

As mentioned previously, command line arguments will override their corresponding configuration file entries. This default behaviour can be changed, however.

Usage

A minimal example using the above union can be written as follows:

While getting a single list of all parsed results might be useful for some cases, it is more likely that you need to query the results for specific parameters:

Querying using quotations enables a simple and type safe way to deconstruct parse results into their constituent values.

Customization

The parsing behaviour of the configuration parameters can be customized by fixing attributes to the union cases:

In this case,

  • Mandatory: parser will fail if no configuration for this parameter is given.
  • NoCommandLine: restricts this parameter to the AppSettings section.
  • AltCommandLine: specifies an alternative command line switch.

The following attributes are also available:

  • NoAppConfig: restricts to command line.
  • Rest: all remaining command line args are consumed by this parameter.
  • Hidden: do not display in the help text.
  • GatherAllSources: command line does not override AppSettings.
  • ParseCSV: AppSettings entries are given as comma separated values.
  • CustomAppSettings: sets a custom key name for AppSettings.

Post Processing

It should be noted here that arbitrary unions are not supported by the parser. Union cases can only contain fields of certain primitive types, that is int, bool, string and float. This means of course that user-defined parsers are not supported. For configuration inputs that are non-trivial, a post-process facility is provided.

This construct is useful since exception handling is performed by the arg parser itself.

Final Remarks

I have used this library extensively in quite a few of my projects and have so far been satisfied with its results. As always, you are welcome to try it out for yourselves and submit your feedback. I anticipate that some people might complain that this isn’t a very idiomatic implementation, since most of it depends on reflection. Using parser records would have been much more straightforward implementation-wise, but, there is a certain elegance in declaring unions of parameters that simply cannot be ignored 🙂

EDIT: A NuGet package has now been uploaded.

A declarative argument parser for F#

Parametric open recursion, Pt. 2

In my previous post, I demonstrated how one can define poly-variadic fixpoint combinators in a strict language like F# using references. Today, we are going to expand on these ideas to construct a fixpoint combinator that allows open recursion over an indexed family of functions:

In this implementation, recursive bindings are made possible by early memoization of function references. This implies that an equality semantics is required for the domain of parameters. Also, just as in the poly-variadic case, the evaluation of references is delayed through the use of eta expansion.

Example: compiling regular expressions

Consider the following regular expression language:

Suppose I need to define a function Regex<'T> -> 'T [] -> bool which, given a regular expression, precomputes its recognizing predicate. There are many ways to write this, but the parametric combinator provides a particularly elegant solution:

Needless to say, the above will not work with the traditional Y combinator.

Extending the combinator

In the previous entry I described how the poly-variadic fixpoint can be extended beyond function spaces to any type that supports the notion of “delayed dereferencing”. The same pattern is observed with the parametric fixpoint. In fact, this construct can be generalized to a point where its type signature becomes essentially the same as that of the traditional Y combinator:

A similar version to the above can be found in the implementation of FsCoreSerializer.

Parametric open recursion, Pt. 2

Parametric open recursion, Pt. 1

Consider the following situation: I need to define a function Param -> a -> b so that given a parameter p : Param my algorithm precomputes a function a -> b. Emphasis here goes to precomputation, which you can assume is an expensive operation. There are a few examples of such a pattern, this could be a regular expression parser (or indeed any parser), a denotational semantics, or, as it happens in my case of interest, a function that takes a type and returns a serializer for that type.

All this might sound a bit trivial, but what happens if I need to add recursion into the mix? What if I needed to define the Kleene star in terms of itself or if I needed to define serialization rules for (mutual) recursive types? One would be inclined to point out that this is easily solvable by applying the traditional fixpoint combinator, but this does not play well with our original assumption: the so-called precomputation stage will have to be performed every time a recursive call is made, which renders it neither efficient nor a precomputation.

Poly-variadic fix-point combinators

It is a well-known fact that mutual recursion can be encoded into the traditional Y combinator. This idea is reflected in certain constructs that allow the definition of mutual recursive functions without utilizing explicit language support for such. These are known as poly-variadic fixpoint combinators. An interesting example lies with Haskell, whose non-strict semantics permit a very succinct implementation:

How do you define the same combinator in a strict language like F#? There a few ways you could go about doing that, like wrapping around lazy types or fixing the arity of your defined functions, but it turns out you can preserve the original type signature with a bit of cheating:

This really works, and you can try it out for yourself at F# snippets. In fact, as my colleague Nick Palladinos has pointed out, it is significantly more efficient than your standard Y combinator (even though it remains orders of magnitude slower than idiomatic tail recursion).

A key observation to be made about this implementation is the use of “eta expansion” over the function references. The resulting functions have the property of preserving the content of the enclosed ref cell at the time of invocation. I would call this transformation “delayed dereferencing”, but I leave it to the more PLT inclined to suggest a better name for this. It could be further noted that similar combinators are possible in other domains that support this notion of delay, such as tuples of functions or even lazy types.

In my next entry I will be describing how the above ideas can be applied to solve the parametric recursion problem in F#.

Parametric open recursion, Pt. 1

Serializing F# types

One of the many performance challenges we have faced during our development of {m}brace is that of serialization. As you might have heard[1,2,3] {m}brace is a declarative cloud programming framework -currently under development- that aspires to become the “next big thing” in big data analytics.

Serialization performance of F♯ types with the available .NET formatters is not exactly ideal, and this problem is amplified in our setting where transmission of arbitrary user-defined objects is commonplace. Some cases like trees are simply too slow, whereas others such as huge lists are outright fatal, leading to potential stack overflows.

Our solution to this problem is an implementation that at its core combines two ideas already out there:

  1. The excellent serializer by Anton Tayanovskyy (http://fssnip.net/6u) that compositionally constructs its formatter functions by recursively traversing the reflected types of input objects.
  2. The FsReflect library by Kurt Schelfthout (https://bitbucket.org/kurt/fsreflect) that uses dynamic methods to make the F# reflection API blazing fast.

The combination of the two yields a remarkably fast serializer when it comes to large tree structures and quotations. I have observed performance of up to 10-20x faster than BinaryFormatter or NetDataContractSerializer. Additionally, this implementation supports object caching, fallback serialization for non F# types, dynamic resolution of nested untyped fields and declaration of custom formatters. All round, this works as a general-purpose binary serializer.

I have named it quite unoriginally FsCoreSerializer and the source code has been uploaded to github. As always, your remarks and feedback would be very much appreciated.

EDIT: A nuget package is now available.

Serializing F# types