 Plasma GitLab Archive Projects Blog Knowledge  BLOG ON CAMLCITY.ORG: O'Caml Explained

# Mixing Apples And Pears

Using Polymorphic Variants - by Gerd Stolpmann, 2008-02-05

It is one of the coolest language constructs, but its conception leads sometimes to confusion. O'Caml allows it to form ad-hoc unions of tagged values, the so-called polymorphic variants. They are the free-style counterpart of the "normal" variant types. We want to shed some light on this construction in this article, and encourage programmers to try it out.

The most baffling property of the polyvariants is that one can mix tags that come from different pieces of code. We'll give an example of that later in the text, but first let's explain some foundations. Syntactically, the tags are written with a leading reverse apostrophe, so for example ``Apple` is a (value-less) tag. Like the normal variants the tags can have attached values, so for instance ``Pear "it's sweet man"` is a tag with a string value. It is not required to declare polyvariant types, one can simply start creating such tagged values in the code.

## Data Analysis Step-By-Step

Imagine we want to analyze a string. In a first step, we would like to classify every character of the string, and determine whether it is a letter, a digit, or something else. Using polyvariants, this function does the job:

```let classify_chars s = let rec classify_chars_at p = if p < String.length s then let c = s.[p] in let cls = match c with | '0'..'9' -> `Digit c | 'A'..'Z' | 'a'..'z' -> `Letter c | _ -> `Other c in cls :: classify_chars_at (p+1) else [] in classify_chars_at 0 ```

So this function would return this list for the input string "a56*":

```[ `Letter 'a'; `Digit '5'; `Digit '6'; `Other '*' ] ```

Note that there is no type declaration! If you enter this function into the O'Caml toploop, you see that its type is inferred like this:

```val classify_chars : string -> [> `Digit of char | `Letter of char | `Other of char ] list = <fun> ```

Read this as: The function returns tagged values with tags ``Digit`, ``Letter`, or ``Other`, and every tag has an attached character. Note the "greater than" sign at the beginning of the tag list. It means that this tag list is compatible with being mixed with completely unrelated tags. There are also polyvariant types where this sign is reversed (like in `[<...]`), or completely missing. We'll come back to that later.

Back to our string analysis example. We are now interested in recognizing integer numbers in the list of classified characters. We assume our input is a list that contains ``Digit` tags, but also other tags. In the output list, sequences of ``Digit` are replaced by ``Number`:

```let recognize_numbers l = let rec recognize_at m acc = match m with | `Digit d :: m' -> let d_v = Char.code d - Char.code '0' in let acc' = match acc with | Some v -> Some(10*v + d_v) | None -> Some d_v in recognize_at m' acc' | x :: m' -> ( match acc with | None -> x :: recognize_at m' None | Some v -> (`Number v) :: x :: recognize_at m' None ) | [] -> ( match acc with | None -> [] | Some v -> (`Number v) :: [] ) in recognize_at l None ```

The inferred type of this function is now really strange:

```val recognize_numbers : ([> `Digit of char | `Number of int ] as 'a) list -> 'a list = ```

Basically, it says that there is a tagged list as input, and that the output list has the same tags. Furthermore, the ">" sign again signals extensibility, so we can not only use the tags mentioned in the function, but any other tag as well. Especially, we are free to pass ``Letter` and ``Other` tags in:

```recognize_numbers [ `Digit '1' ; `Digit '3'; `Letter 'a'; `Digit '2'; `Other '*' ] ``` yields ```[ `Number 13; `Letter 'a'; `Number 2; `Other '*'] ```

Note that the type of the `recognize_numbers` function does not reflect all what we could know about the function. We can be sure that the function will never return a ``Digit` tag, but this is not expressed in the function type. We have run into one of the cases where the O'Caml type system is not powerful enough to find this out, or even to write this knowledge down. In practice, this is no real limitation - the types are usually a bit weaker than necessary, but it is unlikely that weaker types cause problems.

The really great thing about the polymorphic variants is that it is possible to mix tags that come from different contexts. So in our example we can combine the two functions, and apply one after the other:

```let analyze s = recognize_numbers (classify_chars s) ```

It is no problem that `classify_chars` emits tags that are completely unknown to `recognize_numbers`. And both functions can use the same tag, ``Digit`, without having to declare in some way that they are meaning the same. It is sufficient that the tag is the same, and that the attached value has the same type.

This may have big advantages for structuring programs. Of course, our example of string analysis already benefits from the loose type correspondence the polyvariants make possible. The problem can now be divided into several steps, and every step needs only to know the tags it operates on. There is no global type all steps have to agree upon, rather every step sees only the fraction of type information that is needed for the task.

## Limiting Tags

Compare these two functions:

```let number_value1 t = match t with | `Number n -> n | `Digit d -> Char.code d - Char.code '0' let number_value2 t = match t with | `Number n -> n | `Digit d -> Char.code d - Char.code '0' | _ -> failwith "This is not a number" ```

The difference is that the second version explicitly catches the case of "any other tag" whereas the first version leaves this unspecified. This leads to different typings:

```val number_value1 : [< `Digit of char | `Number of int ] -> int val number_value2 : [> `Digit of char | `Number of int ] -> int ```

So in the first version we have a "<" sign! This sign usually only appears for input arguments, and means that the function can only process these tags (or less tags), but no other tags. The type checker prevents that any other tag can be passed in:

```# number_value1 (`Letter 'a');; This expression has type [> `Letter of char ] but is here used with type [< `Digit of char | `Number of int ] ```

In the second version of the function, this case is handled at runtime. From a typing perspective, the ">" sign signals that the function can also accept other tags than the mentioned ones. However, this function actually raises an exception.

```# number_value2 (`Letter 'a');; Exception: Failure "This is not a number". ```

So the "<" sign is a way to limit the number of tags a function can process. The programmer gets it by not adding a "catch all" case to pattern matchings. This kind of polyvariant is useful to enforce some strictness in programming, and the type checker catches the cases that would otherwise only be handed at runtime.

## Giving Polyvariants Names

Although the mantra of this article is that we don't need declarations, it is of course possible to define named polymorphic variants. For example, we could introduce these named types:

```type classified_char = [ `Digit of char | `Letter of char | `Other of char ] type number_token = [ `Digit of char | `Number of int ] ```

Note that there is no ">" or "<" sign in such definitions - it would not make sense to say something about whether more or less tags are possible than given, because the context is missing.

Using these names, one could simplify the typings of our functions:

```val classify_chars : string -> [> classified_char ] val recognize_numbers : ( [> number_token ] as 'a) list -> 'a list val number_value1 : [< number_token ] -> int val number_value2 : [> number_token ] -> int ```

As you see, the ">" or "<" sign can still appear in function types using these names. Actually, there is a special syntax behind this notation. If you just say `number_token` in a type expression, exactly the definition applies (without sign). But you can also construct new polyvariants from existing ones. For example,

```[ classified_char | number_token ] ```

would mean a type that combines the tags of both types, i.e. this is the same as if all four tags were enumerated. The syntax `[<number_token ]` is only a special case of this type constructor, where the new type also gets a sign.

## Matching Variants

Try to compile this piece of code that ought to sum up all ``Digit` and ``Number` tags of a list of any tags:

```let rec sum l = match l with | x :: l' -> ( match x with | `Digit _ | `Number _ -> number_value1 x + sum l' | _ -> sum l' ) | [] -> 0 ```

It compiles, but there is a little surprise. The compiler emits a warning that the `_ -> sum l'` case of the matching is unused, and the inferred type is just

```val sum : [ `Digit of char | `Number of int ] list -> int ```
i.e. the ">" sign is missing that would allow us to pass any tags in. What's wrong?

This is a pitfall one quickly runs into when using polyvariants. The type checker assumes that the `x` in `number_value1 x` has the same type as the `x` that is matched again. It is not sufficient to use a matched variable in the expression of the matched case to restrict its type. Fortunately, there is a special syntax for that (look at the bold "as" clause):

```let rec sum l = match l with | x :: l' -> ( match x with | (`Digit _ | `Number _) as y -> number_value1 y + sum l' | _ -> sum l' ) | [] -> 0 ```

Here, `y` is the value `x` for the case that the match applies, so `y` can have a stricter type than `x`.

Alternatively, the match condition could also have been written as:

```match x with | #number_token as y -> ... ```
i.e. one can use named polyvariants in matchings. This is just a convenience notation for the former.

## Behind The Scene

The polyvariants might be cool, but many programmers suspect that the performance of their programs suffer when they use them. Well, although there are some runtime cost, these are very small, and not noticeable for many programs.

Internally, the tags are represented by hash values of the names of the tags. So the tags are simply reduced to integers at runtime. Compared with the normal variant types, there is some additional overhead for tags with values. In particular, for storing ``X value` one extra word is needed in comparison with the normal variant `X value`.

It is possible that the hash values of variants collide, e.g.

```# type x = [ `jagJhn | `oZshTt ];; Variant tags `jagJhn and `oZshTt have same hash value. Change one of them. ```
As you see, the compiler checks for this rare case. I've never seen it in practice.

Despite rumours, there is nothing special done at link time. The tags are already cut down to integers at this point of compiling.

## Conclusion

I hope I've shown how elegant code looks that uses polyvariants to represent data cases. But there is some more to say, especially if you look at other programming languages.

The author of this article thinks that polymorphic variants are one of the features that makes O'Caml so different in comparison with mainstream languages like Java. In particular, there are competing approaches for representing data cases, and one of the radical ideas of object orientation has always been that combining data and program cases into a single class construct is the best way to deal with the problem. However, I think something has been overlooked - data and algorithms do not always walk hand in hand, and classes are inflexible if only a loose correlation between both is needed. In contrast, polyvariants give the programmer the maximum of freedom in this respect. Thus it is believed that polyvariants are a serious alternative for representing data cases.

Gerd Stolpmann works as O'Caml consultant 