<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Blog on camlcity.org</title>
    <link>http://blog.camlcity.org</link>
    <language>en</language>
    <description>Articles by Gerd Stolpmann about O'Caml</description>

    
        <item>
          <title>Ocamlnet 3 finally released</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_release.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_release.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;What&#38;#39;s new in Ocamlnet 3&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
So, finally it is here:
&#60;a href=&#34;http://projects.camlcity.org/projects/ocamlnet.html&#34;&#62;Ocamlnet 
3.0.0&#60;/a&#62;. After almost 3 years of development, many parts of Ocamlnet
have been touched and extended while keeping most of the existing APIs.
It is not immediately visible what the striking new features are, so a
bit of explanation is necessary.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
When renovating a building, it is common to do this floor by floor. In
this sense, Ocamlnet 3.0.0 focused on the foundation and the first
floor. Also, the renovation is not yet finished - many features still
need to be added, like supporting SSL for more protocols. This is now
easier thanks to some new basic APIs that have been introduced in the
first step.

&#60;/p&#62;&#60;h2&#62;Netsys&#60;/h2&#62;

&#60;p&#62;
One of the parts that got most attention is &#60;code&#62;Netsys&#60;/code&#62;, the
library adding the missing links to the operating system (OS). One of
the driving forces was the port to Win32. This led to the introduction
of generalized versions of &#60;code&#62;Unix.read&#60;/code&#62; and &#60;code&#62;Unix.write&#60;/code&#62;
calls (defined in &#60;code&#62;Netsys&#60;/code&#62;):

&#60;/p&#62;&#60;pre&#62;
val gread : fd_style -&#38;#62; Unix.file_descr -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
val gwrite : fd_style -&#38;#62; Unix.file_descr -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
&#60;/pre&#62;

For getting some Win32-specific emulations right, it is sometimes
required to call other functions instead of &#60;code&#62;Unix.read&#60;/code&#62;
and &#60;code&#62;Unix.write&#60;/code&#62;, e.g. &#60;code&#62;Netsys_win32.pipe_read&#60;/code&#62;
and &#60;code&#62;Netsys_win32.pipe_write&#60;/code&#62;. To avoid scattering such
case distinctions over the whole library, the idea of
defining these generic functions was born. In &#60;code&#62;fd_style&#60;/code&#62; the
user passes in how to handle the
descriptor. Usually &#60;code&#62;fd_style&#60;/code&#62; is automatically determined
by another function &#60;code&#62;get_fd_style&#60;/code&#62; (this requires a few
system calls and is factored out because of this). Although this mostly
targets Win32, there are already
some benefits for POSIX systems, e.g. the &#60;code&#62;fd_style&#60;/code&#62; already
encodes whether a descriptor is a socket, and whether it is connected,
which is sometimes quite useful information. In the future, this system
will be extended:

&#60;ul&#62;
  &#60;li&#62;&#60;p&#62;Seekable files are currently not well supported by the
         asynchronous I/O layer. The reason is that the &#60;code&#62;select&#60;/code&#62;
         and &#60;code&#62;poll&#60;/code&#62; system calls cannot predict whether I/O would be 
         blocking or non-blocking (and thus always say non-blocking).
         This can be improved by using special AIO calls of the OS.
         Of course, files for which AIO is to be used need to be
         flagged specially, and a new &#60;code&#62;fd_style&#60;/code&#62; could
         do so.
  &#60;/p&#62;&#60;/li&#62;&#60;li&#62;&#60;p&#62;There are also some ideas for labeling SSL sockets by a special
         &#60;code&#62;fd_style&#60;/code&#62;. This would make it a bit easier to
         support SSL throughout the library. This is a bit more work
	 than just calling &#60;code&#62;Ssl.read&#60;/code&#62; and &#60;code&#62;Ssl.write&#60;/code&#62;,
	 though, because the SSL protocol allows renegotiations at any
	 time, and a read may also require writes on the socket level,
	 and vice versa.
&#60;/p&#62;&#60;/li&#62;&#60;/ul&#62;
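
&#60;p&#62;
The dispatch idea behind the generic calls can be pictured with a small toy
model. This is only a sketch under assumptions: &#60;code&#62;toy_style&#60;/code&#62;,
&#60;code&#62;toy_get_style&#60;/code&#62; and &#60;code&#62;toy_gread&#60;/code&#62; are made-up names
and not the real &#60;code&#62;Netsys&#60;/code&#62; definitions, only two styles are
distinguished, and the buffer is a &#60;code&#62;Bytes.t&#60;/code&#62; as current Ocaml
versions require.
&#60;/p&#62;&#60;pre&#62;
(* Toy model only: the type and both functions are hypothetical,
   not the real Netsys definitions. Link with the unix library. *)
type toy_style = Plain | Socket

(* Finding out the style costs a system call, so do it once per descriptor. *)
let toy_get_style fd =
  match (Unix.fstat fd).Unix.st_kind with
  | Unix.S_SOCK -&#38;#62; Socket
  | _ -&#38;#62; Plain

(* One generic read that hides the case distinction from the callers. *)
let toy_gread style fd buf pos len =
  match style with
  | Plain -&#38;#62; Unix.read fd buf pos len
  | Socket -&#38;#62; Unix.recv fd buf pos len []

(* Usage: a pipe is not a socket, so this goes through Unix.read. *)
let () =
  let (r, w) = Unix.pipe () in
  ignore (Unix.write_substring w &#38;#34;hello&#38;#34; 0 5);
  let buf = Bytes.create 5 in
  let n = toy_gread (toy_get_style r) r buf 0 5 in
  print_endline (Bytes.sub_string buf 0 n)
&#60;/pre&#62;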

Another new idea on the &#60;code&#62;Netsys&#60;/code&#62; level is a little object
definition called &#60;code&#62;pollset&#60;/code&#62;:

&#60;pre&#62;
class type pollset =
object
  method find : Unix.file_descr -&#38;#62; Netsys_posix.poll_req_events
  method add : Unix.file_descr -&#38;#62; Netsys_posix.poll_req_events -&#38;#62; unit
  method remove : Unix.file_descr -&#38;#62; unit
  method wait : float -&#38;#62; 
                ( Unix.file_descr * 
                  Netsys_posix.poll_req_events * 
                  Netsys_posix.poll_act_events ) list
  method dispose : unit -&#38;#62; unit
  method cancel_wait : bool -&#38;#62; unit
end
&#60;/pre&#62;

A &#60;code&#62;pollset&#60;/code&#62; represents a set of file descriptor events one
wants to poll. Again, this data structure was originally required
for the Win32 port (because Win32 is very different in this respect),
but there are also advantages for Unix systems. Nowadays, there are
various improved APIs for polling such as Linux epoll or BSD kqueue.
The &#60;code&#62;pollset&#60;/code&#62; abstraction will make it very easy to support
these - the user simply selects one of the advanced implementations
of &#60;code&#62;pollset&#60;/code&#62;, and thanks to dynamic binding of object
methods it is automatically used everywhere. (One of the next
versions of Ocamlnet will allow this.)

&#60;p&#62;Another word about polling. The Ocaml runtime only provides
&#60;code&#62;select&#60;/code&#62;. Although not as bad as claimed by some people,
it imposes artificial limitations, especially about the number of
supported file descriptors. Because of this, &#60;code&#62;Netsys_posix&#60;/code&#62;
now includes a binding of the &#60;code&#62;poll&#60;/code&#62; system call, which does not
suffer from this disease. Of course, &#60;code&#62;poll&#60;/code&#62; is now the
only poll API used throughout Ocamlnet (and, as noted, even better
APIs will be supported in one of the next releases).

&#60;/p&#62;&#60;p&#62;Other additions on the OS level for Unix systems:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;&#60;code&#62;Netsys_posix.spawn&#60;/code&#62; is a new way of starting subprograms,
with special support for monitoring the subprocesses asynchronously
  &#60;/li&#62;&#60;li&#62;There are now bindings for syslog in &#60;code&#62;Netsys_posix&#60;/code&#62;
  &#60;/li&#62;&#60;li&#62;The system calls &#60;code&#62;fsync&#60;/code&#62; and &#60;code&#62;fdatasync&#60;/code&#62; are
supported
  &#60;/li&#62;&#60;li&#62;If the OS provides this call, &#60;code&#62;fadvise&#60;/code&#62; can be invoked
to control the page cache
  &#60;/li&#62;&#60;li&#62;There is also &#60;code&#62;fallocate&#60;/code&#62; to allocate disk space, provided
that the OS offers it
  &#60;/li&#62;&#60;li&#62;POSIX semaphores are supported, provided that the OS offers the complete
interface (i.e. named semaphores for synchronization between unrelated
processes)
  &#60;/li&#62;&#60;li&#62;There is a coordinator module for signals, &#60;code&#62;Netsys_signal&#60;/code&#62;,
so that various users of signals do not mutually override their handlers
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
For all systems, &#60;code&#62;Netsys&#60;/code&#62; implements:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;Wrappers for multicasting system calls on sockets
  &#60;/li&#62;&#60;li&#62;In &#60;code&#62;Netsys_mem&#60;/code&#62; there is now special support for
using bigarrays of chars as efficient I/O buffers. Such
bigarray-backed buffers are called &#60;code&#62;memory&#60;/code&#62; (reminding us
of the fact that these buffers are not relocatable like strings, but
bound to fixed memory addresses). There are functions for allocating
page-aligned or cache-line-aligned &#60;code&#62;memory&#60;/code&#62; buffers. Also,
there is experimental support for copying Ocaml values into buffers
(used by the Camlbox module, see below). Finally, there are also
versions of &#60;code&#62;read&#60;/code&#62;, &#60;code&#62;write&#60;/code&#62;, &#60;code&#62;recv&#60;/code&#62;
and &#60;code&#62;send&#60;/code&#62; operating on memory buffers rather than strings.
These versions open the door to zero-copy network I/O (if supported by
the OS).
  &#60;/li&#62;&#60;li&#62;For better support of multi-threading there is now a version of
the thread API that even exists when the thread library is not linked in,
so that especially critical sections are emulated as no-ops in the 
single-threaded case. It is hoped that more functions can be made
thread-safe by this new feature (in &#60;code&#62;Netsys_oothr&#60;/code&#62;).
  &#60;/li&#62;&#60;li&#62;The exception registry &#60;code&#62;Netexn&#60;/code&#62; is now almost outdated,
because the Ocaml standard library recently introduced a similar
feature (yes, sometimes feature wishes are honoured :-).
&#60;/li&#62;&#60;/ul&#62;

&#60;h2&#62;Equeue&#60;/h2&#62;

&#60;p&#62;
As &#60;code&#62;Netsys&#60;/code&#62; now uses pollsets to manage
polling, &#60;code&#62;Equeue&#60;/code&#62; had to be rewritten to take advantage of
this. In particular, there is now &#60;code&#62;Unixqueue_pollset&#60;/code&#62; which
is a port of the old &#60;code&#62;Unixqueue&#60;/code&#62; API around pollsets. For
the user, there is absolutely no difference.

&#60;/p&#62;&#60;p&#62;
What&#38;#39;s more important is the extension of the engine API. Ocamlnet 2
introduced engines as a way of expressing a suspended I/O possibility,
but there was only limited support for it in the library. This has now
changed - engines are now a first class member of Ocamlnet. In particular,
there are now many more synchronization primitives (e.g.
&#60;code&#62;stream_seq_engine&#60;/code&#62; for executing an open number of engines
in sequence, or &#60;code&#62;msync_engine&#60;/code&#62; for waiting for the
completion of multiple engines). This development was mostly driven by
another project of mine: Plasma (see other blog articles on this
site). Plasma uses engines for all kinds of concurrent execution of
I/O code, and while I was developing Plasma, I extended the Ocamlnet
engine API step by step.

&#60;/p&#62;&#60;p&#62;
There is also now a way to call RPC procedures with an engine:
&#60;code&#62;Rpc_proxy.ManagedClient.rpc_engine&#60;/code&#62;. This function was
also originally developed for the Plasma project.

&#60;/p&#62;&#60;p&#62;
For simpler I/O needs, I added &#60;code&#62;Uq_io&#60;/code&#62;. It contains 
&#38;#34;engineered&#38;#34; versions of simple I/O functions like &#60;code&#62;input&#60;/code&#62;,
&#60;code&#62;input_line&#60;/code&#62; or &#60;code&#62;flush&#60;/code&#62;. &#60;code&#62;Uq_io&#60;/code&#62; is
not limited to file descriptors, but works also on top of a number
of other I/O devices (including virtual ones).

&#60;/p&#62;&#60;p&#62;
The operators &#60;code&#62;++&#60;/code&#62; and &#60;code&#62;&#38;#62;&#38;#62;&#60;/code&#62; have been
introduced as abbreviations for sequential execution, and result
mapping of engines, respectively. For example, the synchronous
code

&#60;/p&#62;&#60;pre&#62;
let line1 = input_line ch_in in
let line2 = input_line ch_in in
output_string ch_out (line1 ^ line2 ^ &#38;#34;\n&#38;#34;)
&#60;/pre&#62;

would now look like this in &#38;#34;engineered&#38;#34; code:

&#60;pre&#62;
Uq_io.input_line_e d_in ++
  (fun line1 -&#38;#62;
    Uq_io.input_line_e d_in ++
      (fun line2 -&#38;#62;
        Uq_io.output_string_e d_out (line1 ^ line2 ^ &#38;#34;\n&#38;#34;)
      )
  )
&#60;/pre&#62;

Not bad, if you compare with the previous solution (hand-coding a
scanner for lines, writing the event handler routines, etc., adding
up to 100-200 lines of code).
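
&#60;p&#62;
To see why the two operators compose so well, here is a toy model of the
idea. It is only a sketch and not how Ocamlnet implements engines: an
&#38;#34;engine&#38;#34; is reduced to a function that takes a callback, and
&#60;code&#62;toy_input_line_e&#60;/code&#62; and &#60;code&#62;toy_output_string_e&#60;/code&#62;
are hypothetical stand-ins that work on a queue and a buffer instead of real
I/O devices.
&#60;/p&#62;&#60;pre&#62;
(* Toy model only: an engine is a function that calls its callback k
   with the result. Real engines are driven by the event system. *)

(* Sequential execution: run e, then pass its result to f. *)
let ( ++ ) e f = fun k -&#38;#62; e (fun x -&#38;#62; f x k)

(* Result mapping: run e, then transform its result with g. *)
let ( &#38;#62;&#38;#62; ) e g = fun k -&#38;#62; e (fun x -&#38;#62; k (g x))

(* Hypothetical stand-ins for the I/O primitives. *)
let toy_input_line_e q = fun k -&#38;#62; k (Queue.pop q)
let toy_output_string_e b s = fun k -&#38;#62; (Buffer.add_string b s; k ())

let () =
  let d_in = Queue.create () in
  Queue.add &#38;#34;foo&#38;#34; d_in;
  Queue.add &#38;#34;bar&#38;#34; d_in;
  let d_out = Buffer.create 16 in
  let e =
    toy_input_line_e d_in ++
      (fun line1 -&#38;#62;
        toy_input_line_e d_in ++
          (fun line2 -&#38;#62;
            toy_output_string_e d_out (line1 ^ line2 ^ &#38;#34;\n&#38;#34;)))
    &#38;#62;&#38;#62; (fun () -&#38;#62; Buffer.contents d_out)
  in
  (* Start the chain; the final callback prints the collected output. *)
  e print_string
&#60;/pre&#62;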


&#60;h2&#62;Netplex&#60;/h2&#62;

The development in the &#60;code&#62;Netplex&#60;/code&#62; area was focused on easing
multi-processing. With &#60;code&#62;Netplex&#60;/code&#62; it is very easy to run
code in several worker processes, e.g. for network servers. What was
missing up to now, however, was an easy way to manage the
collaboration of the processes.

&#60;p&#62;
&#60;code&#62;Netplex&#60;/code&#62; worker processes now have a number of ways to
talk to each other:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;It is now possible to store variables in a common place, so that
each process can get and set these (&#60;code&#62;Netplex_sharedvar&#60;/code&#62;).
Of course, this mechanism is typed.
  &#60;/li&#62;&#60;li&#62;There are mutexes and semaphores for synchronization
(&#60;code&#62;Netplex_mutex&#60;/code&#62; and &#60;code&#62;Netplex_semaphore&#60;/code&#62;)
  &#60;/li&#62;&#60;li&#62;Each process can be directly contacted via a private channel,
the so-called container socket. This is also an RPC mechanism, but
unlike normal RPC servers the caller directly addresses a process
(and not only a service in general, and the Netplex machinery automatically
selects the destination process). There is also a directory so that
processes can see which other processes exist.
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
The implementation of these mechanisms is not yet optimal, but the APIs
are defined and backed by simple but robust modules. It is expected that
in the future more sophisticated implementations will become available,
e.g. the &#60;code&#62;Netplex_sharedvar&#60;/code&#62; code could use a shared memory object
if the OS supports that.

&#60;/p&#62;&#60;p&#62;
Another addition is &#38;#34;levers&#38;#34;. This kind of handle exists within the
Netplex master process, but can be activated from the child processes.
It is a kind of little RPC function for a special purpose: Sometimes
the process model requires that certain functionality be executed
within the scope of the master process. An example would be the start
of another child process. By doing that via a lever, this action can
also be triggered from any child process.

&#60;/p&#62;&#60;p&#62;
Besides that there are numerous smaller enhancements. Especially
the module &#60;code&#62;Netplex_cenv&#60;/code&#62; has been extended, e.g. there
are now timers that can be attached to the Netplex event queue.

&#60;/p&#62;&#60;h2&#62;RPC&#60;/h2&#62;

The development went into two directions: First, it was aimed at a
more powerful RPC client implementation, and second, performance,
performance, performance.

&#60;p&#62;
The improved client is called &#60;code&#62;Rpc_proxy&#60;/code&#62;. All the experience
I gained at my Ocaml job at Mylife.com went into it - lots of RPC calls
in an unreliable environment (if you have hundreds of machines, one
box is always down). Clients can now be recycled, they can react
better to errors, and even load balancing and fail-over to alternate
endpoints are now supported. (See the other blog posting, &#38;#34;The next
server, please!&#38;#34;.)

&#60;/p&#62;&#60;p&#62;
Performance improvements were achieved by two means: First, the XDR
encoding and decoding were optimized. This work has not come to an end
yet, but certain XDR types like arrays of strings are now processed a
lot faster. The other strategy was to replace many string buffers by
bigarrays of char (see under &#38;#34;memory&#38;#34; above). This makes it possible to get rid
of a number of copy operations, especially when large strings are
transmitted via RPC. This new string representation is even accessible
by user code via a new XDR type &#60;code&#62;_managed string&#60;/code&#62;. This
may avoid even more copies.

&#60;/p&#62;&#60;h2&#62;Shell&#60;/h2&#62;

The API of &#60;code&#62;Shell&#60;/code&#62; is mostly the same - only a few
suspicious functions have been removed. The implementation, however,
has changed a lot.

&#60;p&#62;
&#60;code&#62;Shell&#60;/code&#62; now uses the new &#60;code&#62;Netsys&#60;/code&#62; functions for
starting subprocesses. As these functions are written in C, one gets
some immediate benefits: &#60;code&#62;Shell&#60;/code&#62; is now officially supported
for multi-threaded programs because it is possible to do the signal
handling right in C (but still, this is notoriously difficult). Also,
there is no longer any risk that the Ocaml garbage collector wants to
clean up at the worst moment, namely between fork and exec.

&#60;/p&#62;&#60;p&#62;
Another benefit is that &#60;code&#62;Shell&#60;/code&#62; now also works under Win32.
The C part is completely different, though.


&#60;/p&#62;&#60;h2&#62;Netcgi&#60;/h2&#62;

Not much has changed here, only that the old version &#60;code&#62;Netcgi1&#60;/code&#62;
is gone now.


&#60;h2&#62;Camlboxes&#60;/h2&#62;

Camlboxes are an exciting but still experimental addition. They are
designed as a fast way of sending messages between unrelated
processes. Camlboxes use shared memory for communication.

&#60;p&#62;
This works as follows: If process 1 wants to send process 2 a message,
both have to map the same memory pages into their address space. The
message is originally an Ocaml value somewhere in the private memory
of process 1. With the help of Camlbox this value is now copied to
shared memory so that, and this is the pivotal point, process 2 can
directly access the value without an additional decoding step. This
greatly reduces the overhead of message sending - actually only a
relatively fast value copy is done, bypassing any kernel-controlled
I/O devices.

&#60;/p&#62;&#60;p&#62;
Passing a short message now takes only a few microseconds.
Most of that time is spent on synchronization, of course, not on
copying. (On the hardware level, the synchronization is mostly done by
moving cache lines from one CPU core to the other, so this is some
kind of hidden copying. It is worth noting that Camlboxes are way
faster on single-core machines than on multi-cores because this
low-level synchronization is not required then.)

&#60;/p&#62;&#60;p&#62;
Camlboxes have one downside, though. They are not perfectly integrated
into the garbage collecting machinery, and because of this, one has to
follow some programming rules. In particular, there is no way to
recognize that a message (or part of it) is no longer referenced, so
messages are manually deleted, and there is of course the danger that
bad code keeps references to (or into) deleted messages. For fixing
this, we would need more help from the Ocaml GC.

&#60;/p&#62;&#60;p&#62;
Another problem is missing integration with &#60;code&#62;Equeue&#60;/code&#62;.
Camlboxes are synchronous by design - that&#38;#39;s the price for their speed.


&#60;/p&#62;&#60;h2&#62;Where to get Ocamlnet 3&#60;/h2&#62;

Look at the &#60;a href=&#34;http://projects.camlcity.org/projects/ocamlnet.html&#34;&#62;
project page&#60;/a&#62; for the newest version and links to the manual, mailing
list, etc.
&#60;br/&#62; &#60;br/&#62;



&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant.
&#60;a href=&#34;http://www.gerd-stolpmann.de/buero/work_ocaml_search.html.en&#34;&#62;
He is accepting new customers!&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Plasma build simplified</title>
          <guid>http://blog.camlcity.org/blog/plasma2.html</guid>
          <link>http://blog.camlcity.org/blog/plasma2.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;Plasma MapReduce and PlasmaFS&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
There is now a script that simplifies the Plasma build:
&#60;a href=&#34;http://download.camlcity.org/download/plasma_install.sh&#34;&#62;
plasma_install.sh&#60;/a&#62;. It just bootstraps GODI and puts a complete
file tree into /opt/plasma - including ocaml and all required
libraries.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
The script has even built-in knowledge about certain Linux
distributions, and installs missing system packages. Nobody
should stumble over libpq-dev. It can also configure PostgreSQL
to some extent.

&#60;/p&#62;&#60;p&#62;
I hope this script is a big help, especially for non-ocamlers,
because one cannot expect them to know how ocaml
software is usually installed. And Plasma is still missing
from distros.

&#60;/p&#62;&#60;p&#62;More information about Plasma can be found on the
&#60;a href=&#34;http://projects.camlcity.org/projects/plasma.html&#34;&#62;project page&#60;/a&#62;.


&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Plasma: Map/Reduce for Ocaml</title>
          <guid>http://blog.camlcity.org/blog/plasma1.html</guid>
          <link>http://blog.camlcity.org/blog/plasma1.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;Plasma MapReduce and PlasmaFS&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
I&#38;#39;m very proud to announce the public availability of Plasma
MapReduce, a map/reduce compute framework, and PlasmaFS, the
underlying distributed filesystem. All of this is written in
Ocaml and makes it now possible to develop map/reduce programs
in a functional programming language.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
Plasma MapReduce is a distributed implementation of the map/reduce
algorithm scheme. In a sentence, map/reduce performs a parallel
List.map on an input file, sorts and splits the output by some
criterion into partitions, and runs a List.fold_left on each
partition. The difference is that it does not do this sequentially, but in a
distributed way, and chunk by chunk. Because of this, Plasma MapReduce
can process very large files, and if run on enough computers, this
will also finish in reasonable time. Of course, map and reduce are Ocaml
functions here.
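&#60;/p&#62;&#60;p&#62;
The scheme itself can be written down sequentially in a few lines. The
following is only a toy sketch: it does not use the Plasma API, all names
are hypothetical, the input is an in-memory list of lines instead of a file,
and the partitioning criterion is simply &#38;#34;one partition per key&#38;#34;.
&#60;/p&#62;&#60;pre&#62;
(* Sequential toy version of the scheme; all names are hypothetical
   and nothing here uses the Plasma API. *)

(* map: turn one input record (a line) into key/value pairs. *)
let map_line line =
  List.map (fun word -&#38;#62; (word, 1)) (String.split_on_char &#38;#39; &#38;#39; line)

(* Sort the pairs by key and split them into one partition per key. *)
let partition pairs =
  let sorted = List.sort compare pairs in
  List.fold_right
    (fun (k, v) acc -&#38;#62;
      match acc with
      | (k2, vs) :: rest when k2 = k -&#38;#62; (k, v :: vs) :: rest
      | _ -&#38;#62; (k, [v]) :: acc)
    sorted []

(* reduce: fold the values of one partition into a single result. *)
let reduce (k, vs) = (k, List.fold_left (+) 0 vs)

(* Count the words of two input lines. *)
let () =
  let input = [&#38;#34;a b a&#38;#34;; &#38;#34;b a&#38;#34;] in
  let pairs = List.concat (List.map map_line input) in
  List.iter
    (fun (k, n) -&#38;#62; Printf.printf &#38;#34;%s %d\n&#38;#34; k n)
    (List.map reduce (partition pairs))
&#60;/pre&#62;&#60;p&#62;
In this toy version everything has to fit into memory, which is exactly the
restriction that the distributed, chunk-by-chunk execution removes.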

&#60;/p&#62;&#60;p&#62;
This all works on top of a distributed filesystem, PlasmaFS. This is a
user-space filesystem that is primarily accessed over RPC (but it is
also mountable as an NFS volume). Actually, most of the effort went
here. PlasmaFS focuses on reliability and speed for big blocksizes. To
get this, it implements ACID transactions, replicates data and
metadata with two-phase commit, uses a shared memory data channel if
possible, and monitors itself. Unlike other filesystems for
map/reduce, PlasmaFS implements the complete set of usual file
operations, including random reads and writes. It can also be used as
an unspecialized global filesystem.

&#60;/p&#62;&#60;p&#62;Both pieces of software are bundled together in one download. Here is the
&#60;a href=&#34;http://projects.camlcity.org/projects/plasma.html&#34;&#62;project page&#60;/a&#62;.

&#60;/p&#62;&#60;p&#62;
This is an early alpha release. A lot of things work already, and you
can already run map/reduce jobs. However, it is in no way complete.


&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Cluster Computing at Mylife.com</title>
          <guid>http://blog.camlcity.org/blog/omeeting2010.html</guid>
          <link>http://blog.camlcity.org/blog/omeeting2010.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;Slides of the talk in Paris&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
For anybody who is interested, here are the slides of the talk I
gave in Paris yesterday.

&#60;/div&#62;

&#60;div&#62;
  
&#60;a href=&#34;/download/omeeting2010.pdf&#34;&#62;Cluster Computing at Mylife.com&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Ocamlnet-3.0test2</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_test2.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_test2.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;Second testing version&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
A second testing version of Ocamlnet 3 has been released:
&#60;a href=&#34;http://download.camlcity.org/download/ocamlnet-3.0test2.tar.gz&#34;&#62;Ocamlnet-3.0test2&#60;/a&#62;.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
This version mainly includes a large number of bug fixes compared to
the first version, but also a few additions:

&#60;/p&#62;&#60;ul&#62;
&#60;li&#62;netcamlbox: a fast ipc mechanism for sending ocaml values to
      another process. Netcamlbox is shared-memory based, and works
      well on multi-cores (see
      &#60;a href=&#34;http://projects.camlcity.org/projects/dl/ocamlnet-3.0test2/doc/html-main/Netcamlbox.html&#34;&#62;Netcamlbox.html&#60;/a&#62; for doc)
&#60;/li&#62;&#60;li&#62;netplex adds per-process sockets, so one can send messages to
individual containers, and not only to the whole service
&#60;/li&#62;&#60;li&#62;wrappers for POSIX semaphores
&#60;/li&#62;&#60;li&#62;wrappers for syslog
&#60;/li&#62;&#60;li&#62;performance optimizations (serialization, page-aligned I/O)
&#60;/li&#62;&#60;li&#62;updated documentation
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
Already in the first test version:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;Port to Win32 (as outlined in &#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_win32.html&#34;&#62;Stranger in a strange land&#60;/a&#62;)
  &#60;/li&#62;&#60;li&#62;The new &#60;code&#62;Rpc_proxy&#60;/code&#62; layer (as described in &#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_ha.html&#34;&#62;The next server, please!&#60;/a&#62;)
  &#60;/li&#62;&#60;li&#62;Extensions of Netplex (see especially &#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_mp.html&#34;&#62;Mastering Multi-processing&#60;/a&#62;)
  &#60;/li&#62;&#60;li&#62;New implementation of the Shell library for starting subprocesses
  &#60;/li&#62;&#60;li&#62;Uniform debugging with &#60;code&#62;Netlog.Debug&#60;/code&#62;
  &#60;/li&#62;&#60;li&#62;Exception printers (&#60;code&#62;Netexn&#60;/code&#62;)
  &#60;/li&#62;&#60;li&#62;Introduction of pollsets (&#60;code&#62;Netsys_pollset&#60;/code&#62;); removal of
      &#60;code&#62;Unix.select&#60;/code&#62; (i.e. more than 1024 file descriptors)
  &#60;/li&#62;&#60;li&#62;The &#60;code&#62;netcgi1&#60;/code&#62; library has been dropped in favor of
      &#60;code&#62;netcgi2&#60;/code&#62;
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
I&#38;#39;ve quickly checked that the library builds on linux, freebsd-7.2,
open solaris, and Win32 (MinGW). Nevertheless, testers are especially
encouraged to check whether Ocamlnet 3 still works on all platforms,
because a lot of new platform-specific code has been added.

&#60;/p&#62;&#60;p&#62;Download etc:

&#60;/p&#62;&#60;ul&#62;
&#60;li&#62;&#60;a href=&#34;http://projects.camlcity.org/projects/ocamlnet.html&#34;&#62;Homepage&#60;/a&#62;
&#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://download.camlcity.org/download/ocamlnet-3.0test2.tar.gz&#34;&#62;Source&#60;/a&#62;
&#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://projects.camlcity.org/projects/dl/ocamlnet-3.0test2/doc/html-main/index.html&#34;&#62;Manual&#60;/a&#62;
&#60;/li&#62;&#60;li&#62;&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/TODO&#34;&#62;My &#38;#34;scratch pad&#38;#34; describing changes, plans, etc
&#60;/a&#62;&#60;/li&#62;&#60;li&#62;&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/&#34;&#62;Subversion&#60;/a&#62;
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
There is a GODI package, but you have to enable a special repository
to get it: Add

&#60;/p&#62;&#60;blockquote&#62;
GODI_BUILD_SITES+=http://www.ocaml-programming.de/godi-build/ocamlnet3/
&#60;/blockquote&#62;

to godi.conf to see the new packages in godi_console. This only works
after the bootstrap is finished (godi_console cannot be built with
ocamlnet3 yet). Keep in mind that this is development code, and there
is no easy way to downgrade to ocamlnet2. It is best to do this only for
new GODI installations.

&#60;p&#62;Special thanks to everybody who helped me to produce this new
version - by reporting bugs, or even sending fixes, or by maintaining
subtrees (Christophe Troestler).

&#60;/p&#62;&#60;p&#62;
More blog postings will follow describing the highlights.

&#60;/p&#62;&#60;p&#62;
Please report results to gerd@gerd-stolpmann.de
&#60;/p&#62;




&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant.
&#60;a href=&#34;http://www.gerd-stolpmann.de/buero/work_ocaml_search.html.en&#34;&#62;
He is accepting new customers!&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Mastering Multi-processing</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_mp.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_mp.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;What&#38;#39;s new in Ocamlnet 3: Synchronization primitives in Netplex&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
Ocamlnet is being renovated, and there is already a first testing
version of Ocamlnet 3. The author, Gerd Stolpmann, explains in a
series of articles what is new, and why Ocamlnet is the best
networking platform ever seen. When it comes to parallelism, many
people react skeptically to multi-processing - mostly because this is
not the way of concurrent programming the &#38;#34;big players&#38;#34; prefer.
However, with the help of a good multi-processing framework like
Netplex, this style can be easily mastered, and delivers stable
results more quickly than other styles such as
multi-threading. Ocamlnet 3 adds important synchronization primitives
like semaphores.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
Netplex is a framework for creating network services. In this model,
the most important role of processes is to serve as containers for
protocol interpreters that accept and deal with network connections
(&#38;#34;worker processes&#38;#34;). In the traditional multi-processing world (like
the &#60;code&#62;inetd&#60;/code&#62; master server in Unix) this is taken to the
extent that each connection is served by its own process. Netplex,
fortunately, relaxes this 1:1 relationship - a process can handle more
than one connection, either serially or even concurrently. Netplex was
introduced in Ocamlnet 2, and got a lot of attention since then. What
remained unanswered, however, is the question of how to deal with tasks
that are not immediately network-driven, but nevertheless have to run
in parallel with the existing processes. In this article, I present
a way to control a global timer for a network service, so one can
run a certain task periodically - as an example of how to use the new
synchronization primitives.
&#60;/p&#62;

&#60;p&#62;
Before going into detail, let me clarify one important thing.
These synchronization primitives are not designed to be fast. They are
designed to be always available, and to behave well even under
abnormal operating circumstances (like crashing processes). Usually,
the operating system provides better implementations, but often these
implementations require certain arrangements (like knowing global
names, or certain process relationships). For many uses, it is more
important to have an immediately available mechanism than to have
the fastest one. Also, such a basic mechanism can be helpful for
negotiating faster ones between processes.

&#60;/p&#62;&#60;h2&#62;The example: Competing for running the timer&#60;/h2&#62;

&#60;p&#62;
In the Netplex process model one of the processes has a special role:
The master process is the first process, and the one that serves as
parent of all the other processes.  The master process is not involved
in CPU-intensive tasks, and is usually very responsive to incoming
requests. Because of this, it is the ideal instance to perform helper
tasks. Ocamlnet 3 adds the concept of plugins to the controller object
that runs in the master process. Plugins are little RPC servers that
can be attached to the main server the controller object provides.
The synchronization primitives are implemented as such plugins. A
plugin has to be added to the controller &#60;code&#62;ctrl&#60;/code&#62; before it
can be used, as in

&#60;/p&#62;&#60;pre&#62;
ctrl # add_plugin Netplex_semaphore.plugin
&#60;/pre&#62;

&#60;p&#62;
This is best done early in the program. Fortunately, a good moment for
doing so is provided as a service hook by Netplex. All these hooks
are bundled together as a hook object, and this object configures some
aspects of a Netplex network service:

&#60;/p&#62;&#60;pre&#62;
class type processor_hook =
object
  method post_add_hook : socket_service -&#38;#62; controller -&#38;#62; unit
  method post_rm_hook : socket_service  -&#38;#62; controller -&#38;#62; unit
  method pre_start_hook : socket_service -&#38;#62; controller -&#38;#62; container_id -&#38;#62; unit
  method post_start_hook : container -&#38;#62; unit
  method pre_finish_hook : container -&#38;#62; unit
  method post_finish_hook : socket_service -&#38;#62; controller -&#38;#62; container_id -&#38;#62; unit
  method receive_message :
            container -&#38;#62; string -&#38;#62; string array -&#38;#62; unit
  method receive_admin_message :
            container -&#38;#62; string -&#38;#62; string array -&#38;#62; unit
  method system_shutdown : unit -&#38;#62; unit
  method shutdown : unit -&#38;#62; unit
  method global_exception_handler : exn -&#38;#62; bool
end
&#60;/pre&#62;

&#60;p&#62;
Such a hook object can always be passed as an additional parameter to
the protocol interpreter implementation (no matter which: http, the
various CGI variants, Sun RPC, ICE as provided by Hydro). The
&#60;code&#62;add_plugin&#60;/code&#62; call is best done in the
&#60;code&#62;post_add_hook&#60;/code&#62; hook which is executed once after the
socket service is configured in the controller.  We will also use
other hooks for our timer - more radically, we will use nothing but hooks
to attach the timer to the network service. This means that we don&#38;#39;t
assume much about the service itself, and because of this the solution
is quite generic.

&#60;/p&#62;&#60;p&#62;
A semaphore is a resource counter. The counter can be atomically
increased and decreased, but can never fall below zero. An attempt to
do so either fails immediately, or the execution of the process blocks
until the counter is positive again. The idea behind that can be
illustrated by a queue supplying a number of processes with tasks. The
counter reflects the length of the queue. When a process needs another
task, it checks the counter, and tries to decrease it.  Either this
works immediately, which means that the process gets the task, and has
the right to fetch it from the queue. Or the queue is empty, and the
process has to wait until a new task is again posted to the queue.  Of
course, the latter is the interesting case, because it can happen that
several processes want to get new tasks from the empty queue.  The
semaphore then works as an arbitration mechanism, as it selects which
process will be considered first.

&#60;/p&#62;&#60;p&#62;
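The counter semantics can be modelled in a few lines of plain OCaml.
This is only a single-process sketch for illustration, not the Netplex
implementation: &#60;code&#62;try_decrement&#60;/code&#62; and &#60;code&#62;increment&#60;/code&#62; are
made-up names, and the -1 result mirrors the convention of the
non-waiting decrement used further below.

&#60;/p&#62;&#60;pre&#62;
(* Single-process model of a semaphore counter; illustration only *)
let counter = ref 1

(* Non-waiting decrement: returns the new value, or -1 if the counter
   was already 0 (the counter is then left unchanged) *)
let try_decrement () =
  if !counter = 0 then -1
  else (decr counter; !counter)

let increment () =
  incr counter; !counter

let () =
  assert (try_decrement () = 0);    (* from 1 to 0: acquired *)
  assert (try_decrement () = -1);   (* already 0: refused *)
  assert (increment () = 1);        (* released *)
  assert (try_decrement () = 0)     (* can be acquired again *)
&#60;/pre&#62;

&#60;p&#62;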
In our example, the resource is the timer, or better the possibility
of running the timer. We have a number of processes, but the timer
should only run in one of them (&#38;#34;it is owned by one process only&#38;#34;).
The counter value 1 means that the timer is not (!) running (i.e. 1 =
the timer can be started once more), and 0 means that the timer is
already running (0 = the timer cannot be started for an additional
time).

&#60;/p&#62;&#60;p&#62;
This function tries to get the ownership of the timer, and returns
true if that could be achieved:

&#60;/p&#62;&#60;pre&#62;
let sem_name = &#38;#34;some_name&#38;#34;

let own_timer = ref false

let acquire_timer() =
  (* precondition: the semaphore value can be 1 or 0 *)
  !own_timer || (
    let v =
      Netplex_semaphore.decrement sem_name in
    (* By convention, v=-1 means that the value was already 0. It is not
       further decreased, however. v=0 means that it was decreased from
       1 to 0, and we own the semaphore now.
     *)
    own_timer := (v = 0);
    !own_timer
  )
&#60;/pre&#62;

&#60;p&#62;
Note that we don&#38;#39;t pass &#60;code&#62;~wait:true&#60;/code&#62; to the
&#60;code&#62;decrement&#60;/code&#62; call. This causes the operation to return -1 when the
counter is already 0 rather than wait for it to become positive. Also,
semaphores have Netplex-wide global names. Instead of &#38;#34;some_name&#38;#34; one
should pass something more meaningful, e.g. a name derived from the
service name (&#38;#34;service.semaphore0&#38;#34;).

&#60;/p&#62;&#60;p&#62;
Although semaphores are automatically created at the time of the first
use, they then start with a counter value of 0, which is often wrong. In our
example, the initial value must be 1 - meaning that the timer is 
unacquired:

&#60;/p&#62;&#60;pre&#62;
let create_sem() =
  (* The returned flag is not needed here *)
  ignore(Netplex_semaphore.create ~protected:true sem_name 1L)
&#60;/pre&#62;

&#60;p&#62;This function creates the semaphore if it does not exist, and sets
the initial value to 1. Also, it is a so-called protected semaphore:
If the process terminates in an abnormal way, Netplex ensures that the
increment and decrement operations called by the process are reverted,
i.e. if the process happens to own the timer, and the counter is 0,
it will be automatically set back to 1, so another process has the chance
to take over the ownership. 

&#60;/p&#62;&#60;p&#62;
If all processes call &#60;code&#62;create_sem&#60;/code&#62; when they start up, the
semaphore is guaranteed to exist.

&#60;/p&#62;&#60;p&#62;
The timer itself is started like this:

&#60;/p&#62;&#60;pre&#62;
let timer_running = ref None

let start_timer() =
  timer_running :=
    Some(Netplex_cenv.create_timer
          (fun timer -&#38;#62; ...; true)
          60.0
        )
&#60;/pre&#62;

&#60;p&#62;The function body is run every 60 seconds (the result value of true
means that the timer is restarted after each timeout). Such timers are
entered into the main event loop of the Netplex process
container. This means it is not guaranteed that they are run as often
as demanded - if some regular computation takes too long and the event
loop cannot run then the timer activation will be deferred.  For many
timers, especially when running only infrequently, this way of
activation is reliable enough. Timers are automatically stopped at
shutdown time. We can, however, also enforce an earlier stop:

&#60;/p&#62;&#60;pre&#62;
let stop_timer() =
  match !timer_running with
    | None -&#38;#62; ()
    | Some timer -&#38;#62;
        Netplex_cenv.cancel_timer timer;
        timer_running := None
&#60;/pre&#62;

&#60;p&#62;
Finally, there is the case that the process terminates and has to
give up the ownership of the timer. This is easily done by incrementing
the counter. However, we have to do more in this example, because the
decrement operation does not wait: We have to notify other processes
so that they again try to acquire the now free semaphore. Netplex
has a simple built-in messaging system that we can use here:

&#60;/p&#62;&#60;pre&#62;
let release_timer() =
  if !own_timer then (
    ignore(Netplex_semaphore.increment sem_name);   (* new value not needed *)
    let cont = Netplex_cenv.self_cont() in
    cont # send_message &#38;#34;*&#38;#34; &#38;#34;release_semaphore&#38;#34; [| sem_name |]
  )
&#60;/pre&#62;

&#60;p&#62;
(The destination address of &#38;#34;*&#38;#34; means that the message is broadcast
to all receivers. This could be optimized by sending only to the
processes of the same service.)

&#60;/p&#62;&#60;p&#62;
Now, let&#38;#39;s put everything together: When the Netplex system starts up,
we add the plugin. When a process starts, we first try to create the
semaphore, and then try to acquire it. If this is successful, we start
the timer. For getting the shutdown right, we stop the timer and release
the ownership before the process exits. Also, we listen for messages,
and if the right one arrives, we check whether the semaphore is orphaned:

&#60;/p&#62;&#60;pre&#62;
class our_hooks() =
  object(self)
    inherit Netplex_kit.empty_processor_hooks()

    method post_add_hook _ ctrl =
      ctrl # add_plugin Netplex_semaphore.plugin

    method post_start_hook cont =
      create_sem();
      if acquire_timer() then
        start_timer()

    method pre_finish_hook cont =
      stop_timer();
      release_timer()

    method receive_message cont msg_name msg_args =
      if msg_name = &#38;#34;release_semaphore&#38;#34; &#38;#38;&#38;#38; msg_args = [| sem_name |] then (
        if acquire_timer() then
          start_timer()
      )
  end
&#60;/pre&#62;

&#60;p&#62;
There is still a possible improvement. As pointed out, a protected
semaphore automatically adjusts the counter value when the process
crashes. But this is only the first action of
&#60;code&#62;release_timer&#60;/code&#62;.  The second is to notify the other
processes of the vacant timer ownership. This could be done from the
master process (which is complicated), or by simply starting another
timer that continuously tries to get the semaphore:

&#60;/p&#62;&#60;pre&#62;
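    method post_start_hook cont =
      create_sem();
      if acquire_timer() then
        start_timer();
      ignore(Netplex_cenv.create_timer 
               (fun _ -&#38;#62; 
                  (* acquire_timer() is also true when we already own the
                     timer, so retry only while we do not own it *)
                  if not !own_timer &#38;#38;&#38;#38; acquire_timer() then
                    start_timer();
                  true
               )
               60.0)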

&#60;h2&#62;Other IPC primitives&#60;/h2&#62;

So far, Netplex does not only have semaphores and message broadcasting
as primitives, but also:

&#60;ul&#62;
  &#60;li&#62;Shared variables: The plugin &#60;code&#62;Netplex_sharedvar&#60;/code&#62;
      allows the processes to access variables in the master process.
      This is a very limited mechanism, though: Accesses are quite
      expensive, and the variables should not become big (because they
      would have to be copied at &#60;code&#62;fork&#60;/code&#62; time). Nevertheless,
      this is a useful mechanism to spread dynamic information in a
      Netplex process system.
  &#60;/li&#62;&#60;li&#62;Mutexes: The plugin &#60;code&#62;Netplex_mutex&#60;/code&#62; can be used to
      protect critical code sections (e.g. when accessing shared variables).
      Again, this mechanism is slow.
  &#60;/li&#62;&#60;li&#62;Accessing Unix Domain sockets: It is possible to look up the
      address of Unix Domain sockets that are served by other services
      in the same Netplex system: the &#60;code&#62;lookup&#60;/code&#62; method of
      containers. This can be used to establish a fast IPC channel
      between processes.
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;This list is going to be extended (e.g. message queues are on my
list). Also, an attempt will be made to speed up the mechanisms when Netplex finds
out that the OS has native support for these primitives (e.g. use
POSIX semaphores for implementing the &#60;code&#62;Netplex_semaphore&#60;/code&#62;
plugin if available).

&#60;/p&#62;&#60;h2&#62;Why use processes?&#60;/h2&#62;

&#60;p&#62;Some final words on why multi-processing is an interesting option
for implementing network servers. This is important to explain because
there are a number of myths about multi-processing (e.g. it is slow,
hard to master, not that flexible), and especially non-tech people see
it as a technology of the past. There are also big companies who are
interested in spreading these myths, e.g. Microsoft, because their OS
is a bad choice when it comes to multi-processing, or Sun, because
Java is heavily optimized for multi-threading and JIT is not really
compatible with multi-processing. The truth is that the big players
simply did not invest in this technology. Actually, there are a number
of advantages, and some make multi-processing superior:

&#60;/p&#62;&#60;p&#62;
First, there is the &#60;b&#62;stability&#60;/b&#62; of the resulting programs. Even if
a process crashes, the remaining processes of the system can continue
to run. Netplex is designed with crash resilience in mind (crashed
processes are restarted).

&#60;/p&#62;&#60;p&#62;
Second, multi-processing systems can more easily benefit from
&#60;b&#62;multicores&#60;/b&#62; because shared memory only plays a subordinate role
for them which leads to more effective caching of RAM. Multi-processing
systems often use uni- and bidirectional messaging instead. For 
programs written in Ocaml this is even more the case (but not for other
languages) because Ocaml does not allow multi-threaded programs to
profit from multicores at all.

&#60;/p&#62;&#60;p&#62;
Third, multi-processing systems can be more easily extended to
&#60;b&#62;clusters&#60;/b&#62; (i.e. they are running on several computers). The
messaging mechanisms can often be easily implemented as network
protocols.

&#60;/p&#62;&#60;p&#62;
Finally, many synchronization primitives are also available for the
multi-processing case (as demonstrated). More or less, it is only
shared memory which is really harder to master - e.g. Ocaml does not
provide any form of memory management for explicitly allocated shared
memory that would allow one to put Ocaml values directly into such
memory blocks for easy and quick access. (However, I&#38;#39;m working on partial solutions
for this.)


&#60;/p&#62;&#60;h2&#62;Where to get Ocamlnet 3&#60;/h2&#62;

&#60;p&#62;There is no final release yet. The current testing version
is &#60;a href=&#34;http://download.camlcity.org/download/ocamlnet-3.0test1.tar.gz&#34;&#62;Ocamlnet-3.0test1&#60;/a&#62;. Look at
&#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_test1.html&#34;&#62;this
article for a list of changes&#60;/a&#62;.

&#60;/p&#62;&#60;p&#62;Alternatively, one can also check out
&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/&#34;&#62;the Subversion repository&#60;/a&#62; (use the &#60;code&#62;svn&#60;/code&#62; command
on this URL, or click on it to view it with your web browser - most
of the discussed code lives in &#60;code&#62;src/netplex&#60;/code&#62;).
&#60;/p&#62;



&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant.
&#60;a href=&#34;http://www.gerd-stolpmann.de/buero/work_ocaml_search.html.en&#34;&#62;
He is accepting new customers!&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Ocamlnet-3.0test1</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_test1.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_test1.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;First testing version&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
A first testing version of Ocamlnet 3 has been released:
&#60;a href=&#34;http://download.camlcity.org/download/ocamlnet-3.0test1.tar.gz&#34;&#62;Ocamlnet-3.0test1&#60;/a&#62;.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
The idea of this release is to make this version available to a larger
audience for testing, and to allow everybody to check whether code
using this library still works. It is not yet ready for production
environments.

&#60;/p&#62;&#60;p&#62;
List of major changes:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;Port to Win32 (as outlined in &#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_win32.html&#34;&#62;Stranger in a strange land&#60;/a&#62;)
  &#60;/li&#62;&#60;li&#62;The new &#60;code&#62;Rpc_proxy&#60;/code&#62; layer (as described in &#60;a href=&#34;http://blog.camlcity.org/blog/ocamlnet3_ha.html&#34;&#62;The next server, please!&#60;/a&#62;)
  &#60;/li&#62;&#60;li&#62;Extensions of Netplex
  &#60;/li&#62;&#60;li&#62;New implementation of the Shell library for starting subprocesses
  &#60;/li&#62;&#60;li&#62;Uniform debugging with &#60;code&#62;Netlog.Debug&#60;/code&#62;
  &#60;/li&#62;&#60;li&#62;Exception printers (&#60;code&#62;Netexn&#60;/code&#62;)
  &#60;/li&#62;&#60;li&#62;Introduction of pollsets (&#60;code&#62;Netsys_pollset&#60;/code&#62;); removal of
      &#60;code&#62;Unix.select&#60;/code&#62; (i.e. more than 1024 file descriptors)
  &#60;/li&#62;&#60;li&#62;The &#60;code&#62;netcgi1&#60;/code&#62; library has been dropped in favor of
      &#60;code&#62;netcgi2&#60;/code&#62;
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
There are also a lot of minor changes. Some of the changes are incompatible
with code written for Ocamlnet 2.

&#60;/p&#62;&#60;p&#62;
Testers are especially encouraged to check whether Ocamlnet 3 still
works on all platforms, because a lot of new platform-specific code
has been added.  I mainly tested with Linux and the MinGW port for
Win32.

&#60;/p&#62;&#60;p&#62;
The library is not yet available via GODI. I&#38;#39;m working on this.

&#60;/p&#62;&#60;p&#62;
More blog postings will follow describing the highlights.

&#60;/p&#62;&#60;p&#62;
Please report results to gerd@gerd-stolpmann.de
&#60;/p&#62;




&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant.
&#60;a href=&#34;http://www.gerd-stolpmann.de/buero/work_ocaml_search.html.en&#34;&#62;
He is accepting new customers!&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>The next server, please!</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_ha.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_ha.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;What&#38;#39;s new in Ocamlnet 3: Highly-available RPC&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
Ocamlnet is being renovated, and there will soon be a first testing
version of Ocamlnet 3. The author, Gerd Stolpmann, explains in a
series of articles what is new, and why Ocamlnet is the best
networking platform ever seen. Experience with Hydro, another RPC
implementation for Ocaml using the ICE protocol, has shown that it is
advantageous to provide a layer that automatically manages the reaction
of the RPC system in case of machine failures. Ocamlnet 3 adds
such a layer for SunRPC clients.
 
&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
The SunRPC client implementation included in past versions of Ocamlnet
does not deal in any way with connection failures. The socket error
was simply passed through to the caller, and the RPC client had then
to be shut down. It was up to the caller how to react to such
incidents.  For example, one could indicate to the end user that the
operation cannot be carried out, or one could retry the
call. Programming practice showed that it is cumbersome to develop
anew the right reaction for every type of RPC connection used in a
larger system. The upcoming Ocamlnet 3 release tries to be better
here. On top of the existing &#60;code&#62;Rpc_client&#60;/code&#62; module the
programmer may now install an &#60;code&#62;Rpc_proxy&#60;/code&#62; module that
handles socket errors much more nicely.

&#60;/p&#62;&#60;p&#62;
However, before going into detail here, let us first step back, and
look at distributed systems that try to handle machine failures. The
assumption here is that a socket error is caused by a malfunctioning
(usually dead) machine, and that a simple repetition of the RPC call
to the same machine is not the most promising reaction. Distributed
systems that can cope with failures of one or several nodes are called
&#60;b&#62;highly available&#60;/b&#62; (HA). The development of software that has
built-in HA capabilities is known to be very difficult, so library
support for dealing with machine failures is especially useful.


&#60;/p&#62;&#60;h2&#62;Examples of fail-over scenarios&#60;/h2&#62;

&#60;img src=&#34;/files/img/blog/ocamlnet3_ha_1.gif&#34; width=&#34;389&#34; height=&#34;326&#34; align=&#34;right&#34; hspace=&#34;10&#34; vspace=&#34;10&#34;/&#62;

A frequent case is the simple fail-over from one server to an
alternate server (see picture). Ideally, the alternate server is a
copy of the primary server and operates on exactly the same data set.
It is beyond the scope of this article how this can be achieved
- possible solutions include hardware-based approaches like switching
disks from one machine to the other, or software approaches like
replicating all data modifications by means of commit protocols.
For Ocamlnet it is only important that the client needs a criterion
for when to switch to the other server, and instructions how to
switch.

&#60;p&#62;When to switch: Basically, we have the case that the client can
autonomously decide whether to contact the secondary server instead of
the primary one, and the case that the client needs external help.
For example, the client could look at socket errors, and if the number
of such errors for a certain server exceeds a maximum, the server is
declared to be dead, and the client switches autonomously to the
alternate server. This is convenient, but it assumes that the
alternate server is at any time a replacement for the primary one.
Especially when the fail-over is hardware-based this is not valid -
before the alternate server is ready a special fail-over procedure
must be run that disconnects the disks from the old server and
attaches them to the replacement machine. Also, the old server needs
to be isolated (i.e. &#38;#34;zombie machines&#38;#34; are ensured to be unreachable
from the rest of the network). There is another reason for relying on
external information about when to switch: Often, it is reasonable to
monitor the cluster machines by special monitoring daemons. These are
in a better position to collect and aggregate information about the
liveness of the cluster machines. For example, such services can
continuously ping the nodes of the cluster and also check which network
ports are actually reachable. So it is often advisable to factor out
the decision whether a node is alive or dead even if the client could
do this on its own.
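
&#60;/p&#62;&#60;p&#62;The autonomous variant can be sketched as follows. This is a
hypothetical illustration in plain OCaml, not the
&#60;code&#62;Rpc_proxy&#60;/code&#62; interface: the threshold
&#60;code&#62;max_errors&#60;/code&#62;, the table &#60;code&#62;errors&#60;/code&#62;, and the host
names are assumptions made for the example.

&#60;/p&#62;&#60;pre&#62;
(* Count socket errors per server; a server is declared dead once the
   count exceeds an assumed maximum *)
let max_errors = 3
let errors : (string, int) Hashtbl.t = Hashtbl.create 7

let error_count host =
  try Hashtbl.find errors host with Not_found -&#38;#62; 0

let record_error host =
  Hashtbl.replace errors host (error_count host + 1)

let is_dead host =
  error_count host &#38;#62; max_errors

(* Use the primary server unless it is considered dead *)
let choose ~primary ~alternate =
  if is_dead primary then alternate else primary

let () =
  assert (choose ~primary:&#38;#34;srv1&#38;#34; ~alternate:&#38;#34;srv2&#38;#34; = &#38;#34;srv1&#38;#34;);
  for _i = 1 to 4 do record_error &#38;#34;srv1&#38;#34; done;
  assert (choose ~primary:&#38;#34;srv1&#38;#34; ~alternate:&#38;#34;srv2&#38;#34; = &#38;#34;srv2&#38;#34;)
&#60;/pre&#62;

&#60;p&#62;With an external monitor, &#60;code&#62;is_dead&#60;/code&#62; would consult the
verdict of the monitor instead of the local error count.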

&#60;/p&#62;&#60;p&#62;How to switch: RPC calls that can be simply repeated without
changing the meaning are called idempotent. For example, if the client
just wants to get the value of a remote variable, the RPC call doing
so can be simply repeated as often as necessary to obtain the result.
In contrast to that an RPC procedure that modifies a remote variable
can often not be repeated (e.g. imagine an integer variable is
increased by 1). This case is far more complicated to handle, and one
often ends up with a so-called commit protocol. This means that the
modification is encapsulated as a transaction that is either
completely executed or aborted, meaning that the effects are completely
nullified. The commit protocol ensures that both the client and the
server know whether the transaction is finally done or not done. Also,
the commit protocol makes it possible to keep data sets synchronized
that are stored on several servers. For Ocamlnet this means that the
right reaction is either trivial (redirection of the failed idempotent
call to the alternate server), or so complex that generic library
support is impossible.
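
&#60;/p&#62;&#60;p&#62;For the trivial case, the redirection of an idempotent call
could look like the following sketch. The names
&#60;code&#62;Connection_failed&#60;/code&#62;, &#60;code&#62;call_idempotent&#60;/code&#62; and
&#60;code&#62;read_variable&#60;/code&#62; are invented for the example; a real client
would react to whatever error its RPC library reports.

&#60;/p&#62;&#60;pre&#62;
exception Connection_failed

(* Try the call on each server in turn until one of them answers *)
let rec call_idempotent call servers =
  match servers with
  | [] -&#38;#62; raise Connection_failed
  | srv :: rest -&#38;#62;
      (try call srv
       with Connection_failed -&#38;#62; call_idempotent call rest)

(* A fake remote read: the primary is down, the alternate answers *)
let read_variable srv =
  if srv = &#38;#34;primary&#38;#34; then raise Connection_failed else 42

let () =
  assert (call_idempotent read_variable [&#38;#34;primary&#38;#34;; &#38;#34;alternate&#38;#34;] = 42)
&#60;/pre&#62;

&#60;p&#62;The same wrapper would be wrong for a call that increases a remote
counter, because a retry after a lost reply could apply the increase
twice.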

&#60;/p&#62;&#60;h3&#62;An HA scheme for partitioning data&#60;/h3&#62;

&#60;p&#62;
&#60;img src=&#34;/files/img/blog/ocamlnet3_ha_2.gif&#34; width=&#34;446&#34; height=&#34;282&#34; align=&#34;right&#34; hspace=&#34;10&#34; vspace=&#34;10&#34;/&#62;

Often, HA is not the only reason for using a cluster of computers.
Another motivation is to increase the total capacity of the system by
spreading the load over several systems. In the picture there is an
example where the load is shared by four servers. For pure
load-balancing there is no need to check for and manage connection
failures - before doing an RPC call, the right destination machine is
simply selected, and the call is directed to it.

&#60;/p&#62;&#60;p&#62;However, load-balancing can be combined with HA. The depicted
system allows the client to read-access a dataset that is split into
four partitions - the red, blue, green, and brown partitions. Each
server has one of these partitions as its primary dataset. For
example, when the client needs to read a data item from the blue
partition it can go to server 2 because this server has all &#38;#34;blue
data&#38;#34; (shown as blue square). In addition to that, all servers also
store replicated data from other servers to provide a back-up in the
case that a node fails. An easy way to do this would be to store the
replica for server k on server k&#38;#39;=(k+1) mod N. However, this leads to
the problem that the server k&#38;#39; gets double load when server k fails,
foiling the increase of capacity by sharing load. The system in the
picture implements a better scheme. Here, the partitions are further
subdivided so that equal fractions of any partition are replicated on
all machines. For example, the blue partition is split into three
smaller parts which are then replicated on the three alternate
nodes. When server 2 fails, the client can still find all the &#38;#34;blue
data&#38;#34; on the remaining three machines, and because all remaining
machines get an equal part of the fail-over load the overall capacity
of the system shrinks only by 25%.

&#60;/p&#62;&#60;p&#62;This scheme matters for the Ocamlnet library insofar as
it is no longer always the same node that replaces a failing node.
Which server is the right alternate machine when the primary server
fails now depends on the data item.
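
&#60;/p&#62;&#60;p&#62;The placement rule can be written down as a small function. This
is a sketch under assumptions: servers and partitions are numbered from
0 (so server 2 of the picture is number 1 here), the sub-part is picked
by hashing a key, and the mapping of sub-parts to servers is one
possible choice, not necessarily the one shown in the picture.

&#60;/p&#62;&#60;pre&#62;
(* Number of servers; partition k has server k as its primary *)
let n = 4

(* Alternate server for an item of partition k. The key selects one of
   the n-1 sub-parts, and sub-part j is assumed to be replicated on
   server (k + 1 + j) mod n. *)
let alternate k key =
  let j = (Hashtbl.hash key) mod (n - 1) in
  (k + 1 + j) mod n

let () =
  (* The alternate is never the failed primary itself *)
  List.iter
    (fun key -&#38;#62; assert (alternate 1 key &#38;#60;&#38;#62; 1))
    [&#38;#34;a&#38;#34;; &#38;#34;b&#38;#34;; &#38;#34;c&#38;#34;; &#38;#34;d&#38;#34;; &#38;#34;e&#38;#34;];
  (* The three sub-parts of partition 1 go to servers 2, 3 and 0 *)
  assert (List.map (fun j -&#38;#62; (1 + 1 + j) mod n) [0; 1; 2] = [2; 3; 0])
&#60;/pre&#62;

&#60;p&#62;The result depends on both the partition and the key, which is why
a fixed alternate address per server is not enough for this scheme.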


&#60;/p&#62;&#60;h2&#62;Detecting node outages&#60;/h2&#62;

When a node goes down, we assume that any pending TCP connection to it
hangs, and also that attempts to create new connections get no
response.  After some time, the router usually detects that the node
is unavailable, and emits special ICMP packets notifying machines in
the network about the outage. However, this usually takes a few
minutes. Until then, the cluster system needs other mechanisms to
detect the unavailability of the node.

&#60;p&#62;In the literature (including this text) a machine is usually either
up or down, and there is nothing in between. In practice, however, the
problem of zombie machines really exists, and is not as rare as many
think. Often, the cause of hovering between life and death is bad
disks - disk accesses still work, but take a multiple of the usual
time to complete. Of course, such bad nodes are unusable, and should
be counted among the dead nodes, but the problem is that the network
ports are still responsive. This does not render the network
connectivity tests useless, but one should keep in mind that these
tests do not cover all reasons for an outage.

&#60;/p&#62;&#60;p&#62;The most basic mechanism for recognizing failed nodes is the timeout.
Due to the nature of timeouts it takes some time until such a test can
indicate a failure, and this limits the usefulness of timeouts.
Nevertheless, timeouts are important even when there are better
criteria because they are often the only way to interrupt already
established but now hanging TCP connections. We have to distinguish
between two kinds of timeouts: Firstly, there are timeouts on
individual send and receive operations. In many implementations of
network protocols these are the only kind of timeouts. Essentially,
these timeouts specify a lower bound on the bandwidth of the
connection, and they work independently of the length of the exchanged
messages. Secondly, one can also set an upper bound on the total time
for the RPC call (including the time for both the request and the
response). As RPC is usually run on fast LANs, and specifications of
RPC systems often require upper limits on the latency as seen by the
end user (something like &#38;#34;95% of the user requests must be answered
within 0.5 seconds&#38;#34;), this second interpretation is the more
interesting one. Actually, Ocamlnet only implements this second notion
of timeout.
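
&#60;/p&#62;&#60;p&#62;The difference between the two notions can be made concrete with
a small calculation. The numbers are invented for illustration: a reply
trickles in as ten chunks, each arriving 0.4 seconds after the previous
one. The function names are made up as well and do not belong to any
Ocamlnet module.

&#60;/p&#62;&#60;pre&#62;
(* Waiting times before each of the ten chunks, in seconds *)
let chunk_delays = [0.4; 0.4; 0.4; 0.4; 0.4; 0.4; 0.4; 0.4; 0.4; 0.4]

(* Timeout on individual operations: every single wait is checked *)
let per_operation_ok limit =
  List.for_all (fun d -&#38;#62; d &#38;#60;= limit) chunk_delays

(* Timeout on the whole call: the sum of all waits is checked *)
let whole_call_ok limit =
  List.fold_left (+.) 0.0 chunk_delays &#38;#60;= limit

let () =
  assert (per_operation_ok 1.0);        (* never triggers *)
  assert (not (whole_call_ok 0.5))      (* triggers: about 4 seconds in total *)
&#60;/pre&#62;

&#60;p&#62;A per-operation timeout of one second never fires here, although the
caller waits about four seconds for the complete reply.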

&#60;/p&#62;&#60;p&#62;The question is whether there are faster ways of detecting failed
nodes. One method is surprisingly simple: Each node running an RPC
client pings the nodes with the RPC servers the clients are connected
to. A repeatedly failed ping is considered as a node outage. Ocamlnet
3 does not (yet) implement such a component, but the new
&#60;code&#62;Rpc_proxy&#60;/code&#62; layer is prepared to take external
information about outages into account. (For an implementation look at
&#60;a href=&#34;http://oss.wink.com/hydro/hydro-0.7.1/doc/html/Hydrodoc_hydromon.html&#34;&#62;Hydromon&#60;/a&#62;. Although
written for ICE instead of SunRPC, it is straightforward to port
Hydromon to Ocamlnet&#38;#39;s SunRPC implementation.) Even on busy servers
this method is able to detect bad nodes within a few seconds.

&#60;/p&#62;&#60;p&#62;Another idea is to share information about good and bad nodes as
much as possible. If one connection runs into a timeout and the node
is considered as failed, this information should be made available
to all threads, and if possible even to all processes of the node.
This avoids further attempts to connect to the failed node, and one
can even cancel pending calls to this node.

&#60;/p&#62;&#60;h2&#62;Proxies: Configurable connection management&#60;/h2&#62;

After this lengthy foreword, let&#38;#39;s look at the actual implementation
of connection management in &#60;code&#62;Rpc_proxy&#60;/code&#62;. The name, &#38;#34;proxy&#38;#34;,
reminds us of the ideal of RPC: A remote call should be as simple and
safe to conduct as a local procedure call, and the &#38;#34;proxy&#38;#34; is the
local agent dealing with most of the complexities of remote calls,
so that they look as much as possible like local calls.

&#60;p&#62;This module is divided into three parts: 
&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;&#60;code&#62;Rpc_proxy.ReliabilityCache&#60;/code&#62; remembers which nodes
and which network ports failed in the past, and decides on the basis of
this information which nodes are considered to be down.
  &#60;/li&#62;&#60;li&#62;&#60;code&#62;Rpc_proxy.ManagedClient&#60;/code&#62; is an encapsulation of
the simpler &#60;code&#62;Rpc_client&#60;/code&#62; that is able to reconnect to a
service. Effectively, a &#60;code&#62;ManagedClient&#60;/code&#62; is a single TCP 
connection to a remote service that can be reestablished after an
error.
  &#60;/li&#62;&#60;li&#62;&#60;code&#62;Rpc_proxy.ManagedSet&#60;/code&#62; is a set of 
&#60;code&#62;ManagedClient&#60;/code&#62;s for the same RPC service that controls how
 many RPC calls are
routed over each managed client, and that also implements the logic
for skipping bad clients.
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;The generator for the language mapping, &#60;code&#62;ocamlrpcgen&#60;/code&#62;,
has also been extended. It now emits a functorized version of the
client wrappers. For example, for the RPC program &#60;code&#62;P&#60;/code&#62; the
generator would output a functor

&#60;/p&#62;&#60;pre&#62;
module Make&#38;#39;P : Rpc_client.USE_CLIENT -&#38;#62; sig ... end
&#60;/pre&#62;

where the returned module includes for every procedure &#60;code&#62;p&#60;/code&#62;
a call wrapper

&#60;pre&#62;
val p : C.t -&#38;#62; argument -&#38;#62; result
&#60;/pre&#62;

Here, &#60;code&#62;C&#60;/code&#62; is the input module of type
&#60;code&#62;Rpc_client.USE_CLIENT&#60;/code&#62;. This means one can inject any
client implementation that at least implements
&#60;code&#62;Rpc_client.USE_CLIENT&#60;/code&#62;, and is not restricted to the
default implementation &#60;code&#62;Rpc_client&#60;/code&#62;. Of course, the
enhanced client implementation &#60;code&#62;Rpc_proxy&#60;/code&#62; also provides
the &#60;code&#62;Rpc_client.USE_CLIENT&#60;/code&#62; functionality, so it can be
substituted here:

&#60;pre&#62;
module P = Make&#38;#39;P(Rpc_proxy.ManagedClient)
&#60;/pre&#62;

The user is free, however, to pass any module as input here that meets
the formal requirements - for example, one could pass further improved
or derived versions of &#60;code&#62;Rpc_proxy.ManagedClient&#60;/code&#62;. (Of
course, the output generated by the new &#60;code&#62;ocamlrpcgen&#60;/code&#62; is
compatible with what earlier versions emitted. For users it is not
required to switch to the functorized version.)
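
The injection mechanism itself can be tried out without Ocamlnet. The
following is only a minimal sketch under simplifying assumptions: the
module type &#60;code&#62;USE_CLIENT_LIKE&#60;/code&#62;, the two toy clients and the
functor &#60;code&#62;Make_P&#60;/code&#62; are invented for this illustration, and are
far simpler than &#60;code&#62;Rpc_client.USE_CLIENT&#60;/code&#62; and the code really
emitted by &#60;code&#62;ocamlrpcgen&#60;/code&#62;.

```ocaml
(* Toy stand-in for a client signature; NOT the real Rpc_client.USE_CLIENT. *)
module type USE_CLIENT_LIKE = sig
  type t
  (* call client procedure_name argument = result *)
  val call : t -> string -> string -> string
end

(* Toy stand-in for a generated functor (hypothetical, hand-written). *)
module Make_P (C : USE_CLIENT_LIKE) = struct
  (* Wrapper for a hypothetical procedure named p. *)
  let p (client : C.t) (arg : string) : string = C.call client "p" arg
end

(* A basic client: it only formats the request. *)
module Basic = struct
  type t = unit
  let call () proc arg = proc ^ "(" ^ arg ^ ")"
end

(* An enhanced client: counts the calls, then delegates to Basic. *)
module Counting = struct
  type t = int ref
  let create () : t = ref 0
  let call (c : t) proc arg = incr c; Basic.call () proc arg
end

(* Either implementation can be injected into the same functor. *)
module P_basic = Make_P (Basic)
module P_counting = Make_P (Counting)

let () =
  let c = Counting.create () in
  print_endline (P_basic.p () "x");
  print_endline (P_counting.p c "y");
  Printf.printf "calls seen by the counting client: %d\n" !c
```

The wrapper &#60;code&#62;p&#60;/code&#62; is written only once; which client
implementation it talks to is decided when the functor is applied.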

&#60;h3&#62;Comparing basic clients and managed clients&#60;/h3&#62;

Basic clients as provided by &#60;code&#62;Rpc_client&#60;/code&#62; live as long as the
underlying TCP connection exists. This means the connection is created
at the beginning of the client&#38;#39;s lifetime, and the client becomes
unusable when the connection ends (either programmatically, or because
of socket errors). In contrast to this, a managed client can establish
the connection again when needed - but only to the same host and port.
(Fail-overs to other hosts/ports are implemented in &#60;code&#62;ManagedSet&#60;/code&#62;,
see below.)

&#60;p&#62;The basic clients are configured after being created - by invoking
configuration functions like &#60;code&#62;configure&#60;/code&#62;,
&#60;code&#62;set_exception_handler&#60;/code&#62; or &#60;code&#62;set_auth_methods&#60;/code&#62;.
This has the advantage that the configuration can be changed during the
lifetime of the client. For managed clients, however, we require that the
configuration is fully given at creation time. The point here is
that the managed client internally creates a series of basic clients,
and immediately needs to know how these clients are to be configured.

&#60;/p&#62;&#60;p&#62;For example, this piece of code creates a managed client:

&#60;/p&#62;&#60;pre&#62;
let mconfig = 
  Rpc_proxy.ManagedClient.create_mclient_config
    ~programs:[ P._program ]
    ~msg_timeout: 10.0
    ()
let mclient =
  Rpc_proxy.ManagedClient.create_mclient
    mconfig
    (Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.3.2.55&#38;#34;, 9005))
    (esys : Unixqueue.event_system)
&#60;/pre&#62;

&#60;p&#62;Note that the TCP connection is not immediately created, but only
when the first remote procedure is called. This is in line with the
automatic reconnection feature: When the connection is closed or
crashes, the next procedure call causes the connection to be
established again.
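
The behaviour can be modelled in a few lines. This is only a toy model
under my own assumptions - the types &#60;code&#62;conn&#60;/code&#62; and
&#60;code&#62;managed&#60;/code&#62; and all function names are hypothetical, and
nothing here is taken from the &#60;code&#62;Rpc_proxy&#60;/code&#62; sources:

```ocaml
(* Toy model of connect-on-demand; not the Rpc_proxy implementation. *)
type conn = { id : int }              (* stands for an open TCP connection *)

type managed = {
  connects : int ref;                 (* how often a connection was opened *)
  current : conn option ref;          (* None = currently not connected *)
}

(* Creating the managed client does not connect yet. *)
let create () = { connects = ref 0; current = ref None }

(* Return the open connection, establishing it first if necessary. *)
let get_conn m =
  match !(m.current) with
  | Some c -> c
  | None ->
      incr m.connects;
      let c = { id = !(m.connects) } in
      m.current := Some c;
      c

(* Simulate that the connection is closed or crashes. *)
let drop m = m.current := None

(* A call merely reports which connection it was sent over. *)
let call m = (get_conn m).id

let () =
  let m = create () in
  let first = call m in               (* connects now *)
  drop m;
  let second = call m in              (* connects again *)
  Printf.printf "first call on connection %d, second on %d\n" first second
```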

&#60;/p&#62;&#60;p&#62;In this example, we have only configured a message timeout - the
client will signal a timeout when the response takes more than 10
seconds to arrive. Normally, the timeout does not mean that the 
connection is considered as broken. We have to explicitly configure
this feature - the proxy layer cannot know by itself what possible
reasons for timeouts are. This is just another argument of 
&#60;code&#62;create_mclient_config&#60;/code&#62;:

&#60;/p&#62;&#60;pre&#62;
Rpc_proxy.ManagedClient.create_mclient_config
   ... ~msg_timeout_is_fatal:true ... ()
&#60;/pre&#62;

&#60;p&#62;A fatal error is recorded, and may, depending on the configuration
of the &#60;code&#62;ReliabilityCache&#60;/code&#62;, disable the server host and/or
server port for some time. The &#60;code&#62;ManagedSet&#60;/code&#62; layer will
recognize the error situation, and take this information into account
when selecting good server endpoints for future RPC calls - but more
on that below. On the level of &#60;code&#62;ManagedClient&#60;/code&#62; it is only
important to define which malfunctions are considered as fatal errors
that may trigger the fail-over to other server nodes. As a
&#60;code&#62;ManagedClient&#60;/code&#62; is always only connected with one server
node it is pointless to decide what to do in case of node failures.

&#60;/p&#62;&#60;p&#62;There are two more connection management features:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;One can set an idle timeout. Unused TCP connections are closed 
      after the idle timeout period is over. The idea is to save
      server resources by closing unused connections - which often also
      means that the thread or process in the server can be released.
      Also, some network configurations do not tolerate that connections
      remain unused for longer periods of time.
  &#60;/li&#62;&#60;li&#62;One can demand to ping the service before sending the first
      RPC request. The background is that servers usually accept TCP
      connections before knowing whether there are really enough resources
      to handle them (i.e. free threads or processes). So it may happen
      that a TCP connection can be established, but it turns out to be
      immediately dead. By requesting one ping before really using the
      connection one can detect this problem.
&#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;Once the &#60;code&#62;ManagedClient&#60;/code&#62; is configured and created, it
can be used in a similar way as the older &#60;code&#62;Rpc_client&#60;/code&#62;. For
example, to call &#60;code&#62;do_something&#60;/code&#62; we would write

&#60;/p&#62;&#60;pre&#62;
let result = P.do_something mclient argument
&#60;/pre&#62;

(given that &#60;code&#62;P&#60;/code&#62; is the applied functor from above). The
asynchronous version, &#60;code&#62;P.do_something&#38;#39;async&#60;/code&#62; is also
available. 

&#60;h3&#62;The &#60;code&#62;ReliabilityCache&#60;/code&#62;&#60;/h3&#62; 

The reliability cache is the instance that decides whether a server
host or server port is available or not. The cache implements a logic
for disabling servers when RPC clients recently reported fatal errors.
Higher RPC levels such as &#60;code&#62;ManagedSet&#60;/code&#62; can then interpret
this information and fail-over to alternate servers. In the bigger
picture outlined in the first half of this article the cache plays
primarily the role of the autonomous fail-over trigger. In addition to
that, there is also a hook for plugging in external sources for
recognizing dead nodes.

&#60;p&#62;The reliability cache is notified by the managed clients whether
RPC calls lead to fatal errors or not. The cache disables the host or
port of the server when a certain number of fatal errors occur in
sequence. As we don&#38;#39;t have a better idea for how long to disable, the
cache simply implements a time-based logic where the length of the
time period the server is assumed to be down depends on the number of
fatal errors: Each subsequent fatal error doubles the duration of the
assumed unavailability until a maximum duration is reached.  (One
justification for this logic is server overload: When a server
cannot be contacted because of too much load, it is reasonable to
delay the addition of load, and to consider the delay as a function of
past slowness.)
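
As a rough illustration of this doubling rule, consider the following
sketch. The parameter names &#60;code&#62;threshold&#60;/code&#62;,
&#60;code&#62;min_timeout&#60;/code&#62; and &#60;code&#62;max_timeout&#60;/code&#62; and the numbers
are assumptions made for the example; the exact formula used by the
cache may differ in detail.

```ocaml
(* How long a server is assumed to be down after [errors] fatal errors
   in sequence.  All parameters are illustrative assumptions. *)
let disable_duration ~threshold ~min_timeout ~max_timeout errors =
  if threshold > errors then
    0.0                                   (* not disabled yet *)
  else
    (* double once for every error beyond the threshold, up to the cap *)
    let rec grow d n =
      if n = 0 || d >= max_timeout then min d max_timeout
      else grow (2.0 *. d) (n - 1)
    in
    grow min_timeout (errors - threshold)

let () =
  List.iter
    (fun e ->
      Printf.printf "%2d errors -> disabled for %.0f s\n" e
        (disable_duration ~threshold:2 ~min_timeout:1.0 ~max_timeout:64.0 e))
    [ 1; 2; 3; 5; 20 ]
```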

&#60;/p&#62;&#60;p&#62;There is usually only one reliability cache per process (the
&#38;#34;global&#38;#34; cache). This is reasonable because the information about
failures should be spread as far as possible. Nevertheless, each
&#60;code&#62;ManagedClient&#60;/code&#62; can configure its own criteria when to
disable servers. The function

&#60;/p&#62;&#60;pre&#62;
val derive_rcache : rcache -&#38;#62; rcache_config -&#38;#62; rcache
&#60;/pre&#62;

makes this possible: A new cache is created with a new configuration,
but the cache shares error data with a parent cache. For example, this
could be used to set the hook for getting external liveness
information:

&#60;pre&#62;
let rconfig =
  Rpc_proxy.ReliabilityCache.create_rcache_config
    ~availability:(fun rcache sockaddr -&#38;#62; ...)
    ()
let rc =
  Rpc_proxy.ReliabilityCache.derive_rcache
    (Rpc_proxy.ReliabilityCache.global_rcache())
    rconfig
&#60;/pre&#62;

&#60;p&#62;Right now, there is no mechanism to make the mentioned error
counters accessible across process boundaries. This may be added in
future versions of Ocamlnet.


&#60;/p&#62;&#60;h3&#62;The &#60;code&#62;ManagedSet&#60;/code&#62; of RPC clients&#60;/h3&#62;

Finally, we come to the layer that represents possible connections to
several server nodes, and that decides what to do when one of the
servers dies.

&#60;p&#62;A managed set of clients is configured by specifying an array of
remote addresses (usually corresponding to the available server
nodes). For each of the addresses the set may create a managed
client. Basically, there are two scenarios: Either the addresses are
seen as alternate server endpoints (when the first one fails try the
second etc.), or they are seen as equivalent, and the set tries to
distribute the load over all server endpoints that are alive.  This
translates into the two policies managed sets can operate under:
&#60;code&#62;`Failover&#60;/code&#62; and &#60;code&#62;`Balance_load&#60;/code&#62;.

&#60;/p&#62;&#60;p&#62;For example, this managed set specifies a fail-over policy:
When 10.3.2.55 fails, it tries to contact 10.3.3.55 instead.

&#60;/p&#62;&#60;pre&#62;
let sconfig = 
  Rpc_proxy.ManagedSet.create_mset_config
    ~policy:`Failover
    ()
let mset =
  Rpc_proxy.ManagedSet.create_mset
    sconfig
    [| Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.3.2.55&#38;#34;, 9009), 100;
       Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.3.3.55&#38;#34;, 9009), 100;
    |]
    (esys : Unixqueue.event_system)
&#60;/pre&#62;

&#60;p&#62;The number 100 is the maximum number of simultaneous connections to
this endpoint. The managed set automatically increases the number of
connections to the selected endpoint when the load grows (where the
load is the number of simultaneously submitted RPC calls - remember
this is an asynchronous RPC implementation, and it is able to handle
several RPC calls at the same time). When the maximum is reached,
further connection attempts are not rejected, but the alternate
endpoint is then used instead. (So &#60;code&#62;`Failover&#60;/code&#62; does not
prevent the alternate endpoint from being used; it only specifies the
preference to use the first endpoint as long as possible.)

&#60;/p&#62;&#60;p&#62;If we had passed &#60;code&#62;~policy:`Balance_load&#60;/code&#62; instead, the
managed set would create connections to both servers, and try to
achieve that the number of RPC calls directed to each of the servers is
roughly the same.
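
The difference between the two policies can be sketched with a toy
selection function. This is a simplified model and not the algorithm of
&#60;code&#62;ManagedSet&#60;/code&#62;: the record &#60;code&#62;endpoint&#60;/code&#62;, the function
&#60;code&#62;pick&#60;/code&#62; and the use of ordinary constructors instead of
polymorphic variants are assumptions made for the example.

```ocaml
(* Toy model of endpoint selection; not the ManagedSet implementation. *)
type endpoint = {
  alive : bool;        (* false = considered down *)
  max_load : int;      (* maximum number of simultaneous connections *)
  load : int;          (* connections currently in use *)
}

type policy = Failover | Balance_load

(* An endpoint can take more work if it is alive and below its limit. *)
let usable e = if e.alive then e.max_load > e.load else false

(* Failover: the first usable endpoint wins.
   Balance_load: the usable endpoint with the lowest load wins. *)
let pick policy (eps : endpoint array) : int option =
  let best = ref None in
  Array.iteri
    (fun i e ->
      if usable e then
        match policy, !best with
        | _, None -> best := Some i
        | Failover, Some _ -> ()
        | Balance_load, Some j ->
            if eps.(j).load > e.load then best := Some i)
    eps;
  !best

let show = function None -> "none" | Some i -> string_of_int i

let () =
  let eps = [| { alive = true; max_load = 100; load = 5 };
               { alive = true; max_load = 100; load = 2 } |] in
  Printf.printf "failover picks %s, load balancing picks %s\n"
    (show (pick Failover eps)) (show (pick Balance_load eps))
```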

&#60;/p&#62;&#60;p&#62;The user code calls &#60;code&#62;mset_pick&#60;/code&#62; to get the next managed
client according to these selection rules:

&#60;/p&#62;&#60;pre&#62;
let mclient, index = 
  Rpc_proxy.ManagedSet.mset_pick mset
&#60;/pre&#62;

There is also the optional &#60;code&#62;from&#60;/code&#62; argument, which
restricts the endpoints from which the client must be chosen. E.g.

&#60;pre&#62;
let mclient, index = 
  Rpc_proxy.ManagedSet.mset_pick ~from:[2;3] mset
&#60;/pre&#62;

would take either the endpoint at index 2 or the endpoint at index 3
from the endpoint array, and return the actually selected index.  This
is useful for fail-over scenarios where the node is a function of the
input data - like in the presented partitioning example.
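
The effect of such a restriction can be modelled as follows. The helper
&#60;code&#62;pick_from&#60;/code&#62; is hypothetical, and it simply takes the first
listed index whose endpoint is alive; whether the real
&#60;code&#62;mset_pick&#60;/code&#62; honours the order of the list in the same way is
an assumption of this sketch.

```ocaml
(* Restricted pick over a toy liveness table: only the indexes listed in
   [from] are candidates, and the first alive one is returned. *)
let pick_from ~from (alive : bool array) : int option =
  List.find_opt (fun i -> alive.(i)) from

let () =
  let alive = [| true; true; false; true |] in
  match pick_from ~from:[ 2; 3 ] alive with
  | Some i -> Printf.printf "picked endpoint %d\n" i
  | None -> print_endline "no candidate is alive"
```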

&#60;p&#62;Managed sets do not repeat failed RPC calls by themselves. They
only record failures in the &#60;code&#62;rcache&#60;/code&#62;, and if enough
failures happened, the problematic server is no longer considered as
alive when submitting new calls. For idempotent calls, however, there
is some support for automatic repetition. But first let&#38;#39;s have a
closer look at how the clever partitioning scheme from the above
picture could be implemented.


&#60;/p&#62;&#60;h3&#62;Sketch of how to implement partitioned data sets&#60;/h3&#62;

In this example, we had four servers:

&#60;pre&#62;
let server1 = 
  Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.0.0.1&#38;#34;, 7654)
let server2 = 
  Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.0.0.2&#38;#34;, 7654)
let server3 = 
  Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.0.0.3&#38;#34;, 7654)
let server4 = 
  Rpc_client.Internet(Unix.inet_addr_of_string &#38;#34;10.0.0.4&#38;#34;, 7654)
&#60;/pre&#62;

In order to access the data item with key &#60;code&#62;k : key&#60;/code&#62; we define
two hash functions, one for the primary partitioning scheme, and
one for the secondary scheme:

&#60;pre&#62;
val h_primary   : key -&#38;#62; int      (* actually: key -&#38;#62; {0..3} *)
val h_secondary : key -&#38;#62; int      (* actually: key -&#38;#62; {0..3} *)
&#60;/pre&#62;

For example, one could define these as:

&#60;pre&#62;
let h_primary (k : string) =
  (Char.code (Digest.string k).[0]) land 3

let h_secondary (k : string) =
  let h1 = h_primary k in
  let rec attempt j =
    let k&#38;#39; = k ^ string_of_int j in
    let h2 = (Char.code (Digest.string k&#38;#39;).[0]) land 3 in
    if h2 = h1 then (
      if j = 4 then
        (h1+1) mod 4     (* ensure that the loop terminates *)
      else
        attempt (j+1)
    ) else 
        h2
  in
  attempt 0
&#60;/pre&#62;
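
The important property of these two functions is that both results lie
in the range 0..3 and that they never name the same node. A quick
sanity check could look like this; the two definitions are repeated so
that the snippet is self-contained, and the test keys are made up:

```ocaml
let h_primary (k : string) =
  (Char.code (Digest.string k).[0]) land 3

let h_secondary (k : string) =
  let h1 = h_primary k in
  let rec attempt j =
    let k' = k ^ string_of_int j in
    let h2 = (Char.code (Digest.string k').[0]) land 3 in
    if h2 = h1 then (
      if j = 4 then (h1 + 1) mod 4 else attempt (j + 1)
    ) else h2
  in
  attempt 0

let in_range n = List.mem n [ 0; 1; 2; 3 ]

(* Both nodes must be valid indexes, and they must differ. *)
let check k =
  let n1 = h_primary k in
  let n2 = h_secondary k in
  List.for_all (fun b -> b) [ in_range n1; in_range n2; not (n1 = n2) ]

let () =
  let keys = List.init 1000 (fun i -> "key" ^ string_of_int i) in
  assert (List.for_all check keys);
  print_endline "every key maps to two distinct nodes"
```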

Now, we create our managed set:

&#60;pre&#62;
let sconfig = 
  Rpc_proxy.ManagedSet.create_mset_config
    ~policy:`Failover
    ()
let mset =
  Rpc_proxy.ManagedSet.create_mset
    sconfig
    [| server1, 100;
       server2, 100;
       server3, 100;
       server4, 100;
    |]
    (esys : Unixqueue.event_system)
&#60;/pre&#62;

When trying to get the record for key &#60;code&#62;k&#60;/code&#62;, we first compute
the two nodes that store this record:

&#60;pre&#62;
let n1 = h_primary k
let n2 = h_secondary k
&#60;/pre&#62;

Finally, we pick a managed client from the managed set. We have to restrict
the set of available servers to those where the partitioning functions say
they store the record. So we get

&#60;pre&#62;
let mclient, index =
  Rpc_proxy.ManagedSet.mset_pick ~from:[n1;n2] mset
&#60;/pre&#62;

&#60;p&#62;The &#60;code&#62;mclient&#60;/code&#62; can then be used for calling a remote
procedure. Remember that managed clients/sets do not automatically
repeat failed calls, so we still can get socket errors etc. The error
is only recorded, and when the server looks too faulty, it is finally
disabled, and the managed set will no longer pick it. Such a call
could look like:

&#60;/p&#62;&#60;pre&#62;
let result = P.get_file mclient file_id
&#60;/pre&#62;

&#60;p&#62;Idempotent calls can be repeated without changing the semantics.
For this reason, managed sets have a little helper to do so:

&#60;/p&#62;&#60;pre&#62;
let result =
  Rpc_proxy.ManagedSet.idempotent_sync_call
    ~from:[n1;n2]
    mset
    P.get_file&#38;#39;async
    file_id
&#60;/pre&#62;

&#60;p&#62;This would repeat the &#60;code&#62;get_file&#60;/code&#62; call in case of
errors (up to a configurable maximum). Note that we have to pass
the asynchronous version of &#60;code&#62;get_file&#60;/code&#62; as argument although
the resulting call is synchronous.
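
The repetition logic alone is easy to picture. The following sketch
only models the retry loop; it ignores the selection of a client from
the set, and the names &#60;code&#62;retry_idempotent&#60;/code&#62; and
&#60;code&#62;make_flaky&#60;/code&#62; are made up for the illustration.

```ocaml
(* Run [f] up to [max_attempts] times and return the first success.
   This is only safe when repeating [f] does not change the outcome. *)
let rec retry_idempotent ~max_attempts f =
  match f () with
  | v -> v
  | exception e ->
      if max_attempts > 1 then
        retry_idempotent ~max_attempts:(max_attempts - 1) f
      else
        raise e

(* A simulated remote call that fails the first [failures] times. *)
let make_flaky failures =
  let left = ref failures in
  fun () ->
    if !left > 0 then (decr left; failwith "simulated socket error")
    else "file contents"

let () =
  print_endline (retry_idempotent ~max_attempts:3 (make_flaky 2))
```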

&#60;/p&#62;&#60;p&#62;Summarized, the proxy module achieves that the actual fail-over
logic is implemented within &#60;code&#62;ManagedSet&#60;/code&#62; and the layers
below it. The user code can rather focus, as shown, on describing
which server to choose, and does not need to think in terms of actions
(&#38;#34;if this fails then try that&#38;#34;).

&#60;/p&#62;&#60;h2&#62;Where to get Ocamlnet 3&#60;/h2&#62;

&#60;p&#62;There is no official release yet, not even an alpha release for
developers. In order to get it, one has to check out
&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/&#34;&#62;the Subversion repository&#60;/a&#62; (use the &#60;code&#62;svn&#60;/code&#62; command
on this URL, or click on it to view it with your web browser - most
of the discussed code lives in &#60;code&#62;src/rpc&#60;/code&#62;).
&#60;/p&#62;


&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant.
&#60;a href=&#34;http://www.gerd-stolpmann.de/buero/work_ocaml_search.html.en&#34;&#62;
He is accepting new customers!&#60;/a&#62;


&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Stranger in a strange land</title>
          <guid>http://blog.camlcity.org/blog/ocamlnet3_win32.html</guid>
          <link>http://blog.camlcity.org/blog/ocamlnet3_win32.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;What&#38;#39;s new in Ocamlnet 3: The Win32 port&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
Ocamlnet is being renovated, and there will soon be a first testing
version of Ocamlnet 3. The author, Gerd Stolpmann, explains in a
series of articles what is new, and why Ocamlnet is the best
networking platform ever seen. One of the fundamental and intriguing
improvements is the port to Win32 - after all, Win32 isn&#38;#39;t well known
as a good platform for asynchronous programming, but Ocamlnet is
based on this programming style. A number of hard problems had to be
tackled.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
In the POSIX world (let me use &#38;#34;POSIX&#38;#34; as general term for
Unix/Linux/BSD) the &#60;code&#62;select()&#60;/code&#62; system call is known as the
linchpin for dispatching file descriptor events. Generally, a program
using &#60;code&#62;select()&#60;/code&#62; looks like

&#60;/p&#62;&#60;pre&#62;
(* Event loop: *)
while &#38;#60;something to do&#38;#62; do
  &#38;#60;find out interesting descriptors&#38;#62;
  select();
  &#38;#60;interpret events&#38;#62;
done
&#60;/pre&#62;

and the crucial point is that all kinds of descriptors can be passed
as input to &#60;code&#62;select()&#60;/code&#62;. This makes it possible to wait
simultaneously for very different events, like socket events, pipe
events, or events for devices. There is no such universal
&#60;code&#62;select()&#60;/code&#62; in Win32. The &#60;code&#62;select()&#60;/code&#62; call
provided by Win32 only works for sockets. There are other kinds of
mechanisms for event handling, but they are less systematic,
and one faces the problem that different ways of watching for events
need to be integrated into the single event loop.

&#60;p&#62;
Since version 3.11, Ocaml includes a &#38;#34;fancy&#38;#34; version of
&#60;code&#62;select()&#60;/code&#62; in its standard library, which actually tries to
emulate the POSIX semantics by combining several Win32 event-handling
approaches in a quite tricky way. The reader might ask what the whole
point for Ocamlnet is then - one could simply have relied on this
emulation. However, there are a number of drawbacks. First, some of
the used emulation techniques are incredibly simplistic. For example,
if one of the descriptors references the output side of a pipe, the
implementation falls back to a form of busy waiting (wasting CPU
time). The input side of a pipe always signals that the pipe has space
for new data (which may block the program). There is no special
support for named pipes, although Win32 supports them much better than
anonymous pipes.  The second problem is that the emulation is limited
to 64 descriptors only - far too few for servers [please see update
below]. For all these reasons, Ocamlnet does not make use of this
&#60;code&#62;select()&#60;/code&#62; emulation, but follows its own, more ambitious
approach.  Actually, the basic problem of this emulation is that
&#60;code&#62;select()&#60;/code&#62; is not the right level of abstraction for
combining multiple ways of waiting for events into a single operation.

&#60;/p&#62;&#60;p&#62;
&#60;b&#62;Update:&#60;/b&#62; I&#38;#39;ve got a message from Sylvain Le Gall, the
implementor of this emulation code. He explained that actually 4096
handles can be waited for (which is true, sorry for my mistake). Also,
he points out that he does not know how to handle anonymous pipes
better, and that even Cygwin uses the same technique as his code. The
reason for not treating named pipes specially is that the Ocaml
standard library does not support them anyway. - I think his
implementation decisions are perfectly rational for a general-purpose
and drop-in &#60;code&#62;select()&#60;/code&#62; emulation, and the standard library
is now way better than before his contribution. However, Ocamlnet has
special needs (like using named pipes as substitute for Unix Domain
sockets), and by switching to pollsets (see below) as the API for
handling events the system resources can be better managed.

&#60;/p&#62;&#60;h2&#62;Overlapped I/O&#60;/h2&#62;

If you asked Microsoft whether Windows supported asynchronous
programming well, they would point you proudly to overlapped
I/O. Actually, overlapped I/O is a form of asynchronous I/O, but it is
nevertheless very different from &#60;code&#62;select()&#60;/code&#62;-style I/O. It
works a lot like a non-blocking TCP connect: One has to start the I/O
operation, and it is signaled to the caller when the operation is
completed. The signal can be a callback (APC), or it can be an event
variable that is set to signaled state. The difference to
&#60;code&#62;select()&#60;/code&#62; is that the latter indicates &#60;b&#62;in advance&#60;/b&#62;
whether an I/O operation would be possible (and non-blocking) before
the operation is started whereas overlapped I/O requires that the
operation is actually initiated, and one can only wait asynchronously
for its completion. (Look &#60;a href=&#34;http://msdn.microsoft.com/en-us/library/aa365683(VS.85).aspx&#34;&#62;
here&#60;/a&#62; for how Microsoft explains overlapped I/O.)

&#60;p&#62;This article is not about which style is better, but how to port
programs using Ocamlnet that assume &#60;code&#62;select()&#60;/code&#62; loops as
their basic construction principle. So the question here is: Can we
get some kind of emulation of &#60;code&#62;select()&#60;/code&#62; for overlapped
I/O?

&#60;/p&#62;&#60;p&#62;Before answering, let us look at what we would need overlapped
I/O for. Essentially, one can use it for files, sockets, and a few IPC
mechanisms. Ocamlnet does not have support for reading or writing
files asynchronously anyway, and for sockets it is easy to use the
Win32 call &#60;code&#62;WSAWaitForMultipleEvents()&#60;/code&#62; in order to combine
waiting for sockets with other events. Actually, Ocamlnet is mostly
interested in overlapped I/O for named pipes, because named pipes are
a good replacement for the otherwise missing Unix Domain sockets and
socketpairs. (Note that Win32 named pipes are a different IPC
mechanism than POSIX named pipes, and support connection multiplexing
like TCP sockets.)

&#60;/p&#62;&#60;p&#62;The Win32 functions &#60;code&#62;ReadFile()&#60;/code&#62; and
&#60;code&#62;WriteFile()&#60;/code&#62; can be used to start an overlapped I/O
request by passing an &#60;code&#62;OVERLAPPED&#60;/code&#62; struct to them (for a
non-overlapped request the argument would remain NULL).  This causes
the function to return immediately, and the completion of the
operation is signaled to the caller by an IPC mechanism. In Ocamlnet,
Win32 events are used for that purpose. A Win32 event is a
synchronization primitive that works similarly to a condition variable:
It can be in unsignaled or in signaled state, and it is possible to
suspend the program until it enters signaled state. Win32 events have
the big advantage that they provide the required level of genericity:
Many Win32 objects are actually either subtypes of Win32 events, and
implement the event interface directly, or they can be connected to
Win32 events. For example, a process handle is also an event, and it
is signaled when the referenced process is terminated - so waiting for
the process handle as event means to wait until the process is
finished. It is also possible to wait until one of several events
enters signaled state (&#60;code&#62;WaitForMultipleObjects()&#60;/code&#62;).  When
an event is put into the &#60;code&#62;OVERLAPPED&#60;/code&#62; struct, Windows
notifies the user about the completion of the operation by signalling
the event.

&#60;/p&#62;&#60;p&#62;Collecting events and waiting until one of them is signaled sounds
already a lot like &#60;code&#62;select()&#60;/code&#62;. Still, the problem remains
that one has to start the operation before one can wait for it. There
is no way around it on the Win32 level. The only chance for porting
Ocamlnet was to change the level of abstraction the user code
sees. Actually, it was possible to provide an emulation, but the price
is that the user code must no longer invoke the generic read/write
operations, but special wrappers that do the required impedance
transformation. So &#60;code&#62;Unix.read&#60;/code&#62; and &#60;code&#62;Unix.write&#60;/code&#62;
are forbidden when dealing with named pipes, and instead the special
wrapper functions &#60;code&#62;Netsys_win32.pipe_read&#60;/code&#62; and
&#60;code&#62;Netsys_win32.pipe_write&#60;/code&#62; have to be called. Ocamlnet
provides buffers for input and output, so that &#60;code&#62;pipe_read&#60;/code&#62;
only reads from the input buffer (and raises &#60;code&#62;EWOULDBLOCK&#60;/code&#62;
if the buffer is empty), and that &#60;code&#62;pipe_write&#60;/code&#62; only writes
to the output buffer (and raises &#60;code&#62;EWOULDBLOCK&#60;/code&#62; if the
buffer is full). In addition to that, Ocamlnet arranges for
overlapped I/O operations to be started in the background when data
needs to be pumped from the named pipe to the input buffer, or from
the output buffer to the named pipe. This way, the overlapped
operation is hidden from the user, and Ocamlnet can provide a view so
that an event signals when a &#60;code&#62;pipe_read&#60;/code&#62; or a
&#60;code&#62;pipe_write&#60;/code&#62; can actually process data.
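
The read side of this buffering idea can be modelled in plain OCaml.
This is only a toy model and not the &#60;code&#62;Netsys_win32&#60;/code&#62;
code: the exception &#60;code&#62;Would_block&#60;/code&#62; stands in for the
&#60;code&#62;EWOULDBLOCK&#60;/code&#62; error, the background operation is
replaced by a function &#60;code&#62;pump&#60;/code&#62; that is called by hand, and
all names are made up for the example.

```ocaml
exception Would_block          (* stand-in for the EWOULDBLOCK error *)

type pipe = {
  input : Buffer.t;            (* filled in the background *)
  eof : bool ref;              (* set when the peer has closed the pipe *)
}

let create () = { input = Buffer.create 64; eof = ref false }

(* What a completed background read would do: append data to the buffer. *)
let pump p data = Buffer.add_string p.input data

(* The read event is signaled exactly when [read] would not block. *)
let rd_event_signaled p = Buffer.length p.input > 0 || !(p.eof)

(* Read up to [len] bytes from the buffer only; the OS is never touched.
   The empty string means end of file. *)
let read p len =
  let avail = Buffer.length p.input in
  if avail = 0 then
    (if !(p.eof) then "" else raise Would_block)
  else begin
    let n = min len avail in
    let s = Buffer.sub p.input 0 n in
    let rest = Buffer.sub p.input n (avail - n) in
    Buffer.clear p.input;
    Buffer.add_string p.input rest;
    s
  end

let () =
  let p = create () in
  (match read p 10 with
   | _ -> ()
   | exception Would_block -> print_endline "nothing buffered yet");
  pump p "hello world";
  print_endline (read p 5)
```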

&#60;/p&#62;&#60;p&#62;Here is a small subset of the named pipe API provided by Ocamlnet
in the module &#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/src/netsys/netsys_win32.mli&#34;&#62;&#60;code&#62;Netsys_win32&#60;/code&#62;&#60;/a&#62;:

&#60;/p&#62;&#60;pre&#62;
type w32_event   (* Win32 event objects *)
type w32_pipe    (* A pipe endpoint *)
type pipe_mode = Pipe_in | Pipe_out | Pipe_duplex

val pipe_pair : pipe_mode -&#38;#62; w32_pipe * w32_pipe    
  (* like socketpair *)

val pipe_read : w32_pipe -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
  (* like Unix.read *)

val pipe_write : w32_pipe -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
  (* like Unix.write *)

val pipe_shutdown : w32_pipe -&#38;#62; unit
  (* like Unix.shutdown *)

val pipe_rd_event : w32_pipe -&#38;#62; w32_event
val pipe_wr_event : w32_pipe -&#38;#62; w32_event
  (* get the events notifying about read/write possibility *)

val wsa_wait_for_multiple_events : 
      w32_event array -&#38;#62; int -&#38;#62; int option
  (* wait for a number of events, or until a timer times out *)
&#60;/pre&#62;

&#60;p&#62;For instance, this code reads from two named pipes p1 and p2
simultaneously, and outputs the data to stdout:

&#60;/p&#62;&#60;pre&#62;
let s = String.create 1024

let try_read p =
  try 
    let n = Netsys_win32.pipe_read p s 0 1024 in
    if n=0 then raise Exit;   (* deal somehow with eof *)
    print_string (String.sub s 0 n)
  with Unix.Unix_error(Unix.EWOULDBLOCK,_,_) -&#38;#62; ()

let loop() =
  try
    let e1 = Netsys_win32.pipe_rd_event p1 in
    let e2 = Netsys_win32.pipe_rd_event p2 in
    while true do
      match Netsys_win32.wsa_wait_for_multiple_events [| e1; e2 |] (-1) with
        | None -&#38;#62; ()
        | Some _ -&#38;#62;
            try_read p1;  (* always try both for simplicity of the example *)
            try_read p2
    done
  with
    | Exit -&#38;#62; ()
&#60;/pre&#62;

&#60;p&#62;Note that p1 and p2 have type &#60;code&#62;Netsys_win32.w32_pipe&#60;/code&#62;,
and &#60;b&#62;not&#60;/b&#62; &#60;code&#62;Unix.file_descr&#60;/code&#62;.

&#60;/p&#62;&#60;p&#62;This example has the shape of the &#60;code&#62;select()&#60;/code&#62; loop
outlined at the beginning of the article. There are still differences,
though, to the POSIX way of doing it: The file handle provided by the
OS is hidden by the &#60;code&#62;Netsys_win32&#60;/code&#62; layer, and cannot be
directly used by the program (because this could break the
abstraction). Also, one first has to create event objects (here by
calling &#60;code&#62;pipe_rd_event&#60;/code&#62;) in order to set up waiting. Last
but not least the emulation itself is also not free of subtle
artefacts introduced by the &#60;code&#62;Netsys_win32&#60;/code&#62; layer.  In
particular, there is no way of cancelling the overlapped I/O
operations performed under the hood of the emulation (one can only
close/disconnect the pipe to stop them). This can be an issue when the
file descriptor is passed on to other processes.  (N.B. Windows Vista
promises to solve the cancellation issue, but I have not yet had a chance
to test it.)

&#60;/p&#62;&#60;p&#62;Of course, this approach only works when the watched Win32 file
object implements overlapped I/O. If not, one can only read and write
synchronously, and Ocamlnet provides special helper threads for
dealing with this issue. This is discussed in more detail below.
First, let&#38;#39;s look at how to generalize &#60;code&#62;select()&#60;/code&#62; so it can also
be backed by the Win32 call &#60;code&#62;WSAWaitForMultipleEvents()&#60;/code&#62;.


&#60;/p&#62;&#60;h2&#62;The pollset class type&#60;/h2&#62;

The &#60;code&#62;select()&#60;/code&#62; call is reflected by the Ocaml standard
library as a function

&#60;pre&#62;
val Unix.select :
  file_descr list -&#38;#62; file_descr list -&#38;#62; file_descr list -&#38;#62; float -&#38;#62;
    file_descr list * file_descr list * file_descr list
&#60;/pre&#62;

&#60;p&#62;
This interface has a few disadvantages. First, in every round of
waiting one has to pass all descriptors to &#60;code&#62;select()&#60;/code&#62;.
This is time-consuming, and the reason for the bad reputation of
&#60;code&#62;select()&#60;/code&#62; with regard to performance (although in reality
it is not as bad as some bloggers claim). Second, there is no way to
cancel an already started &#60;code&#62;select()&#60;/code&#62; from a different
thread. This is important for multi-threaded programs, because a
second thread may want to change the list of descriptors the first
thread is watching.

&#60;/p&#62;&#60;p&#62;
Ocamlnet now uses a different interface for polling descriptors,
so-called 
&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/src/netsys/netsys_pollset.mli&#34;&#62;pollsets&#60;/a&#62;:

&#60;/p&#62;&#60;pre&#62;
class type pollset =
object
  method find : Unix.file_descr -&#38;#62; poll_req_events
  method add : Unix.file_descr -&#38;#62; poll_req_events -&#38;#62; unit
  method remove : Unix.file_descr -&#38;#62; unit
  method wait : float -&#38;#62; 
                ( Unix.file_descr * 
                  poll_req_events * 
                  poll_act_events ) list
  method dispose : unit -&#38;#62; unit
  method cancel_wait : bool -&#38;#62; unit
end
&#60;/pre&#62;

&#60;p&#62;
These sets are used in this way: Descriptors may be added to and removed
from the set, and for each descriptor one can specify which events to
watch for (reading or writing). When the set is ready, the user can
invoke &#60;code&#62;wait&#60;/code&#62; to start waiting for the specified events.
The function returns the events that are actually signalled by the
OS. It is possible to cancel waiting at any time by calling
&#60;code&#62;cancel_wait true&#60;/code&#62;.

&#60;/p&#62;&#60;p&#62;
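The usage pattern just described can be sketched with a toy pollset built on
&#60;code&#62;Unix.select&#60;/code&#62;. This is only an illustration under
simplifying assumptions, not Ocamlnet code: the record type
&#60;code&#62;events&#60;/code&#62; and the class &#60;code&#62;select_pollset&#60;/code&#62; are
hypothetical stand-ins for the real event types and implementations,
and &#60;code&#62;cancel_wait&#60;/code&#62; and &#60;code&#62;dispose&#60;/code&#62; are left out.

&#60;/p&#62;&#60;pre&#62;
(* Toy pollset on top of Unix.select (POSIX only, needs the unix library).
   All names are hypothetical and only mirror the class type above. *)
type events = { rd : bool; wr : bool }

class select_pollset =
  object
    val table : (Unix.file_descr, events) Hashtbl.t = Hashtbl.create 16

    method find fd = Hashtbl.find table fd
    method add fd ev = Hashtbl.replace table fd ev
    method remove fd = Hashtbl.remove table fd

    (* Wait up to tmo seconds; return (descriptor, requested, actual). *)
    method wait tmo =
      let pick f =
        Hashtbl.fold (fun fd ev l -&#38;#62; if f ev then fd :: l else l) table [] in
      let rd_ok, wr_ok, _ =
        Unix.select (pick (fun ev -&#38;#62; ev.rd)) (pick (fun ev -&#38;#62; ev.wr)) [] tmo in
      Hashtbl.fold
        (fun fd req l -&#38;#62;
           let act = { rd = List.mem fd rd_ok; wr = List.mem fd wr_ok } in
           if act.rd || act.wr then (fd, req, act) :: l else l)
        table []
  end

let () =
  let r, w = Unix.pipe () in
  let ps = new select_pollset in
  ps#add r { rd = true; wr = false };          (* watch r for reading *)
  ignore (Unix.write_substring w &#38;#34;x&#38;#34; 0 1);
  (match ps#wait 1.0 with
   | [ (fd, _, act) ] -&#38;#62; assert (fd = r); assert act.rd
   | _ -&#38;#62; assert false);
  ps#remove r;
  Unix.close r; Unix.close w
&#60;/pre&#62;

&#60;p&#62;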
I had not only the Win32 port in mind when designing the pollsets, but
also POSIX-type OS. For example on Linux there is the epoll API that
operates on a similar data structure, and that can easily back a
pollset implementation.

&#60;/p&#62;&#60;p&#62;
The Win32 implementation of pollsets is done in two layers. The basic
class &#60;code&#62;Netsys_pollset_win32.pollset&#60;/code&#62; already supports all
kinds of descriptors Ocamlnet needs, but is restricted to watching at
most 64 Win32 event objects (corresponding to 63 sockets, or 31 named
pipes). This restriction is lifted by
&#60;code&#62;Netsys_pollset_win32.threaded_pollset&#60;/code&#62;. However, the
latter class requires that the program is multi-threaded.

&#60;/p&#62;&#60;p&#62;
Essentially, the implementation works by inspecting the descriptors to
be watched, and by looking up the required helper objects (like
calling &#60;code&#62;Netsys_win32.pipe_rd_event&#60;/code&#62; to get the Win32 event
object reflecting the read status of a pipe). After that,
&#60;code&#62;WSAWaitForMultipleEvents()&#60;/code&#62; is invoked to start waiting,
and when events happen, they are mapped back from the signaled event
objects to the connected file descriptors. The
&#60;code&#62;cancel_wait&#60;/code&#62; feature is supported by always adding an
additional Win32 event object to the set of watched events which is
set to signaled state when &#60;code&#62;cancel_wait&#60;/code&#62; is called.

&#60;/p&#62;&#60;p&#62;
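The cancellation idea can be illustrated with a POSIX analogue: a pipe
plays the role of the additional event object, and writing a byte to it
corresponds to setting the event to signaled state. The following is a
hedged sketch and not the Win32 implementation; the names
&#60;code&#62;canceller&#60;/code&#62;, &#60;code&#62;create_canceller&#60;/code&#62; and the
simplified &#60;code&#62;wait&#60;/code&#62; are made up for this example.

&#60;/p&#62;&#60;pre&#62;
(* POSIX analogue of the extra event object: a pipe that is always watched.
   All names are hypothetical; needs the unix library. *)
type canceller = { c_rd : Unix.file_descr; c_wr : Unix.file_descr }

let create_canceller () =
  let c_rd, c_wr = Unix.pipe () in
  { c_rd; c_wr }

(* May be called from another thread: makes the watched pipe readable. *)
let cancel_wait c = ignore (Unix.write_substring c.c_wr &#38;#34;!&#38;#34; 0 1)

(* Returns None if the wait was cancelled, else the readable descriptors. *)
let wait c fds tmo =
  let ready, _, _ = Unix.select (c.c_rd :: fds) [] [] tmo in
  if List.mem c.c_rd ready then begin
    ignore (Unix.read c.c_rd (Bytes.create 1) 0 1);  (* reset the signal *)
    None
  end else Some ready

let () =
  let c = create_canceller () in
  let r, w = Unix.pipe () in
  assert (wait c [ r ] 0.0 = Some []);   (* nothing readable, no cancel *)
  cancel_wait c;
  assert (wait c [ r ] 1.0 = None);      (* returns at once: cancelled *)
  List.iter Unix.close [ r; w; c.c_rd; c.c_wr ]
&#60;/pre&#62;

&#60;p&#62;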
Of course, this is only a rough sketch of the algorithm. Which helper
objects are actually needed, and how they affect the central
&#60;code&#62;WSAWaitForMultipleEvents()&#60;/code&#62; call, is quite complicated.
This depends very much on the type of the descriptors put into
the pollset, and it would go too far to fully present these details in
this article.

&#60;/p&#62;&#60;p&#62;
However, one thing should not remain &#38;#34;magic&#38;#34; to the reader: In the
above paragraphs, I pointed out that the representation of Win32
objects like named pipes is complex (e.g. it includes buffers,
&#60;code&#62;OVERLAPPED&#60;/code&#62; structs, and Win32 event objects), and that an
opaque type like &#60;code&#62;Netsys_win32.w32_pipe&#60;/code&#62; needs to hide the
details of the representation from the user. Also, I mentioned that
using the &#60;code&#62;Unix.file_descr&#60;/code&#62; of the named pipe handle would
break the abstraction, and that the handle is made unavailable to user
code for this reason. However, pollsets nevertheless use file
descriptors for passing system objects around. How does this fit
together?

&#60;/p&#62;&#60;p&#62;
Ocamlnet does not give up on &#60;code&#62;Unix.file_descr&#60;/code&#62; as the
central type for referencing system objects - switching to a different
type for this purpose would break tons of user code. Instead, a tricky
mechanism has been added allowing us to keep
&#60;code&#62;Unix.file_descr&#60;/code&#62; but also to attach further management
objects to such descriptors. This is explained in detail below.  The
crucial idea is that Ocamlnet introduces artificial descriptors that
are only used for identifying system objects but that cannot be used
for actually performing I/O. So the descriptor handed out to user
code for a named pipe is not the Win32 handle for the named pipe
(which would allow doing I/O and breaking abstractions), but
an additionally allocated handle that only exists for the purpose
of identifying the system object. This handle, from now on called the proxy
descriptor, is the value passed to pollsets and other interfaces
assuming &#60;code&#62;Unix.file_descr&#60;/code&#62; as the type for referencing
system objects.


&#60;/p&#62;&#60;h2&#62;Handling I/O with helper threads&#60;/h2&#62;

Before looking at proxy descriptors, I should briefly present how
Ocamlnet deals with file handles other than named pipes.

&#60;p&#62;
For &#60;b&#62;sockets&#60;/b&#62; everything is very easy. As mentioned, the pollset
implementation is based upon &#60;code&#62;WSAWaitForMultipleEvents()&#60;/code&#62;
which is actually a Winsock function. It supports sockets directly -
no tricky emulation layers are required.

&#60;/p&#62;&#60;p&#62;
Win32 distinguishes &#60;b&#62;anonymous pipes&#60;/b&#62; as returned by
&#60;code&#62;Unix.pipe&#60;/code&#62; from named pipes. Anonymous pipes do not
support overlapped I/O. As this kind of pipe is important for
starting subprocesses, Ocamlnet nevertheless tries to provide an
asynchronous API for them. Because only synchronous I/O is possible,
helper threads need to be created which implement buffers in much the
same way as it is done for overlapped I/O: The helper threads pump
data from the buffer to the pipe, or from the pipe to the buffer
(depending on the direction of I/O). The user code only accesses the
buffer in a non-blocking way, and Win32 event objects are used to
signal the state of the buffer (empty or full). The resulting API
looks very much like the API for named pipes, and it is also required
that the special &#60;code&#62;read&#60;/code&#62; and &#60;code&#62;write&#60;/code&#62; functions of
the API are called by user code instead of &#60;code&#62;Unix.read&#60;/code&#62; and
&#60;code&#62;Unix.write&#60;/code&#62;, and there are also proxy descriptors.  As the
implementation is done by helper threads, there is the difficulty of how
to stop these threads when there is no more interest in watching the
descriptors. Unfortunately, this is not possible in the general case -
when the pipe &#38;#34;hangs&#38;#34; the thread will also hang, and there is no means
to interrupt it (there are no signals (software interrupts) in Win32,
and thread cancellation is a hot issue). As anonymous pipes are mostly
used for driving external processes this seems to be acceptable (there
is always the fallback solution to kill the process).

&#60;/p&#62;&#60;p&#62;
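The read direction of such a helper thread can be sketched as follows.
This is a simplified illustration for POSIX using the standard
&#60;code&#62;Thread&#60;/code&#62;, &#60;code&#62;Mutex&#60;/code&#62; and &#60;code&#62;Buffer&#60;/code&#62;
modules, not the Ocamlnet implementation: the names &#60;code&#62;pump&#60;/code&#62;,
&#60;code&#62;start_pump&#60;/code&#62; and &#60;code&#62;take&#60;/code&#62; are hypothetical, and
the event objects that signal the buffer state are left out.

&#60;/p&#62;&#60;pre&#62;
(* Sketch of a helper thread pumping a pipe into a buffer.
   Hypothetical names; needs the unix and threads libraries. *)
type pump = { data : Buffer.t; lock : Mutex.t; mutable eof : bool }

let start_pump fd =
  let p = { data = Buffer.create 4096; lock = Mutex.create (); eof = false } in
  let chunk = Bytes.create 4096 in
  let rec loop () =
    let n = Unix.read fd chunk 0 (Bytes.length chunk) in  (* blocks here *)
    Mutex.lock p.lock;
    if n = 0 then p.eof &#38;#60;- true else Buffer.add_subbytes p.data chunk 0 n;
    Mutex.unlock p.lock;
    if n &#38;#62; 0 then loop ()
  in
  (p, Thread.create loop ())

(* Non-blocking for the caller: take whatever has been buffered so far. *)
let take p =
  Mutex.lock p.lock;
  let s = Buffer.contents p.data in
  Buffer.clear p.data;
  Mutex.unlock p.lock;
  s

let () =
  let r, w = Unix.pipe () in
  let p, t = start_pump r in
  ignore (Unix.write_substring w &#38;#34;hello&#38;#34; 0 5);
  Unix.close w;                 (* end of file lets the helper terminate *)
  Thread.join t;
  assert (take p = &#38;#34;hello&#38;#34;);
  assert p.eof;
  Unix.close r
&#60;/pre&#62;

&#60;p&#62;
Note that the helper only terminates here because the write end is
closed; while the blocking read hangs, the thread hangs with it.

&#60;/p&#62;&#60;p&#62;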
The Win32 &#60;b&#62;consoles&#60;/b&#62; are supported in the same way as anonymous
pipes.

&#60;/p&#62;&#60;p&#62;
Even &#60;b&#62;processes&#60;/b&#62; can be waited for. Although there is no direct
data flow (neither &#60;code&#62;read&#60;/code&#62; nor &#60;code&#62;write&#60;/code&#62; make sense
in any way), processes are referenced by means of file handles. When
the handle is set to signaled state, this means that the process has
terminated. So process handles can be added to pollsets, and this
makes it easy to wait for the termination of a subprocess in parallel
to managing the I/O over the pipes that are connected with the
process.

&#60;/p&#62;&#60;p&#62;
For other types of file handles there is no good support yet (unless
one creates the mentioned helper threads). Of course, adding support
would be easy for all handles where Win32 allows overlapped
I/O. However, this seems not to be urgent.


&#60;/p&#62;&#60;h2&#62;Proxy descriptors&#60;/h2&#62;

Back to the trick Ocamlnet uses to keep &#60;code&#62;Unix.file_descr&#60;/code&#62;
while having complex management objects for controlling asynchronous
I/O. For example, one can get a proxy descriptor for a named pipe by
calling:

&#60;pre&#62;
val pipe_descr : w32_pipe -&#38;#62; Unix.file_descr
&#60;/pre&#62;

The returned descriptor cannot be used for anything except for looking
up the attached named pipe:

&#60;pre&#62;
val lookup_pipe : Unix.file_descr -&#38;#62; w32_pipe
&#60;/pre&#62;

(There is also a slightly more general lookup function that can be used
for any type of Win32 object using proxy descriptors.)

&#60;p&#62;
The proxy descriptors are backed by real file handles (otherwise it
could happen that the next &#60;code&#62;open()&#60;/code&#62; returns the same
handle, and the proxy descriptor would no longer be identifiable as
such), but a cheap kind of handle was chosen to avoid too much
resource consumption. There is a hidden global table that maps proxy
descriptors to the referenced complex objects, and by GC trickery it
is ensured that the table shrinks when proxy descriptors are freed by
GC runs (note that &#60;code&#62;Unix.file_descr&#60;/code&#62; is a heap-allocated
value for Win32, so we can add finalisers).

&#60;/p&#62;&#60;p&#62;
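The table idea can be sketched in a few lines. This is a hypothetical
POSIX illustration, not the Ocamlnet code: the type
&#60;code&#62;managed_object&#60;/code&#62; and the functions
&#60;code&#62;proxy_descr&#60;/code&#62;, &#60;code&#62;lookup&#60;/code&#62; and
&#60;code&#62;close_proxy&#60;/code&#62; are made up, opening
&#60;code&#62;/dev/null&#60;/code&#62; is merely one assumed way to obtain a cheap
real descriptor, and the GC-driven cleanup is omitted.

&#60;/p&#62;&#60;pre&#62;
(* Sketch of the proxy descriptor idea on POSIX. Hypothetical names;
   cleanup by GC finalisers is omitted. Needs the unix library. *)
type managed_object = Named_pipe of string | Helper_thread of string

let table : (Unix.file_descr, managed_object) Hashtbl.t = Hashtbl.create 16

(* Allocate a cheap but real descriptor that only identifies the object. *)
let proxy_descr obj =
  let fd = Unix.openfile &#38;#34;/dev/null&#38;#34; [ Unix.O_RDONLY ] 0 in
  Hashtbl.replace table fd obj;
  fd

let lookup fd = Hashtbl.find_opt table fd

(* Closing the proxy also removes the table entry. *)
let close_proxy fd =
  Hashtbl.remove table fd;
  Unix.close fd

let () =
  let fd = proxy_descr (Named_pipe &#38;#34;demo&#38;#34;) in
  assert (lookup fd = Some (Named_pipe &#38;#34;demo&#38;#34;));
  close_proxy fd;
  assert (lookup fd = None)
&#60;/pre&#62;

&#60;p&#62;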
Of course, user code has to close the proxy descriptors when they are
no longer needed (but only when they were actually requested). This
means they have the same &#38;#34;lifetime&#38;#34; as normal file descriptors which
also need to be closed after use.

&#60;/p&#62;&#60;h2&#62;Some kind of generic API&#60;/h2&#62;

The consequence of the chosen emulation approach is that for each kind
of file object a different set of I/O functions needs to be called.
This may be acceptable for special operations like
&#60;code&#62;connect&#60;/code&#62; where a generic approach is hard to get right,
but is totally impractical for simple reading and writing. It would be
required to call different functions that have very similar
signatures, e.g. (read case) &#60;code&#62;pipe_read&#60;/code&#62; for named pipes,
&#60;code&#62;input_thread_read&#60;/code&#62; for objects managed by helper threads,
and of course the well-known &#60;code&#62;Unix.recv&#60;/code&#62; for sockets and
&#60;code&#62;Unix.read&#60;/code&#62; for normal files.

&#60;p&#62;
In the &#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/src/netsys/netsys.mli&#34;&#62;&#60;code&#62;Netsys&#60;/code&#62;&#60;/a&#62;
module a simple generic approach to handling reads and writes is
available. There is a function inspecting the kind of file descriptor,
and a set of generic functions for actually performing read/write:

&#60;/p&#62;&#60;pre&#62;
type fd_style
  (* indicates the kind of descriptor (details omitted here) *)

val get_fd_style : Unix.file_descr -&#38;#62; fd_style
  (* get the file descriptor style *)

val gread : fd_style -&#38;#62; Unix.file_descr -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
  (* generic read: call the right implementation function depending
     on the fd style *)

val blocking_gread : fd_style -&#38;#62; Unix.file_descr -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; int
  (* similar to gread, but it is blocked until at least one byte can
     be read *)

val really_gread : fd_style -&#38;#62; Unix.file_descr -&#38;#62; string -&#38;#62; int -&#38;#62; int -&#38;#62; unit
  (* similar to gread, but it is blocked until exactly the passed number
     of bytes are read *)

(* similar functions are available for writing, for shutting down, and
   for closing
*)
&#60;/pre&#62;

If the descriptors are proxy descriptors, these functions
automatically look up the underlying complex management object and
invoke the right I/O function. If the descriptors are sockets,
they call socket functions like &#60;code&#62;Unix.recv&#60;/code&#62;. Otherwise,
they fall back to &#60;code&#62;Unix.read&#60;/code&#62; or &#60;code&#62;Unix.write&#60;/code&#62;.
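
&#60;p&#62;
The dispatch can be pictured with the following minimal sketch. It is
an assumption-laden illustration rather than the &#60;code&#62;Netsys&#60;/code&#62;
code: the two-case type &#60;code&#62;style&#60;/code&#62; and the functions
&#60;code&#62;get_style&#60;/code&#62;, &#60;code&#62;generic_read&#60;/code&#62; and
&#60;code&#62;really_read&#60;/code&#62; are hypothetical, proxy descriptors are not
covered, and the buffers are of type &#60;code&#62;bytes&#60;/code&#62; as current
OCaml versions require.
&#60;/p&#62;&#60;pre&#62;
(* Sketch of style-based dispatch. The two-case type is hypothetical; the
   real fd_style has more cases. Needs the unix library (POSIX). *)
type style = Socket_style | Plain_style

let get_style fd =
  match (Unix.fstat fd).Unix.st_kind with
  | Unix.S_SOCK -&#38;#62; Socket_style
  | _ -&#38;#62; Plain_style

(* Generic read: pick the implementation according to the style. *)
let generic_read style fd buf pos len =
  match style with
  | Socket_style -&#38;#62; Unix.recv fd buf pos len []
  | Plain_style -&#38;#62; Unix.read fd buf pos len

(* In the spirit of really_gread: loop until exactly len bytes are read. *)
let rec really_read style fd buf pos len =
  if len &#38;#62; 0 then begin
    let n = generic_read style fd buf pos len in
    if n = 0 then raise End_of_file;
    really_read style fd buf (pos + n) (len - n)
  end

let () =
  let r, w = Unix.pipe () in
  ignore (Unix.write_substring w &#38;#34;abcd&#38;#34; 0 4);
  let buf = Bytes.create 4 in
  really_read (get_style r) r buf 0 4;
  assert (Bytes.to_string buf = &#38;#34;abcd&#38;#34;);
  Unix.close r; Unix.close w
&#60;/pre&#62;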

&#60;p&#62;
Large parts of Ocamlnet have been ported so they use this generic
layer instead of directly calling &#60;code&#62;Unix.read&#60;/code&#62; or
&#60;code&#62;Unix.write&#60;/code&#62;. For example, the class
&#60;code&#62;Netchannels.input_descr&#60;/code&#62; wraps a netchannel object around
a file descriptor, and it has been changed so it can now also deal
with all kinds of descriptors supported by &#60;code&#62;gread&#60;/code&#62;.


&#60;/p&#62;&#60;h2&#62;High-level I/O&#60;/h2&#62;

Many users don&#38;#39;t want to see all these implementation details I have
reported so far. They want to just use the high-level I/O functions
like &#60;code&#62;Http_client&#60;/code&#62;. The question is what can be supported on
Win32.

&#60;p&#62;
Fortunately, the answer is - thanks to dealing with these details
carefully - that almost everything works! Although not every module
has been fully tested yet, the difficult modules could be ported, and
there is now the conviction that the simpler ones are not in any way
problematic.

&#60;/p&#62;&#60;p&#62;
The most difficult case was &#60;b&#62;Netplex&#60;/b&#62;. Of course, there is no way
to support multi-processing as there is no &#60;code&#62;fork()&#60;/code&#62;
equivalent in Win32. However, multi-threading works well. The
socketpairs connecting the containers with the controller have been
replaced by pairs of connected named pipes. For Unix Domain sockets
there is the possibility of using either named pipes, or Internet
sockets bound to localhost.

&#60;/p&#62;&#60;p&#62;
As Netplex uses &#60;b&#62;SunRPC&#60;/b&#62; as base library, it was of course also
possible to port this Ocamlnet feature. SunRPC can be used not only
on sockets, but also on named pipes.

&#60;/p&#62;&#60;p&#62;
Another difficult beast was the &#60;b&#62;Shell&#60;/b&#62; library for starting
and managing external processes. It is now as easy to create complex
pipelines of interconnected subprocesses for Win32 as it used to be
for POSIX.

&#60;/p&#62;&#60;p&#62;
The &#60;b&#62;Nethttpd&#60;/b&#62; web server library could also be verified
to be working, even in conjunction with Netplex.


&#60;/p&#62;&#60;h2&#62;Where to get Ocamlnet 3&#60;/h2&#62;

There is no official release yet, not even an alpha release for
developers. In order to get it, one has to check out
&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/&#34;&#62;the Subversion repository&#60;/a&#62; (use the &#60;code&#62;svn&#60;/code&#62; command
on this URL, or click on it to view it with your web browser - most
of the discussed code lives in &#60;code&#62;src/netsys&#60;/code&#62;).

&#60;p&#62;
The Win32 port of Ocamlnet requires the MinGW port of Ocaml. Also, the
same set of base libraries is needed as for POSIX, especially
PCRE. The simplest way to install that is to use &#60;a href=&#34;http://godi.camlcity.org&#34;&#62;GODI&#60;/a&#62; which also supports MinGW.
&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Documentation fun</title>
          <guid>http://blog.camlcity.org/blog/pxp121.html</guid>
          <link>http://blog.camlcity.org/blog/pxp121.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;PXP-1.2.1 with a new reference manual&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
The last release of a stable PXP version happened 5 years ago. That&#38;#39;s
a long time. Actually, a lot of development took place since then, only
that it was difficult to bring PXP into a releasable shape. Now the
last missing piece has been added, namely extensive documentation. So
I can proudly announce the best XML parser that has ever been
available for programming in O&#38;#39;Caml: Not only fast and feature-rich,
but also easy to understand. 

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
Writing documentation is something programmers do not like very much,
and also in the case of PXP the code was far ahead of any description
about it. There was a &#38;#34;User&#38;#39;s Guide&#38;#34;, but it took an oldish approach
of explaining things I don&#38;#39;t like anymore. Also, it was very incomplete.
Last year, I got some funding from a company to improve the PXP
documentation, so I faced the task of reorganizing it completely
and adding everything that was missing.

&#60;/p&#62;&#60;p&#62;
If you would like to take a look at the result, here it is:
&#60;a href=&#34;http://projects.camlcity.org/projects/dl/pxp-1.2.1/doc/manual/html/ref/index.html&#34;&#62;The PXP Reference&#60;/a&#62;.

&#60;/p&#62;&#60;h2&#62;Switching from docbook to ocamldoc&#60;/h2&#62;
&#60;p&#62;
The old &#38;#34;User&#38;#39;s Guide&#38;#34; was written as a docbook document. This is a good
general-purpose text format that allows one to structure a large text
into chapters, sections, etc., and to generate viewable and printable
output from it (especially one can convert it into a bunch of HTML
pages, and into PDF). However, there is one difficulty: It does not
integrate well with ocamldoc-style interface references.&#60;/p&#62;

&#60;p&#62;
The &#38;#34;User&#38;#39;s Guide&#38;#34; predates ocamldoc - when I wrote the first version
of PXP documentation I had no other chance than to use some
third-party tool to process it. Now what to do? Stick with docbook,
and include ocamldoc somehow into the processing chain? The docbook
format has clearly more features for formatting text, e.g. one can
easily include pictures. However, ocamldoc cannot output in a format
that would be convertible to docbook with only little effort, and
this made this route infeasible.

&#60;/p&#62;&#60;p&#62;
I decided to switch completely to ocamldoc. Not only the module
interfaces should be documented with it, but also the various
introductory chapters explaining concepts spanning several modules.
Since O&#38;#39;Caml 3.09, ocamldoc understands the file suffix *.txt and
takes these input files as pure documentation. One can still use all
formatting directives like &#60;code&#62;{2 headings}&#60;/code&#62; or
&#60;code&#62;{!Hyperlinks}&#60;/code&#62; pointing to code elements. However, there
was still the difficulty of missing features.

&#60;/p&#62;&#60;p&#62;
So I looked at developing a custom HTML generator (I am mostly
interested in outputting HTML). It is possible to load an add-on
into ocamldoc that modifies its behaviour. One just has to write
a class that inherits from &#60;code&#62;Odoc_html.html&#60;/code&#62;, and
overrides its methods:

&#60;/p&#62;&#60;pre&#62;
class chtml =
  object(self)
    inherit Odoc_html.html as super

    method private html_of_&#38;#60;foo&#38;#62; ... = ...
  end

let chtml = new chtml
let _ = 
  Odoc_args.set_doc_generator (Some chtml :&#38;#62; Odoc_args.doc_generator option)
&#60;/pre&#62;

&#60;p&#62;
Of course, the question remained whether my features could be
added this way (without rewriting half of the generator class). Yes,
they can, and it only needed about 160 lines of code. I must admit
it took quite a long time to develop this code, since I had to dig into
internals of ocamldoc to understand it better. But anyway, ocamldoc
turns out to be a customizable utility.

&#60;/p&#62;&#60;p&#62;
What I added in particular:
  &#60;/p&#62;&#60;ul&#62;
    &#60;li&#62;A &#60;code&#62;{picture}&#60;/code&#62; tag for including pictures&#60;br/&#62;&#38;#160;
    &#60;/li&#62;&#60;li&#62;The possibility to change the output for &#60;code&#62;include Module&#60;/code&#62;
        in interfaces so that the included interface is directly shown
        instead of only the include statement as such. For clarity,
        the included interface is indented, and has grey background.
        This change can be turned on and off with a &#60;code&#62;{directinclude}&#60;/code&#62;
        tag.&#60;br/&#62;&#38;#160;
    &#60;/li&#62;&#60;li&#62;The &#60;code&#62;include&#60;/code&#62; change requires another feature to
        look really good. All references (hyperlinks and plain occurrences)
        pointing to the included module should be rewritten so that
        they point to the including module instead. That means if
        module &#60;code&#62;N&#60;/code&#62; uses &#60;code&#62;include M&#60;/code&#62; we want all
        references &#60;code&#62;M.x&#60;/code&#62; to be changed into &#60;code&#62;N.x&#60;/code&#62;.
        The intention is that &#60;code&#62;M&#60;/code&#62; is no longer referenced,
        and that the duplication of definitions in two modules cannot
        confuse readers (especially those that are unfamiliar with the
        module system).
        I added that feature for my specific case, and the ocamldoc tag
        &#60;code&#62;{fixpxpcoretypes}&#60;/code&#62; enables that rewriting. (It 
        changes &#60;code&#62;Pxp_core_types.[S|I]&#60;/code&#62; into &#60;code&#62;Pxp_types&#60;/code&#62;.)
        &#60;br/&#62;&#38;#160;
    &#60;/li&#62;&#60;li&#62;A last change also has to do with the &#60;code&#62;include&#60;/code&#62; feature.
        With &#60;code&#62;{knowntype}&#60;/code&#62; and &#60;code&#62;{knownclass}&#60;/code&#62; one
        can add identifiers to the lists of known types and classes, so
        that the generator will emit hyperlinks to them, although there is
        no such definition in reality. It turned out that many identifiers
        were already pointing to the including module, but because there is no
        definition in the mli file, ocamldoc does not make these identifiers
        clickable. With &#60;code&#62;{knowntype}&#60;/code&#62; and &#60;code&#62;{knownclass}&#60;/code&#62;
        one can change that on a case by case basis.
  &#60;/li&#62;&#60;/ul&#62;

&#60;p&#62;
The full source code of the custom generator class can be studied here:
&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-pxp/tags/pxp-1.2.1/tools/src/odoc/chtml.ml&#34;&#62;chtml.ml&#60;/a&#62;. The module with the mentioned &#60;code&#62;include&#60;/code&#62;
directive is &#60;code&#62;Pxp_types&#60;/code&#62;. &#60;a href=&#34;http://projects.camlcity.org/projects/dl/pxp-1.2.1/doc/manual/html/ref/Pxp_types.html&#34;&#62;Look here how nice
the generated page is.&#60;/a&#62;

&#60;/p&#62;&#60;h2&#62;The need for conceptual introductions&#60;/h2&#62;

&#60;p&#62;
XML is a cute and simple text format, right? Many people think like
that, but given the fact that many XML parsers are either feature-rich
and slow, or poor and fast, there must be some complexity in the XML
definition.  Recently, I read the article &#38;#34;XML fever&#38;#34; (by Erik Wilde
and Robert J.  Glushko, Communications of the ACM, issue 7, 2008),
where the authors point out a number of deficiencies in the definition
of XML that can lead to delusion about XML, and finally into
&#38;#34;fever&#38;#34;. After years of maintaining this XML parser, I can only second
the authors.  Clearly, there are problems even in the fundamental XML
specification.

&#60;/p&#62;&#60;p&#62;
I do not want to complain about this - XML is widely used, and many of
the standards are practically unfixable without breaking large numbers
of programs. For me the problem was how to explain all that. For
example, there is the question what is to be considered as the root
node of an XML tree. This is a conceptual question, and the
explanation should not be hidden in an interface description of a PXP
module. For that reason, I had to add a number of chapters to the
manual that explain concepts and generally introduce the PXP
world. All the &#60;code&#62;Intro_*&#60;/code&#62; chapters are like this.

&#60;/p&#62;&#60;p&#62;
The nice thing is now that I can add direct links from introductory
chapters to interface references and vice versa, since all
documentation is now processed with the same utility, ocamldoc.
When some complicated issue arises in some function description,
it is now possible to point to the section in the introduction where
this issue is explained in detail, and conversely, I can point to
the definition in the interface when a function or type is used
in an intro chapter.

&#60;/p&#62;&#60;h2&#62;What next?&#60;/h2&#62;

&#60;p&#62;
I must admit that my interest in XML has not grown in recent years,
to say it politely. XML is most often used as a base technology for
HTML, or as a data exchange format. Many of the advanced XML standards
like XSLT or XQuery have not found the way into the daily life of
us programmers. The hype is over.

&#60;/p&#62;&#60;p&#62;
Nevertheless, I promise that I will still maintain PXP, and now and
then add another feature. For example, there is a nice XPath evaluator
in the development pipeline - again, I do not find time to finish it,
but hey, there are still many years for doing it. (By the way, if you
want to accelerate that and have some money, we will find a way to
quickly finish XPath.)

&#60;/p&#62;&#60;p&#62;
In August 2009, PXP turns 10 years old (counted from the first
mention on the O&#38;#39;Caml mailing list). This is already a long time
for a software library and an open source project. I am quite
confident it will now also reach its 20th birthday!
&#60;/p&#62;


&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
      &#60;p&#62;&#60;b&#62;Links:&#60;/b&#62;
      &#60;ul&#62;
	
	    &#60;li&#62;&#60;a href=&#34;http://projects.camlcity.org/projects/pxp.html&#34;&#62;PXP homepage&#60;/a&#62;:
	    Find here links to downloads and documentation
	  &#60;/li&#62;
      &#60;/ul&#62;
    &#60;/p&#62;
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Parallelizing with Ocamlnet</title>
          <guid>http://blog.camlcity.org/blog/parallelmm.html</guid>
          <link>http://blog.camlcity.org/blog/parallelmm.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;A scalable implementation of matrix multiplications in O&#38;#39;Caml&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
O&#38;#39;Caml does not seem to be recognized as a programming language where it is
easy to parallelize tasks. Recently, there was a heated discussion on
caml-list about this subject, and people complained that the memory
management of O&#38;#39;Caml does not work well for multi-threaded programs.
Actually, the memory manager enforces that only one O&#38;#39;Caml thread can
execute code at any time, so that there is no way to make efficient
use of multi-core CPUs. Well, there are ways around this limitation,
and in this article I&#38;#39;ll show how to multiply matrices in a distributed
system. We don&#38;#39;t use multi-threading, but multi-processing which means
that the threads of execution don&#38;#39;t have easy access to shared memory.
For communication and synchronization we use remote procedure calls
which can basically do the same as shared memory, but at higher
cost. Most important, however, is that the restriction of the O&#38;#39;Caml
memory manager no longer applies: The processes are independent, and
every process has its own manager.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;In order to tackle the problem we need a way to break down the
algorithm so that the running processes don&#38;#39;t have to synchronize with
each other often.  This is very easy for the matrix multiplication
(but may be more complicated for other problems): Every cell of the
result matrix is the scalar product of one row of the first input
matrix with one column of the second input matrix. Thus the cells can be
computed independently, and one only has to take care that all
processes have access to the input matrices, and that the processes do
not interfere with each other when writing the results.
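
&#60;/p&#62;&#60;p&#62;
As a point of reference, a sequential version that computes every cell
independently might look like the following. This is a minimal sketch
and not the &#60;code&#62;simple.ml&#60;/code&#62; from the repository; the function
names &#60;code&#62;cell&#60;/code&#62; and &#60;code&#62;multiply&#60;/code&#62; and the
representation of matrices as arrays of float arrays are assumptions
made for this example.

&#60;/p&#62;&#60;pre&#62;
(* Sequential sketch with hypothetical names; matrices are arrays of rows. *)

(* Scalar product of row i of a with column j of b. *)
let cell a b i j =
  let sum = ref 0.0 in
  Array.iteri (fun k x -&#38;#62; sum := !sum +. x *. b.(k).(j)) a.(i);
  !sum

(* Every cell of the result is computed by an independent call of cell. *)
let multiply a b =
  let rows = Array.length a and cols = Array.length b.(0) in
  Array.init rows (fun i -&#38;#62; Array.init cols (fun j -&#38;#62; cell a b i j))

let () =
  let a = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |] in
  let b = [| [| 5.0; 6.0 |]; [| 7.0; 8.0 |] |] in
  assert (multiply a b = [| [| 19.0; 22.0 |]; [| 43.0; 50.0 |] |])
&#60;/pre&#62;

&#60;p&#62;
Since &#60;code&#62;cell&#60;/code&#62; only reads its inputs and returns a single
value, the calls for different cells can run in any order and in
different processes.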

&#60;/p&#62;&#60;h2&#62;The tools&#60;/h2&#62;

In this solution of the problem we use the toolset provided by Ocamlnet.
It includes a mature and quite efficient implementation of SunRPC
we&#38;#39;ll use for interprocess communication. Furthermore, there is Netplex
for creating and managing multi-process servers. The plan is to create
an RPC service for multiplying matrices that can take advantage of
several cores and that can even run in a compute cluster.

&#60;p&#62;The Netplex/SunRPC combination is quite powerful. Basically, a SunRPC
server accepts TCP connections, and executes the incoming RPC calls.
With our tools we can choose whether we want to have a single process
that processes all connections, or whether we want to create new
processes for every connection. Obviously, the latter is the way to go
for doing the hard computation work, because O&#38;#39;Caml allows us only to
take advantage of several cores by running the code in different
processes. The single process choice is also useful, namely for
controlling the computation, i.e. for synchronization. In this example
we&#38;#39;ll use both types of process management.

&#60;/p&#62;&#60;p&#62;SunRPC is a quite aged standard for doing remote procedure calls.
RPC is typed message passing, that means we can pass not only strings
from one process to the other, but also values of all types of the 
interface definition language. In SunRPC we have ints, floats, strings,
arrays, options, records (&#38;#34;structs&#38;#34;), and variants (&#38;#34;unions&#38;#34;).
Furthermore, RPCs are usually &#38;#34;two-way&#38;#34; messages: Input messages are
always answered with output messages.

&#60;/p&#62;&#60;h2&#62;The components&#60;/h2&#62;

We have two components. One is called the &#38;#34;controller&#38;#34; and manages the
work to do. Actually, the controller defines the outer loops of the
algorithm. Furthermore, there is the &#38;#34;worker&#38;#34; component. The worker
can be instantiated several times (and runs then in several processes).
It implements the inner loop of the algorithm that runs in parallel.

&#60;p&#62;The general idea is that the controller triggers the workers
when a new multiplication is to be done, and that the workers are
themselves responsible for getting the input data, and for getting
the tasks to execute. In the SunRPC IDL this looks like:

&#60;/p&#62;&#60;blockquote&#62;&#60;small&#62;
&#60;pre&#62;
program Controller {
    version V1 {
	void ping(void) = 0;

	dim get_dim(which) = 1;
	/* Workers can call this proc to get the dimension of the matrix */

	row get_row(which,int) = 2;
	/* Workers can call this proc to get a row of the matrix. The int
           is the row number, 0..rows-1
	*/

	jobs pull_jobs(int) = 3;
	/* The controller maintains a queue of jobs. This proc pulls a list
           of jobs from this queue. The int is the requested number.
	*/

	void put_results(results) = 4;
	/* Put results into the result matrix. */
	
    } = 1;
} = 2;


program Worker {
    version V1 {
	void ping(void) = 0;

	void run(void) = 1;
	/* The controller calls this proc to initiate the action in the worker.
           When it returns, it is assumed that the worker is completely
           finished with everything.
	*/
    } = 1;
} = 3;
&#60;/pre&#62;
&#60;/small&#62;&#60;/blockquote&#62;

(There is also a program Multiplier with program number 1 we&#38;#39;ll show later.)

&#60;p&#62;These are the procedures the two components expose. When a multiplication
is to be done, the order of actions is as follows:

&#60;/p&#62;&#60;ol&#62;
  &#60;li&#62;We assume the controller has the input data. The controller calls
      the &#38;#34;run&#38;#34; RPC of all workers to trigger them.
  &#60;/li&#62;&#60;li&#62;The workers now fetch the input data from the controller by
      calling &#38;#34;get_dim&#38;#34; and then &#38;#34;get_row&#38;#34; until they have all input
      values.
  &#60;/li&#62;&#60;li&#62;The workers now run jobs from the controller by repeatedly calling
      &#38;#34;pull_jobs&#38;#34;, then executing the jobs, and finally transmitting the
       results back
      to the controller by invoking &#38;#34;put_results&#38;#34;. The workers can pull
      several jobs at once, and they can put several results at once.
      This reduces the relative costs of the RPC overhead per 
      job.
  &#60;/li&#62;&#60;li&#62;The workers are done when there are no more jobs, i.e. the list
      of jobs pulled from the controller is empty. Now they can pass 
      back the result message of &#38;#34;run&#38;#34; to the controller.
&#60;/li&#62;&#60;/ol&#62;
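
&#60;p&#62;
Steps 3 and 4 can be pictured with a small in-process simulation. It
involves no RPC at all and is not taken from the example sources: the
&#60;code&#62;job&#60;/code&#62; record, the queue standing in for the controller,
and the functions &#60;code&#62;pull_jobs&#60;/code&#62;, &#60;code&#62;cell&#60;/code&#62; and
&#60;code&#62;run_worker&#60;/code&#62; are hypothetical and only mimic the order of
the calls.
&#60;/p&#62;&#60;pre&#62;
(* In-process simulation of the worker loop; no RPC involved.
   The job record and all function names are hypothetical. *)
type job = { row : int; col : int }

(* Stand-in for the pull_jobs call: take up to n jobs from the queue. *)
let pull_jobs queue n =
  let rec take k acc =
    if k = 0 || Queue.is_empty queue then List.rev acc
    else take (k - 1) (Queue.pop queue :: acc)
  in
  take n []

(* One job: scalar product of row i of a with column j of b. *)
let cell a b i j =
  let sum = ref 0.0 in
  Array.iteri (fun k x -&#38;#62; sum := !sum +. x *. b.(k).(j)) a.(i);
  !sum

(* Worker loop: pull a batch, compute it, put the results; stop on []. *)
let run_worker queue a b result batch =
  let rec loop () =
    match pull_jobs queue batch with
    | [] -&#38;#62; ()
    | jobs -&#38;#62;
      let results = List.map (fun j -&#38;#62; (j, cell a b j.row j.col)) jobs in
      List.iter (fun (j, v) -&#38;#62; result.(j.row).(j.col) &#38;#60;- v) results;
      loop ()
  in
  loop ()

let () =
  let a = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |] in
  let b = [| [| 5.0; 6.0 |]; [| 7.0; 8.0 |] |] in
  let queue = Queue.create () in
  for i = 0 to 1 do
    for j = 0 to 1 do Queue.add { row = i; col = j } queue done
  done;
  let result = Array.make_matrix 2 2 0.0 in
  run_worker queue a b result 3;        (* batches of at most 3 jobs *)
  assert (result = [| [| 19.0; 22.0 |]; [| 43.0; 50.0 |] |])
&#60;/pre&#62;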

Note that the controller must be able to do several things in an 
overlapped way: After it has sent the input messages to the &#38;#34;run&#38;#34; RPC
of the workers it must be responsive and accept incoming RPC calls
from the workers. Finally it must wait until all result messages from
the &#38;#34;run&#38;#34; RPCs have arrived. Ocamlnet&#38;#39;s RPC implementation supports
such overlapped RPC execution. (Note that Sun&#38;#39;s original implementation
does not support this!)

&#60;p&#62;The Multiplier program is just another interface of the controller.
It is the public interface used by the client of the multiplier:

&#60;/p&#62;&#60;blockquote&#62;&#60;small&#62;
&#60;pre&#62;
program Multiplier {
    version V1 {
	void ping(void) = 0;
	
	void test_multiply(int,int,int) = 1;
	/* Creates a test matrix with random values and multiplies them.
	   Args are: (l_rows, r_cols, l_cols = r_rows)
	*/
    } = 1;
} = 1;
&#60;/pre&#62;
&#60;/small&#62;&#60;/blockquote&#62;

&#60;h2&#62;The implementation&#60;/h2&#62;

The complete source code is available here:

&#60;ul&#62;
&#60;li&#62;&#60;a href=&#34;https://godirepo.camlcity.org/svn/lib-ocamlnet2/trunk/code/examples/rpc/matrixmult/&#34;&#62;code/examples/rpc/matrixmult&#60;/a&#62;
&#60;/li&#62;&#60;/ul&#62;

It will also be available in future releases of Ocamlnet, but for now
it is only accessible via Subversion.

&#60;p&#62;Of course, using a distributed approach creates a lot of coding
overhead. Compare with the direct implementation in simple.ml (also
in this directory): There, the multiplication algorithm takes only 9
lines of code. By contrast, the distributed implementation needs 389
lines. This is the price we have to pay in terms of coding effort.

&#60;/p&#62;&#60;p&#62;Netplex requires a configuration file, here mm_server.cfg. Basically,
the file lists the components that are supposed to accept incoming
connections. Here, it looks like:

&#60;/p&#62;&#60;blockquote&#62;&#60;small&#62;
&#60;pre&#62;
netplex {
  ...
  service {
    name = &#38;#34;mm_controller&#38;#34;;
    protocol { ... };
    processor {
      type = &#38;#34;mm_controller&#38;#34;;
      worker { host = &#38;#34;localhost&#38;#34;; port = 2022 };
      worker { host = &#38;#34;localhost&#38;#34;; port = 2022 };
    };
    workload_manager { ... };
  };

  service {
    name = &#38;#34;mm_worker&#38;#34;;
    protocol { ... };
    processor {
      type = &#38;#34;mm_worker&#38;#34;;
      controller_host = &#38;#34;localhost&#38;#34;;
      controller_port = 2021;
    };
    workload_manager { ... };
  };
}
&#60;/pre&#62;
&#60;/small&#62;&#60;/blockquote&#62;

We have omitted the boring parts (&#38;#34;...&#38;#34;). The mm_controller is configured
to connect to two workers, both reachable at localhost:2022, and the
mm_workers talk to the controller at localhost:2021. Because the same
worker port is listed twice, two independent connections are created,
and because of the worker configuration two processes are started.
(This is configured in the omitted workload_manager block.)

&#60;p&#62;Netplex starts exactly the components it finds in the configuration
file, even if more components are implemented in the code. This is quite
useful, because the same binary can be used for different setups.
In this example, we could run the multiplication server on host X with
both controller and workers, and on host Y with only the workers.

&#60;/p&#62;&#60;h2&#62;Some numbers&#60;/h2&#62;

I&#38;#39;ve done some performance tests on an Opteron 2212 system with 2 CPUs
of 2 cores each (i.e. 4 cores in total). The tests were done
in 64 bit mode with ocamlopt-compiled programs. The interesting
question is not the total performance, but how much overhead the
message passing adds, and how scalable the system is.

&#60;p&#62;The test is to square an NxN matrix of random numbers, for N=1000,
N=2000, and N=3000. This has been done with the non-distributed version
of the multiplication (simple.ml), and with the distributed version and
W workers, for W=1, W=2, and W=4.

&#60;/p&#62;&#60;p&#62;What we expect is that for larger input matrices the relative cost
of message passing drops, because we have to transfer the input matrix
only once for every worker, and the result matrix only once in total,
but the computation time for the multiplication is O(N&#60;sup&#62;&#60;small&#62;3&#60;/small&#62;&#60;/sup&#62;)
compared to O(N&#60;sup&#62;&#60;small&#62;2&#60;/small&#62;&#60;/sup&#62;) for message passing.

&#60;/p&#62;&#60;p&#62;Here are the runtimes, as a table and as a diagram:

&#60;/p&#62;&#60;table cellspacing=&#34;5&#34; border=&#34;1&#34;&#62;
  &#60;tr&#62;
    &#60;th&#62;Test&#60;/th&#62;
    &#60;th&#62;N&#60;/th&#62;
    &#60;th&#62;Runtime (seconds)&#60;/th&#62;
  &#60;/tr&#62;

  &#60;tr&#62;
    &#60;td&#62;Non-distributed&#60;/td&#62;
    &#60;td&#62;1000&#60;/td&#62;
    &#60;td&#62;38.61&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Non-distributed&#60;/td&#62;
    &#60;td&#62;2000&#60;/td&#62;
    &#60;td&#62;359.765&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Non-distributed&#60;/td&#62;
    &#60;td&#62;3000&#60;/td&#62;
    &#60;td&#62;1483.759&#60;/td&#62;
  &#60;/tr&#62;    

  &#60;tr&#62;
    &#60;td&#62;Distributed, W=1&#60;/td&#62;
    &#60;td&#62;1000&#60;/td&#62;
    &#60;td&#62;45.174&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=1&#60;/td&#62;
    &#60;td&#62;2000&#60;/td&#62;
    &#60;td&#62;382.874&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=1&#60;/td&#62;
    &#60;td&#62;3000&#60;/td&#62;
    &#60;td&#62;1847.225&#60;/td&#62;
  &#60;/tr&#62;    

  &#60;tr&#62;
    &#60;td&#62;Distributed, W=2&#60;/td&#62;
    &#60;td&#62;1000&#60;/td&#62;
    &#60;td&#62;24.407&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=2&#60;/td&#62;
    &#60;td&#62;2000&#60;/td&#62;
    &#60;td&#62;206.426&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=2&#60;/td&#62;
    &#60;td&#62;3000&#60;/td&#62;
    &#60;td&#62;849.089&#60;/td&#62;
  &#60;/tr&#62;    

  &#60;tr&#62;
    &#60;td&#62;Distributed, W=4&#60;/td&#62;
    &#60;td&#62;1000&#60;/td&#62;
    &#60;td&#62;20.842&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=4&#60;/td&#62;
    &#60;td&#62;2000&#60;/td&#62;
    &#60;td&#62;144.718&#60;/td&#62;
  &#60;/tr&#62;    
  &#60;tr&#62;
    &#60;td&#62;Distributed, W=4&#60;/td&#62;
    &#60;td&#62;3000&#60;/td&#62;
    &#60;td&#62;576.02&#60;/td&#62;
  &#60;/tr&#62;    
&#60;/table&#62;

&#60;div&#62;
  &#60;img src=&#34;/files/img/parallelmm_runtime.png&#34; width=&#34;450&#34; height=&#34;588&#34;/&#62;
&#60;/div&#62;

&#60;p&#62;Not bad, right? The time spent for message passing seems to be acceptable.
If you compare the non-distributed time for N=3000 with the time of the
distributed system and one worker, the overhead is approximately 25% of
the non-distributed runtime (363 of 1484 seconds).
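
&#60;/p&#62;&#60;p&#62;The figures can be re-derived from the table with a few lines of O&#38;#39;Caml.
This is only a sketch that copies the N=3000 runtimes from the table; the
names &#60;code&#62;simple&#60;/code&#62;, &#60;code&#62;distributed&#60;/code&#62;, and &#60;code&#62;overhead_w1&#60;/code&#62;
are made up for this illustration.

&#60;/p&#62;&#60;blockquote&#62;&#60;small&#62;
&#60;pre&#62;
(* Runtimes in seconds for N=3000, copied from the table above *)
let simple = 1483.759
let distributed = [ (1, 1847.225); (2, 849.089); (4, 576.02) ]

(* Extra time of the one-worker run, relative to the non-distributed run *)
let overhead_w1 = (List.assoc 1 distributed -. simple) /. simple

let () =
  Printf.printf &#38;#34;overhead for W=1: %.1f%%\n&#38;#34; (100.0 *. overhead_w1);
  List.iter
    (fun (w, t) -&#38;#62;
       Printf.printf &#38;#34;W=%d: speedup %.2f\n&#38;#34; w (simple /. t))
    distributed
&#60;/pre&#62;
&#60;/small&#62;&#60;/blockquote&#62;

&#60;p&#62;The speedup over simple.ml comes out at roughly 1.7 for two workers and
roughly 2.6 for four, so the scaling on this machine is clearly below linear.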

&#60;/p&#62;&#60;p&#62;For larger W the overhead for the initial transfer of the input
matrices increases - they have to be copied to every worker. The
algorithm can be improved here, because the controller process soon
becomes the bottleneck as it has to copy the matrices for every worker.
Alternatively, the matrices could be spread among the workers by the
workers themselves in order to get a better degree of parallelization
during the transfer phase.

&#60;/p&#62;&#60;p&#62;As a last, fun test I started workers on three nodes so that in total
10 cores were available (I couldn&#38;#39;t do more because I did not find more
idle systems in Wink&#38;#39;s cluster). I got a runtime of 349 seconds for
N=3000. As the cores were not identical, this does not tell us much
except that we really can do it over the network, and that this
multiplier really takes advantage of a cluster.
&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
      &#60;p&#62;&#60;b&#62;Links:&#60;/b&#62;
      &#60;ul&#62;
	
	    &#60;li&#62;&#60;a href=&#34;http://projects.camlcity.org/projects/ocamlnet.html&#34;&#62;Ocamlnet&#60;/a&#62;:
	    Project page
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://caml.inria.fr/pub/ml-archives/caml-list/2008/05/6e74a918c2903f1e11add5714a72e352.en.html&#34;&#62;Thread about parallelization on caml-list&#60;/a&#62;:
	    Sometimes it sucks, sometimes it rocks
	  &#60;/li&#62;
      &#60;/ul&#62;
    &#60;/p&#62;
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>About LambdaRank</title>
          <guid>http://blog.camlcity.org/blog/lambdarank.html</guid>
          <link>http://blog.camlcity.org/blog/lambdarank.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;The ranking algorithm behind GODI Search&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
The question has always been: Are the results better if a search
engine really understands the text it indexes? You can view my latest
project, &#60;a href=&#34;http://docs.camlcity.org&#34;&#62;GODI Search&#60;/a&#62;, as an
attempt to answer this question for a very limited set of documents,
namely the code and documentation of &#60;a href=&#34;http://godi.camlcity.org&#34;&#62;GODI&#60;/a&#62;, the source code distribution of
O&#38;#39;Caml. Actually, understanding the text allows the engine to rank the
results better, so that the more important occurrences of the query
words are shown closer to the beginning of the result list. Because
ranking is really based on text interpretation, and the text is here a
variant of lambda code, I call this method LambdaRank.  
&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;Like every search engine, GODI Search consists roughly of two main
components, namely the indexer and the searcher. The indexer iterates
over the set of documents, and populates a database with the words it
extracts. The searcher is the simpler part of the game, as it finds
everything well-prepared in this database, and when a query comes in,
it only has to pull the matching documents out of the database, sort
them according to the ranking score, and show them to the user. So the
tricky part is the indexing.

&#60;/p&#62;&#60;p&#62;If you don&#38;#39;t understand the text at all, you cannot do much about
ranking but count the occurrences of a word in a document. The idea
is that a text that speaks about a certain subject also mentions the
subject more often than other texts, and thus the number of occurrences
is a good measure for ranking.
In a document set where the texts are connected with hyperlinks one
can furthermore look at the relationships between the documents.
Google&#38;#39;s PageRank is based on this approach (but, as rumors say,
is now heavily modified from its original design).

&#60;/p&#62;&#60;p&#62;Fortunately, we know a lot about the document set GODI Search 
analyzes. Most documents are like this:

&#60;/p&#62;&#60;ul&#62;
  &#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org/docs/godipkg/3.10/godi-ocaml/lib/ocaml/std-lib/list.mli&#34;&#62;O&#38;#39;Caml module interfaces&#60;/a&#62;
  &#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org/docs/godipkg/3.10/godi-ocaml/lib/ocaml/std-lib/list.ml&#34;&#62;O&#38;#39;Caml module implementations&#60;/a&#62;
  &#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org/docs/godipkg/3.10/godi-omake/doc/godi-omake/html/omake-doc.html&#34;&#62;Technical manuals&#60;/a&#62;
  &#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org/docs/godipkg/3.10/apps-coq/share/emacs/site-lisp/coq.el&#34;&#62;Code in programming languages other than O&#38;#39;Caml&#60;/a&#62;
  &#60;/li&#62;&#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org/docs/godipkg/3.10/apps-coq/lib/coq/ide/utf8.v&#34;&#62;Code in exotic languages&#60;/a&#62;
&#60;/li&#62;&#60;/ul&#62;

O&#38;#39;Caml code files and closely corresponding manuals dominate the corpus.
GODI Search tries to make the best of this by analyzing O&#38;#39;Caml code in
detail.

&#60;p&#62;After looking at many examples I had some ideas about which occurrences
should be ranked higher than others.

&#60;/p&#62;&#60;p&#62;&#60;b&#62;Idea 1. Definitions of identifiers are more important than uses.&#60;/b&#62;
This sounds natural, but actually not every definition is important. GODI
Search also looks at the scope of the definition: A local definition is
restricted to a surrounding function, and scores the least. Then follow
definitions on module level, and in top-level modules. The highest score
is given to exported identifiers that occur in top-level module interfaces.

&#60;/p&#62;&#60;p&#62;Only let-bound identifiers are ranked this way. Function arguments,
variables in pattern matchings, and fun-bound identifiers are ignored.
This is a bit arbitrary, but my feeling is that these identifiers are
usually not important in the coding styles I&#38;#39;m aware of.

&#60;/p&#62;&#60;p&#62;For types, exceptions, and module names similar scoring techniques
exist.
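
&#60;/p&#62;&#60;p&#62;To make the levels of Idea 1 concrete, here is a made-up implementation
file annotated with how each occurrence would be treated according to the
rules above. The module and its names are hypothetical, and the comments
only restate the ordering described in the text, not actual score values.

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
(* shapes.ml, a hypothetical top-level module *)

module Internal = struct
  let scale = 2.0          (* let-bound in a nested module *)
end

let area r =               (* let-bound in the top-level module; scores
                              highest if shapes.mli exports it *)
  let pi = 3.14159 in      (* local definition: scores the least *)
  pi *. r *. r *. Internal.scale
                           (* r is a function argument: ignored *)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;The occurrences of &#60;code&#62;pi&#60;/code&#62; and &#60;code&#62;scale&#60;/code&#62; in the last
expression are plain uses, and therefore count less than the corresponding
definitions.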

&#60;/p&#62;&#60;p&#62;&#60;b&#62;Idea 2. Values and types are rated separately.&#60;/b&#62;
The namespace of all identifiers can be roughly divided into two big
zones: Values and types. Of course, there are more kinds of
identifiers (modules, classes, labels, file names,...), but my
impression is that the typical programmer has a mindset that is
dominated by only these two classes of symbols. In this sense, a
&#38;#34;value&#38;#34; names an executable thing, and a &#38;#34;type&#38;#34; names metadata. Both
have little to do with each other, and thus a document that contains
many occurrences of &#38;#34;list&#38;#34; as a type has little importance in a search for
&#38;#34;list&#38;#34; as a value.

&#60;/p&#62;&#60;p&#62;&#60;b&#62;Idea 3. Keywords are stopwords.&#60;/b&#62; Keywords occur in
practically every code file in large numbers, and thus say nothing about
it. GODI Search simply ignores keywords.

&#60;/p&#62;&#60;p&#62;&#60;b&#62;Idea 4. Qualified identifiers are hyperlinks.&#60;/b&#62; If you
search for &#38;#34;mem&#38;#34; then you will get a list of top-level definitions of
this function. The question is which occurrence is shown first.  GODI
Search implements the PageRank idea of scoring hyperlinks pointing to
a document by looking at qualified identifiers. So if there are more
&#38;#34;Hashtbl.find&#38;#34; than &#38;#34;List.find&#38;#34; in code anywhere in the corpus, the
module &#38;#34;Hashtbl&#38;#34; scores higher than &#38;#34;List&#38;#34;. (Actually, it is the other
way round if you also take other references into account.)
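
&#60;/p&#62;&#60;p&#62;The counting part of this idea can be sketched in a few lines. This is a
deliberately naive illustration, not the actual GODI Search code: the function
&#60;code&#62;module_refs&#60;/code&#62; is hypothetical, takes identifiers as plain strings,
and simply counts how often each module name is used as a qualifier.

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
(* Count the references to each module made by qualified identifiers *)
let module_refs idents =
  let tbl = Hashtbl.create 16 in
  List.iter
    (fun id -&#38;#62;
       match String.index_opt id &#38;#39;.&#38;#39; with
         | Some i -&#38;#62;
             let m = String.sub id 0 i in
             let n = try Hashtbl.find tbl m with Not_found -&#38;#62; 0 in
             Hashtbl.replace tbl m (n + 1)
         | None -&#38;#62; ()      (* unqualified: no link to any module *)
    )
    idents;
  tbl

let () =
  let refs =
    module_refs [ &#38;#34;Hashtbl.find&#38;#34;; &#38;#34;List.find&#38;#34;; &#38;#34;Hashtbl.add&#38;#34;; &#38;#34;mem&#38;#34; ] in
  assert (Hashtbl.find refs &#38;#34;Hashtbl&#38;#34; = 2);
  assert (Hashtbl.find refs &#38;#34;List&#38;#34; = 1)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;In this toy input &#38;#34;Hashtbl&#38;#34; is referenced more often than &#38;#34;List&#38;#34;, and
would therefore score higher.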

&#60;/p&#62;&#60;p&#62;&#60;b&#62;Idea 5. Code and non-code are rated separately.&#60;/b&#62; 
Of course, the above applies only to text sections that are O&#38;#39;Caml code.
Other languages and non-code cannot be rated this way. For this reason,
GODI Search puts a lot of effort into separating both types of text.
Currently, this is done on a per-paragraph basis, i.e. every paragraph
is first analyzed in order to know whether it is O&#38;#39;Caml code or not.
Also, comments and string literals in code files are considered as
non-code.

&#60;/p&#62;&#60;p&#62;So much for the ideas behind LambdaRank. The results of the
implementation look promising: If the user types in &#38;#34;fold_left&#38;#34; he or
she will be taken to really relevant occurrences. And user experience
is what counts in the end.

&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
      &#60;p&#62;&#60;b&#62;Links:&#60;/b&#62;
      &#60;ul&#62;
	
	    &#60;li&#62;&#60;a href=&#34;http://docs.camlcity.org&#34;&#62;GODI Search&#60;/a&#62;:
	    Try out GODI Search here
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://godi.camlcity.org&#34;&#62;GODI Homepage&#60;/a&#62;:
	    About GODI in general
	  &#60;/li&#62;
      &#60;/ul&#62;
    &#60;/p&#62;
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Cross-Language Cluster Computing</title>
          <guid>http://blog.camlcity.org/blog/hydrostory.html</guid>
          <link>http://blog.camlcity.org/blog/hydrostory.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;The Story Behind Hydro&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
On the surface a search engine looks like a very simple web site, but
actually most things happen in the backend, and are hidden from the
user. A search engine consists of 10 to 20 different types of servers,
and many of them are instantiated several times in a cluster
configuration.  No single programming language is well suited for the
entire implementation. In addition, some language environments may
have compelling libraries that are lacking in other languages.  It is,
however, still difficult to let servers written in different languages
communicate with each other. At Wink, we decided to go with ICE, and
to develop the missing O&#38;#39;Caml implementation ourselves.  
&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;For two years now I have done consulting work for &#60;a href=&#34;http://wink.com&#34;&#62;Wink Technologies&#60;/a&#62; who started out as a
user-powered search engine, but switched later to people
search. Actually, people search began as an experiment, and was
originally developed in O&#38;#39;Caml with some Java standard components. It
turned out that O&#38;#39;Caml was very well suited for crawling and parsing,
and that our solution was convincing enough so we could go on with
O&#38;#39;Caml as implementation language. It was also a big plus that it was
already possible to develop server clusters in O&#38;#39;Caml with Ocamlnet&#38;#39;s
SunRPC implementation - we only had to add a highly available
directory and configuration service. However, the problem of SunRPC is
that there is no good C++ implementation (interestingly, there is an
acceptable one for Java, RemoteTea). Too bad, since SunRPC is simple
and robust, and its type system matches the one of O&#38;#39;Caml quite well
(there are records, arrays, variants, and option types).

&#60;/p&#62;&#60;h2&#62;ICE: An RPC Middleware&#60;/h2&#62;

&#60;p&#62;Looking for an alternative, somebody came up with ICE (&#38;#34;Internet
Communications Engine&#38;#34;). This is a commercial product by &#60;a href=&#34;http://zeroc.com&#34;&#62;ZeroC&#60;/a&#62; which is dual-licensed under the GPL
(like e.g. MySQL). There are implementations for a number of
languages, including C++, Java, Python, and PHP, but unfortunately not
for O&#38;#39;Caml. Well, this is no surprise, but at least the other company
languages are covered. So we looked closer at ICE. Does it match our
needs?

&#60;/p&#62;&#60;p&#62;ICE follows the object-oriented paradigm. For an RPC middleware
this means that a remote call is seen as sending a message to a remote
object. Of course, it is possible to have several such objects of the
same type, and creating instances is made possible by a class
construct. Such a design is very acceptable, but unfortunately object
orientation often meant in the past that the rest of the type system
was crippled. To some degree, this also happened to ICE - especially,
there are no variants and no option types. Well, not optimal, but
there are at least clean &#38;#34;design patterns&#38;#34; for emulating these 
missing features with classes, and for pure OO languages like Java
the ICE approach simplifies the language mapping.

&#60;/p&#62;&#60;p&#62;Unlike CORBA, there is a fixed protocol the components have to use
to talk to each other. That means a client in language X can directly
contact a server in language Y, and there is no need for an
intermediate instance to translate the protocol. Basically, this means
you can use ICE without any infrastructure - no central server you
are dependent upon. For developing massively parallel cluster services
this is an essential requirement, because such central servers don&#38;#39;t 
scale well enough, and are single points of failure.

&#60;/p&#62;&#60;p&#62;For using ICE in a cluster context, there is the IceGrid add-on.
Basically, this is a highly available directory and configuration
service, and serves a purpose similar to that of the service we had
developed for SunRPC before. Clients ask IceGrid where to find their
servers in the network, and IceGrid replies with a suggestion of TCP
ports. This can be used for load-balancing and for high availability.

&#60;/p&#62;&#60;h2&#62;Hydro: Implementing ICE for O&#38;#39;Caml&#60;/h2&#62;

&#60;p&#62;After ICE was found to be good enough, we needed an implementation
for O&#38;#39;Caml. Well, this was my field - I had already developed the SunRPC
support in Ocamlnet years earlier, and this made me an expert in this
type of work. It took only about 3 weeks until it was possible to
generate client code, and about another week until server support
was ready. However, it was still challenging work, because the ICE
type system needed to be mapped to O&#38;#39;Caml&#38;#39;s type system. Furthermore,
the ICE reference manual was full of errors, and everything had to
be checked against ZeroC&#38;#39;s implementation.

&#60;/p&#62;&#60;p&#62;The difficulty of the type mapping is that ICE demands that objects
and exceptions can be downcast. O&#38;#39;Caml, however, does not support
this operation, because there is no efficient implementation of
downcasting for a type system like O&#38;#39;Caml&#38;#39;s that includes structural
subtyping. Nevertheless, downcasting is a
reasonable operation in the context of RPC, and it is hardly possible
to get around it.

&#60;/p&#62;&#60;p&#62;Maybe an example demonstrates this best. In Slice, the IDL for
ICE, one can easily define a hierarchy of classes (the syntax resembles
Java&#38;#39;s):

&#60;/p&#62;&#60;blockquote lang=&#34;x-slice&#34;&#62;
&#60;code&#62;&#60;pre&#62;
class SearchResult {
    string url;
    string title;
}

class PeopleSearchResult extends SearchResult {
    string name;
}

class BandSearchResult extends SearchResult {
    string bandName;
    stringSeq bandMembers;
}
&#60;/pre&#62;&#60;/code&#62;&#60;/blockquote&#62;

&#60;p&#62;When the search engine returns a &#60;code&#62;SearchResult&#60;/code&#62; item, it
can also be one of the descendants of this class. Of course, a client
of the search engine that simply wants to display the result needs to
know all details, and thus downcasts &#60;code&#62;SearchResult&#60;/code&#62; to the
real subclass.

&#60;/p&#62;&#60;p&#62;In a normal OO program one can get rid of this downcast by adding
an operation for displaying the result:

&#60;/p&#62;&#60;blockquote lang=&#34;x-slice&#34;&#62;
&#60;code&#62;&#60;pre&#62;
class SearchResult {
    string url;
    string title;
    string display();
}
&#60;/pre&#62;&#60;/code&#62;&#60;/blockquote&#62;

In an RPC context such an addition might be difficult, however, or may
break some other principle of the RPC design. Basically, RPC is about
marshalling data, and that means getting data out of the context of
one server and forcing them into the context of another server. The
&#38;#34;unity of data and operations&#38;#34;, one of the OO principles, is
intentionally given up.

&#60;p&#62;Note that ICE allows defining operations for classes, and
operations are always executed in the context of the data. In this
example, it would indeed be possible to define &#60;code&#62;display&#60;/code&#62;
in a reasonable way, and to avoid the downcast. However,
&#60;code&#62;display&#60;/code&#62; then becomes part of the protocol, although it is
rather a detail of the client. Anyway, one quickly faces the situation
where downcasting is unavoidable.

&#60;/p&#62;&#60;p&#62;In the O&#38;#39;Caml mapping generated by Hydro, these three classes would
appear like

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
class type o_SearchResult = 
  object
    inherit o_Ice_Object
    method url : string ref
    method title : string ref
  end

class type o_PeopleSearchResult =
  object
    inherit o_SearchResult
    method name : string ref
  end

class type o_BandSearchResult =
  object
    inherit o_SearchResult
    method bandName : string ref
    method bandMembers : string array ref
  end

val as_SearchResult : 
      #Hydro_lm.object_base -&#38;#62; o_SearchResult

val as_PeopleSearchResult : 
      #Hydro_lm.object_base -&#38;#62; o_PeopleSearchResult

val as_BandSearchResult : 
      #Hydro_lm.object_base -&#38;#62; o_BandSearchResult
&#60;/pre&#62;&#60;/code&#62;&#60;/blockquote&#62;

This is a bit simplified, but shows the idea. The ICE classes are
mapped to O&#38;#39;Caml classes with some hidden machinery. The data members
appear as O&#38;#39;Caml methods returning references - the most direct 
translation of this concept. The class hierarchy corresponds to the
hierarchy in ICE, so the O&#38;#39;Caml operator for upcasting, :&#38;#62;, can
be directly used. The hidden machinery comes into play by inheriting
from &#60;code&#62;o_Ice_Object&#60;/code&#62;, the root of the ICE hierarchy, and
by using &#60;code&#62;object_base&#60;/code&#62;, an even smaller antecedent that
defines the marshalling core.

&#60;p&#62;The downcast operation is emulated by defining conversion functions
for every class type: &#60;code&#62;as_PeopleSearchResult&#60;/code&#62; checks whether
the argument is a &#60;code&#62;PeopleSearchResult&#60;/code&#62; in reality, and if so,
casts it to this class type. If not, an exception is raised.
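
&#60;/p&#62;&#60;p&#62;How such a conversion function can work at all may be easier to see in a
self-contained sketch. It is not the code Hydro generates; it only shows one
well-known way to emulate a downcast in O&#38;#39;Caml, namely letting every object
return itself wrapped in an exception value that remembers the precise class
type. All names below (&#60;code&#62;self_exn&#60;/code&#62;, &#60;code&#62;Is_people_search_result&#60;/code&#62;,
&#60;code&#62;Invalid_coercion&#60;/code&#62;, and so on) are hypothetical, and the data
members are plain strings instead of references to keep it short.

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
class type object_base =
  object
    method type_id : string
    method self_exn : exn     (* the object itself, tagged with its real type *)
  end

class type search_result =
  object
    inherit object_base
    method url : string
    method title : string
  end

class type people_search_result =
  object
    inherit search_result
    method name : string
  end

exception Is_people_search_result of people_search_result
exception Invalid_coercion of string

class people ~url ~title ~name =
  object (self)
    method type_id = &#38;#34;PeopleSearchResult&#38;#34;
    method url : string = url
    method title : string = title
    method name : string = name
    method self_exn = Is_people_search_result (self :&#38;#62; people_search_result)
  end

(* Emulated downcast: succeeds only if the object really is a people result *)
let as_people_search_result (o : #object_base) =
  match o#self_exn with
    | Is_people_search_result p -&#38;#62; p
    | _ -&#38;#62; raise (Invalid_coercion o#type_id)

let () =
  let p = new people ~url:&#38;#34;http://example.org&#38;#34; ~title:&#38;#34;A page&#38;#34; ~name:&#38;#34;Jane&#38;#34; in
  let r = (p :&#38;#62; search_result) in       (* the upcast forgets the name method *)
  assert ((as_people_search_result r)#name = &#38;#34;Jane&#38;#34;)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;A plain search result would implement &#60;code&#62;self_exn&#60;/code&#62; with a different
exception, and &#60;code&#62;as_people_search_result&#60;/code&#62; would then raise
&#60;code&#62;Invalid_coercion&#60;/code&#62;.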

&#60;/p&#62;&#60;p&#62;Of course, this emulation is a bit inconvenient, but this is mostly
a problem of generating good code. From a user&#38;#39;s perspective, there is
not much difference between calling a generated conversion function,
or using a built-in language operation. It makes, however, the whole
generated code a lot more difficult to understand.

&#60;/p&#62;&#60;h2&#62;The Story Continues&#60;/h2&#62;

Missing support for RPC middleware is one of the biggest concerns when
using a new language in enterprises. In a startup company like Wink it
is possible to address these concerns, because such companies are open
to unconventional solutions (and ICE is unconventional - the industry
standards are CORBA and DCOM on the LAN, and HTTP-based protocols like
SOAP on the Internet). Finally, by integrating several languages into
the system it was possible to deliver some components quicker and with
better quality because we could choose the best language for every
component.

&#60;p&#62;In the people searcher context we now use both SunRPC and ICE. 
The former is arguably better when only O&#38;#39;Caml components have to
talk with each other, and the latter is for crossing the language
boundaries.
&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
      &#60;p&#62;&#60;b&#62;Links:&#60;/b&#62;
      &#60;ul&#62;
	
	    &#60;li&#62;&#60;a href=&#34;http://wink.com&#34;&#62;Wink Technologies&#60;/a&#62;:
	    Company homepage
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://www.zeroc.com&#34;&#62;ZeroC&#60;/a&#62;:
	    Company homepage, ICE documentation, ICE implementation for a number of languages
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://remotetea.sourceforge.net&#34;&#62;RemoteTea&#60;/a&#62;:
	    SunRPC for Java
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://oss.wink.com/hydro&#34;&#62;Hydro&#60;/a&#62;:
	    ICE for O&#38;#39;Caml
	  &#60;/li&#62;
	    &#60;li&#62;&#60;a href=&#34;http://projects.camlcity.org/projects/ocamlnet.html&#34;&#62;Ocamlnet&#60;/a&#62;:
	    Internet library for O&#38;#39;Caml with SunRPC support
	  &#60;/li&#62;
      &#60;/ul&#62;
    &#60;/p&#62;
&#60;/div&#62;


          </description>
        </item>
      
        <item>
          <title>Mixing Apples And Pears</title>
          <guid>http://blog.camlcity.org/blog/polyvariants.html</guid>
          <link>http://blog.camlcity.org/blog/polyvariants.html</link>
          <description>

&#60;div&#62;
  &#60;b&#62;Using Polymorphic Variants&#60;/b&#62;&#60;br/&#62;&#38;#160;
&#60;/div&#62;

&#60;div&#62;
  
It is one of the coolest language constructs, but its conception sometimes
leads to confusion. O&#38;#39;Caml allows forming ad-hoc unions of
tagged values, the so-called polymorphic variants. They are the free-style
counterpart of the &#38;#34;normal&#38;#34; variant types. We want to shed some light
on this construction in this article, and encourage programmers to
try it out.

&#60;/div&#62;

&#60;div&#62;
  
&#60;p&#62;
The most baffling property of the polyvariants is that one can mix
tags that come from different pieces of code. We&#38;#39;ll give an example of
that later in the text, but first let&#38;#39;s explain some foundations.
Syntactically, the tags are written with a leading reverse apostrophe,
so for example &#60;code&#62;`Apple&#60;/code&#62; is a (value-less) tag. Like the
normal variants the tags can have attached values, so for instance
&#60;code&#62;`Pear &#38;#34;it&#38;#39;s sweet man&#38;#34;&#60;/code&#62; is a tag with a string value.  It
is not required to declare polyvariant types; one can simply start
creating such tagged values in the code.

&#60;/p&#62;&#60;h2&#62;Data Analysis Step-By-Step&#60;/h2&#62;

&#60;p&#62;Imagine we want to analyze a string. In a first step, we would like
to classify every character of the string, and determine whether it is
a letter, a digit, or something else. Using polyvariants, this function
does the job:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let classify_chars s =
  let rec classify_chars_at p =
    if p &#38;#60; String.length s then
      let c = s.[p] in
      let cls =
	match c with
	 | &#38;#39;0&#38;#39;..&#38;#39;9&#38;#39; -&#38;#62; `Digit c
	 | &#38;#39;A&#38;#39;..&#38;#39;Z&#38;#39; | &#38;#39;a&#38;#39;..&#38;#39;z&#38;#39; -&#38;#62; `Letter c
         | _ -&#38;#62; `Other c in
      cls :: classify_chars_at (p+1)
    else
      []
  in
    classify_chars_at 0
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;So this function would return this list for the input string
&#38;#34;a56*&#38;#34;:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;[ `Letter &#38;#39;a&#38;#39;; `Digit &#38;#39;5&#38;#39;; `Digit &#38;#39;6&#38;#39;; `Other &#38;#39;*&#38;#39; ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Note that there is no type declaration! If you enter this function into
the O&#38;#39;Caml toploop, you see that its type is inferred like this:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
val classify_chars :
  string -&#38;#62; 
    [&#38;#62; `Digit of char | `Letter of char | `Other of char ] list =
  &#38;#60;fun&#38;#62;
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Read this as: The function returns tagged values with tags
&#60;code&#62;`Digit&#60;/code&#62;, &#60;code&#62;`Letter&#60;/code&#62;, or &#60;code&#62;`Other&#60;/code&#62;,
and every tag has an attached character. Note the &#38;#34;greater than&#38;#34;
sign at the beginning of the tag list. It means that this tag list
is compatible with being mixed with completely unrelated tags.
There are also polyvariant types where this sign is reversed
(like in &#60;code&#62;[&#38;#60;...]&#60;/code&#62;), or completely missing. We&#38;#39;ll
come back to that later.
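
&#60;/p&#62;&#60;p&#62;What this openness buys us can be seen in a tiny, self-contained sketch.
The names &#60;code&#62;classify&#60;/code&#62;, &#60;code&#62;classify_with_end&#60;/code&#62;, and the tag
&#60;code&#62;`End_of_input&#60;/code&#62; are invented for this illustration:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
(* A cut-down classifier with the same kind of open result type *)
let classify c =
  match c with
    | &#38;#39;0&#38;#39;..&#38;#39;9&#38;#39; -&#38;#62; `Digit c
    | _ -&#38;#62; `Other c

(* `End_of_input is unknown to classify, but may share a list with its tags *)
let classify_with_end chars =
  List.map classify chars @ [ `End_of_input ]

let () =
  assert (classify_with_end [ &#38;#39;7&#38;#39;; &#38;#39;*&#38;#39; ]
          = [ `Digit &#38;#39;7&#38;#39;; `Other &#38;#39;*&#38;#39;; `End_of_input ])
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;With a closed result type the list in &#60;code&#62;classify_with_end&#60;/code&#62; would be
rejected by the type checker, because &#60;code&#62;`End_of_input&#60;/code&#62; is not among
the declared tags.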

&#60;/p&#62;&#60;p&#62;Back to our string analysis example. We are now interested in
recognizing integer numbers in the list of classified characters.
We assume our input is a list that contains &#60;code&#62;`Digit&#60;/code&#62;
tags, but also other tags. In the output list, sequences of
&#60;code&#62;`Digit&#60;/code&#62; are replaced by &#60;code&#62;`Number&#60;/code&#62;:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let recognize_numbers l =
  let rec recognize_at m acc =
    match m with
      | `Digit d :: m&#38;#39; -&#38;#62;
          let d_v = Char.code d - Char.code &#38;#39;0&#38;#39; in
          let acc&#38;#39; =
            match acc with
              | Some v -&#38;#62; Some(10*v + d_v)
              | None -&#38;#62; Some d_v in
          recognize_at m&#38;#39; acc&#38;#39;
      | x :: m&#38;#39; -&#38;#62;
          ( match acc with
              | None -&#38;#62; x :: recognize_at m&#38;#39; None
              | Some v -&#38;#62; (`Number v) :: x :: recognize_at m&#38;#39; None
          )
      | [] -&#38;#62;
          ( match acc with
              | None -&#38;#62; []
              | Some v -&#38;#62; (`Number v) :: []
          )
  in
  recognize_at l None
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;The inferred type of this function is now really strange:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
val recognize_numbers :
  ([&#38;#62; `Digit of char | `Number of int ] as &#38;#39;a) list -&#38;#62; &#38;#39;a list 
  = &#38;#60;fun&#38;#62;
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Basically, it says that there is a tagged list as input, and
that the output list has the same tags. Furthermore, the &#38;#34;&#38;#62;&#38;#34;
sign again signals extensibility, so we can not only use the tags
mentioned in the function, but any other tag as well. In particular,
we are free to pass &#60;code&#62;`Letter&#60;/code&#62; and &#60;code&#62;`Other&#60;/code&#62;
tags in:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
recognize_numbers
   [ `Digit &#38;#39;1&#38;#39; ; `Digit &#38;#39;3&#38;#39;; `Letter &#38;#39;a&#38;#39;; `Digit &#38;#39;2&#38;#39;; `Other &#38;#39;*&#38;#39; ]
&#60;/pre&#62;&#60;/code&#62;
yields
&#60;code&#62;&#60;pre&#62;
[ `Number 13; `Letter &#38;#39;a&#38;#39;; `Number 2; `Other &#38;#39;*&#38;#39;]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Note that the type of the &#60;code&#62;recognize_numbers&#60;/code&#62; function
does not reflect everything we could know about the function. We can be
sure that the function will never return a &#60;code&#62;`Digit&#60;/code&#62; tag,
but this is not expressed in the function type. We have run into one
of the cases where the O&#38;#39;Caml type system is not powerful enough to
find this out, or even to write this knowledge down. In practice, this
is no real limitation - the types are usually a bit weaker than
necessary, but it is unlikely that weaker types cause problems.
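
&#60;/p&#62;&#60;p&#62;If extensibility is given up, a more precise type is possible. The
following is only a sketch and not part of the original example: the
hypothetical function &#60;code&#62;recognize_numbers_closed&#60;/code&#62; lists the tags
it passes through explicitly and binds them with an &#38;#34;as&#38;#34; clause (this
syntax is explained in the section about matching variants). The result
type then no longer contains &#60;code&#62;`Digit&#60;/code&#62;, at the price that
unknown tags are rejected:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
(* Hypothetical closed variant of recognize_numbers: the tags that are
   passed through are listed explicitly *)
let recognize_numbers_closed l =
  let rec recognize_at m acc =
    match m with
      | `Digit d :: m&#38;#39; -&#38;#62;
          let d_v = Char.code d - Char.code &#38;#39;0&#38;#39; in
          let acc&#38;#39; =
            match acc with
              | Some v -&#38;#62; Some(10*v + d_v)
              | None -&#38;#62; Some d_v in
          recognize_at m&#38;#39; acc&#38;#39;
      | ((`Letter _ | `Other _) as x) :: m&#38;#39; -&#38;#62;
          ( match acc with
              | None -&#38;#62; x :: recognize_at m&#38;#39; None
              | Some v -&#38;#62; (`Number v) :: x :: recognize_at m&#38;#39; None
          )
      | [] -&#38;#62;
          ( match acc with
              | None -&#38;#62; []
              | Some v -&#38;#62; (`Number v) :: []
          )
  in
  recognize_at l None

(* This annotation is only accepted because `Digit cannot be returned *)
let result : [ `Letter of char | `Number of int | `Other of char ] list =
  recognize_numbers_closed [ `Digit &#38;#39;1&#38;#39;; `Digit &#38;#39;3&#38;#39;; `Letter &#38;#39;a&#38;#39; ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;The type annotation on &#60;code&#62;result&#60;/code&#62; would be rejected by the
type checker if the result type still contained &#60;code&#62;`Digit&#60;/code&#62;.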

&#60;/p&#62;&#60;p&#62;The really great thing about the polymorphic variants is that 
it is possible to mix tags that come from different contexts. So
in our example we can combine the two functions, and apply one
after the other:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let analyze s =
  recognize_numbers (classify_chars s)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;It is no problem that &#60;code&#62;classify_chars&#60;/code&#62; emits tags that
are completely unknown to &#60;code&#62;recognize_numbers&#60;/code&#62;. And both
functions can use the same tag, &#60;code&#62;`Digit&#60;/code&#62;, without having
to declare in some way that they mean the same thing. It is sufficient
that the tag is the same, and that the attached value has the same
type.

&#60;/p&#62;&#60;p&#62;This may have big advantages for structuring programs. Of course,
our example of string analysis already benefits from the loose type
correspondence the polyvariants make possible. The problem can now be
divided into several steps, and every step needs only to know the tags
it operates on. There is no global type all steps have to agree upon,
rather every step sees only the fraction of type information that is 
needed for the task.


&#60;/p&#62;&#60;h2&#62;Limiting Tags&#60;/h2&#62;

&#60;p&#62;Compare these two functions:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let number_value1 t =
  match t with
   | `Number n -&#38;#62; n
   | `Digit d -&#38;#62; Char.code d - Char.code &#38;#39;0&#38;#39;

let number_value2 t =
  match t with
   | `Number n -&#38;#62; n
   | `Digit d -&#38;#62; Char.code d - Char.code &#38;#39;0&#38;#39;
   | _ -&#38;#62; failwith &#38;#34;This is not a number&#38;#34;
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;The difference is that the second version explicitly catches the case
of &#38;#34;any other tag&#38;#34; whereas the first version leaves this unspecified.
This leads to different typings:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
val number_value1 : [&#38;#60; `Digit of char | `Number of int ] -&#38;#62; int
val number_value2 : [&#38;#62; `Digit of char | `Number of int ] -&#38;#62; int
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;So in the first version we have a &#38;#34;&#38;#60;&#38;#34; sign! This sign usually only
appears for input arguments, and means that the function can only
process these tags (or fewer tags), but no other tags. The type checker
prevents any other tag from being passed in:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
# number_value1 (`Letter &#38;#39;a&#38;#39;);;
This expression has type [&#38;#62; `Letter of char ] but is here used with type
  [&#38;#60; `Digit of char | `Number of int ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;In the second version of the function, this case is handled at
runtime.  From a typing perspective, the &#38;#34;&#38;#62;&#38;#34; sign signals that the
function can also accept tags other than the ones mentioned. However,
for such tags this function actually raises an exception:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
# number_value2 (`Letter &#38;#39;a&#38;#39;);;
Exception: Failure &#38;#34;This is not a number&#38;#34;.
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;
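
&#60;p&#62;Passing fewer tags to &#60;code&#62;number_value1&#60;/code&#62;, on the other hand, is
accepted. A small sketch, repeating the definition so that the fragment
stands on its own (the names &#60;code&#62;v&#60;/code&#62; and &#60;code&#62;seven&#60;/code&#62; are
made up for illustration):

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let number_value1 t =
  match t with
   | `Number n -&#38;#62; n
   | `Digit d -&#38;#62; Char.code d - Char.code &#38;#39;0&#38;#39;

(* The type of v has only one of the two tags *)
let v : [ `Number of int ] = `Number 7

(* Accepted by the type checker; evaluates to 7 *)
let seven = number_value1 v
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;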

&#60;p&#62;So the &#38;#34;&#38;#60;&#38;#34; sign is a way to limit the number of tags a function
can process. The programmer gets it by not adding a &#38;#34;catch all&#38;#34; case
to pattern matchings. This kind of polyvariant is useful to enforce
some strictness in programming, and the type checker catches the cases
that would otherwise only be handled at runtime.


&#60;/p&#62;&#60;h2&#62;Giving Polyvariants Names&#60;/h2&#62;

&#60;p&#62;
Although the mantra of this article is that we don&#38;#39;t need declarations,
it is of course possible to define named polymorphic variants. For
example, we could introduce these named types:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
type classified_char =
  [ `Digit of char | `Letter of char | `Other of char ]

type number_token =
  [ `Digit of char | `Number of int ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Note that there is no &#38;#34;&#38;#62;&#38;#34; or &#38;#34;&#38;#60;&#38;#34; sign in such definitions - it
would not make sense to say something about whether more or fewer tags
are possible than given, because the context is missing.

&#60;/p&#62;&#60;p&#62;Using these names, one could simplify the typings of our functions:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
val classify_chars : string -&#38;#62; [&#38;#62; classified_char ] list
val recognize_numbers : ( [&#38;#62; number_token ] as &#38;#39;a) list -&#38;#62; &#38;#39;a list 
val number_value1 : [&#38;#60; number_token ] -&#38;#62; int
val number_value2 : [&#38;#62; number_token ] -&#38;#62; int
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;As you see, the &#38;#34;&#38;#62;&#38;#34; or &#38;#34;&#38;#60;&#38;#34; sign can still appear in function
types using these names. Actually, there is a special syntax behind
this notation. If you just say &#60;code&#62;number_token&#60;/code&#62; in a
type expression, exactly the definition applies (without sign).
But you can also construct new polyvariants from existing ones.
For example, 

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;[ classified_char | number_token ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;would mean a type that combines the tags of both types, i.e. this is 
the same as if all four tags were enumerated. The syntax
&#60;code&#62;[&#38;#60;number_token ]&#60;/code&#62; is only a special case of this
type constructor, where the new type also gets a sign.
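
&#60;/p&#62;&#60;p&#62;A minimal sketch of such a combined type follows. The names
&#60;code&#62;token&#60;/code&#62; and &#60;code&#62;describe&#60;/code&#62; are made up for this
illustration, and the two type definitions are repeated so that the
fragment can be compiled on its own:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
type classified_char =
  [ `Digit of char | `Letter of char | `Other of char ]

type number_token =
  [ `Digit of char | `Number of int ]

(* Hypothetical name for the combined type *)
type token = [ classified_char | number_token ]

(* The match is exhaustive with exactly these four cases *)
let describe (t : token) =
  match t with
    | `Digit d -&#38;#62; Printf.sprintf &#38;#34;digit %c&#38;#34; d
    | `Letter c -&#38;#62; Printf.sprintf &#38;#34;letter %c&#38;#34; c
    | `Other c -&#38;#62; Printf.sprintf &#38;#34;other %c&#38;#34; c
    | `Number n -&#38;#62; Printf.sprintf &#38;#34;number %d&#38;#34; n

let s = describe (`Number 13)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Leaving out one of the four cases makes the compiler warn that the
match is not exhaustive.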


&#60;/p&#62;&#60;h2&#62;Matching Variants&#60;/h2&#62;

&#60;p&#62;Try to compile this piece of code that ought to sum up all 
&#60;code&#62;`Digit&#60;/code&#62; and &#60;code&#62;`Number&#60;/code&#62; tags of a list of
any tags:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let rec sum l =
  match l with
    | x :: l&#38;#39; -&#38;#62;
      ( match x with
         | `Digit _ | `Number _ -&#38;#62;
              number_value1 x + sum l&#38;#39;
         | _ -&#38;#62;
              sum l&#38;#39;
      )
    | [] -&#38;#62; 
      0
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;It compiles, but there is a little surprise. The compiler emits 
a warning that the &#60;code&#62;_ -&#38;#62; sum l&#38;#39;&#60;/code&#62; case of the matching
is unused, and the inferred type is just

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
val sum : [ `Digit of char | `Number of int ] list -&#38;#62; int 
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

i.e. the &#38;#34;&#38;#62;&#38;#34; sign is missing that would allow us to pass any tags
in. What&#38;#39;s wrong? 

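&#60;p&#62;This is a pitfall one quickly runs into when using polyvariants.
The type checker assumes that the &#60;code&#62;x&#60;/code&#62; in 
&#60;code&#62;number_value1 x&#60;/code&#62; has the same type as the &#60;code&#62;x&#60;/code&#62;
being matched. As &#60;code&#62;number_value1&#60;/code&#62; accepts only
&#60;code&#62;`Digit&#60;/code&#62; and &#60;code&#62;`Number&#60;/code&#62;, the type of
&#60;code&#62;x&#60;/code&#62; is limited to these two tags for the whole match. Matching
on a variable does not by itself narrow its type within the matched case.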
Fortunately, there is a special syntax for that (look at the bold &#38;#34;as&#38;#34;
clause):

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
let rec sum l =
  match l with
    | x :: l&#38;#39; -&#38;#62;
      ( match x with
         | (`Digit _ | `Number _) &#60;b&#62;as y&#60;/b&#62; -&#38;#62;
              number_value1 &#60;b&#62;y&#60;/b&#62; + sum l&#38;#39;
         | _ -&#38;#62;
              sum l&#38;#39;
      )
    | [] -&#38;#62; 
      0
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

&#60;p&#62;Here, &#60;code&#62;y&#60;/code&#62; is the value &#60;code&#62;x&#60;/code&#62; for the case that
the match applies, so &#60;code&#62;y&#60;/code&#62; can have a stricter type than
&#60;code&#62;x&#60;/code&#62;.

&#60;/p&#62;&#60;p&#62;Alternatively, the match condition could also have been written as:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
match x with
  | #number_token as y -&#38;#62; ...
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

i.e. one can use named polyvariants in matchings. This is just a 
convenience notation for the former.
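
&#60;p&#62;Put together, a self-contained version of the function might look as
follows. This is only a sketch; the definitions of
&#60;code&#62;number_token&#60;/code&#62; and &#60;code&#62;number_value1&#60;/code&#62; are repeated
from above so that the fragment can be compiled on its own, and the name
&#60;code&#62;total&#60;/code&#62; is made up for illustration:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
type number_token =
  [ `Digit of char | `Number of int ]

let number_value1 t =
  match t with
   | `Number n -&#38;#62; n
   | `Digit d -&#38;#62; Char.code d - Char.code &#38;#39;0&#38;#39;

let rec sum l =
  match l with
    | x :: l&#38;#39; -&#38;#62;
      ( match x with
         | #number_token as y -&#38;#62;
              number_value1 y + sum l&#38;#39;
         | _ -&#38;#62;
              sum l&#38;#39;
      )
    | [] -&#38;#62; 
      0

(* Other tags are accepted and skipped: 4 + 10 = 14 *)
let total = sum [ `Digit &#38;#39;4&#38;#39;; `Letter &#38;#39;x&#38;#39;; `Number 10; `Other &#38;#39;*&#38;#39; ]
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;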


&#60;h2&#62;Behind the Scenes&#60;/h2&#62;

&#60;p&#62;The polyvariants might be cool, but many programmers suspect that
the performance of their programs suffers when they use them. Well,
although there is some runtime cost, it is very small, and not
noticeable for many programs.

&#60;/p&#62;&#60;p&#62;Internally, the tags are represented by hash values of the names
of the tags. So the tags are simply reduced to integers at runtime.
Compared with the normal variant types, there is some additional
overhead for tags with values. In particular, for storing
&#60;code&#62;`X&#38;#160;value&#60;/code&#62; one extra word is needed in comparison with
the normal variant &#60;code&#62;X&#38;#160;value&#60;/code&#62;.

&#60;/p&#62;&#60;p&#62;It is possible that the hash values of variants collide, e.g.

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
# type x = [ `jagJhn | `oZshTt ];;
Variant tags `jagJhn and `oZshTt have same hash value. Change one of them.
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;

As you see, the compiler checks for this rare case. I&#38;#39;ve never seen
it in practice.
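
&#60;p&#62;The representation described at the beginning of this section can be
observed with the &#60;code&#62;Obj&#60;/code&#62; module of the standard library. This
is only a sketch for exploration: &#60;code&#62;Obj&#60;/code&#62; is unsafe and not
meant for normal programs, the representation is an implementation detail,
and the type &#60;code&#62;plain&#60;/code&#62; is made up for the comparison:

&#60;/p&#62;&#60;blockquote lang=&#34;x-ocaml&#34;&#62;
&#60;code&#62;&#60;pre&#62;
(* Hypothetical normal variant type, for comparison only *)
type plain = X of int

let () =
  (* A tag without value is an immediate integer, not a heap block *)
  assert (Obj.is_int (Obj.repr `Foo));
  (* A tag with value is a block of two fields: hash value and argument *)
  assert (Obj.size (Obj.repr (`X 1)) = 2);
  (* The normal variant needs only one field for the argument *)
  assert (Obj.size (Obj.repr (X 1)) = 1)
&#60;/pre&#62;&#60;/code&#62;
&#60;/blockquote&#62;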

&#60;p&#62;Despite rumours, there is nothing special done at link time. The
tags have already been reduced to integers by the time linking starts.


&#60;/p&#62;&#60;h2&#62;Conclusion&#60;/h2&#62;

&#60;p&#62;I hope I&#38;#39;ve shown how elegant code can look when it uses polyvariants
to represent data cases. But there is some more to say, especially if you
look at other programming languages.

&#60;/p&#62;&#60;p&#62;In my view, polymorphic variants are one
of the features that make O&#38;#39;Caml so different in comparison with
mainstream languages like Java. In particular, there are competing
approaches for representing data cases, and one of the radical ideas
of object orientation has always been that combining data and program
cases into a single class construct is the best way to deal with the
problem. However, I think something has been overlooked - data and
algorithms do not always walk hand in hand, and classes are inflexible
if only a loose correlation between both is needed. In contrast, 
polyvariants give the programmer the maximum of freedom in this respect.
Thus I believe that polyvariants are a serious alternative for
representing data cases.

&#60;/p&#62;
&#60;/div&#62;

&#60;div&#62;
  Gerd Stolpmann works as an O&#38;#39;Caml consultant

&#60;/div&#62;

&#60;div&#62;
  
      &#60;p&#62;&#60;b&#62;Links:&#60;/b&#62;
      &#60;ul&#62;
	
	    &#60;li&#62;&#60;a href=&#34;http://caml.inria.fr/pub/docs/manual-ocaml/manual006.html#htoc41&#34;&#62;Polymorphic variants&#60;/a&#62;:
	    Chapter in the O&#38;#39;Caml manual
	  &#60;/li&#62;
      &#60;/ul&#62;
    &#60;/p&#62;
&#60;/div&#62;


          </description>
        </item>
      
  </channel>
</rss>
