BLOG ON CAMLCITY.ORG: Ocamlnet 3
What's new in Ocamlnet 3: The Win32 port - by Gerd Stolpmann, 2009-10-20
In the POSIX world (let me use "POSIX" as a general term for
Unix/Linux/BSD) the select() system call is known as the
linchpin for dispatching file descriptor events. Generally, a program
using select() looks like

    (* Event loop: *)
    while <something to do> do
      <find out interesting descriptors>
      select();
      <interpret events>
    done

and the crucial point is that all kinds of descriptors can be passed as
input to select(). This makes it possible to wait
simultaneously for very different events, like socket events, pipeline
events, or events for devices. There is no such universal select()
in Win32. The select() call provided by Win32 only works for sockets.
There are other mechanisms for event handling, but they are less
systematic, and one faces the problem that different ways of watching
for events need to be integrated into the single event loop.
Since version 3.11, Ocaml includes a "fancy" version of select()
in its standard library, which actually tries to
emulate the POSIX semantics by combining several Win32 event-handling
approaches in a quite tricky way. The reader might ask what the whole
point for Ocamlnet is then - one could simply have relied on this
emulation. However, there are a number of drawbacks. First, some of
the emulation techniques used are incredibly simplistic. For example,
if one of the descriptors references the output side of a pipe, the
implementation falls back to a form of busy waiting (wasting CPU
time). The input side of a pipe always signals that the pipe has space
for new data (which may block the program). There is no special
support for named pipes, although Win32 supports them much better than
anonymous pipes. The second problem is that the emulation is limited
to 64 descriptors only - far too few for servers [please see update
below]. For all these reasons, Ocamlnet does not make use of this
select() emulation, but follows its own, more ambitious
approach. Actually, the basic problem of this emulation is that
select() is not the right level of abstraction for
combining multiple ways of waiting for events into a single operation.
Update: I've got a message from Sylvain Le Gall, the
implementor of this emulation code. He explained that actually 4096
handles can be waited for (which is true, sorry for my mistake). Also,
he points out that he does not know how to handle anonymous pipes
better, and that even Cygwin uses the same technique as his code. The
reason for not treating named pipes specially is that the Ocaml
standard library does not support them anyway. - I think his
implementation decisions are perfectly rational for a general-purpose
and drop-in select() emulation, and the standard library
is now way better than before his contribution. However, Ocamlnet has
special needs (like using named pipes as substitute for Unix Domain
sockets), and by switching to pollsets (see below) as the API for
handling events the system resources can be better managed.
select()-style I/O. It
works a lot like a non-blocking TCP connect: One has to start the I/O
operation, and it is signaled to the caller when the operation is
completed. The signal can be a callback (APC), or it can be an event
variable that is set to signaled state. The difference to select()
is that the latter indicates in advance
whether an I/O operation would be possible (and non-blocking) before
the operation is started whereas overlapped I/O requires that the
operation is actually initiated, and one can only wait asynchronously
for its completion. (Look here for how Microsoft explains overlapped I/O.)
This article is not about which style is better, but how to port
programs using Ocamlnet that assume select() loops as
their basic construction principle. So the question here is: Can we
get some kind of emulation of select() for overlapped I/O?
Before answering, let us look at what we would need overlapped
I/O for. Essentially, one can use it for files, sockets, and a few IPC
mechanisms. Ocamlnet does not have support for reading or writing
files asynchronously anyway, and for sockets it is easy to use the
Win32 call WSAWaitForMultipleEvents() in order to combine
waiting for sockets with other events. Actually, Ocamlnet is mostly
interested in overlapped I/O for named pipes, because named pipes are
a good replacement for the otherwise missing Unix Domain sockets and
socketpairs. (Note that Win32 named pipes are a different IPC
mechanism than POSIX named pipes, and support connection multiplexing
like TCP sockets.)
The Win32 functions ReadFile() and WriteFile()
can be used to start an overlapped I/O
request by passing an OVERLAPPED struct to them (for a
non-overlapped request the argument would remain NULL). This makes
the function return immediately, and the completion of the
operation is signaled to the caller by an IPC mechanism. In Ocamlnet
Win32 events are used for that purpose. A Win32 event is a
synchronization primitive that works similarly to a condition variable:
It can be in unsignaled or in signaled state, and it is possible to
suspend the program until it enters signaled state. Win32 events have
the big advantage that they provide the required level of genericity:
Many Win32 objects are either subtypes of Win32 events and
implement the event interface directly, or they can be connected to
Win32 events. For example, a process handle is also an event, and it
is signaled when the referenced process is terminated - so waiting for
the process handle as event means to wait until the process is
finished. It is also possible to wait until one of several events
enters signaled state (WaitForMultipleObjects()). When an event is
put into the OVERLAPPED struct, Windows
notifies the user about the completion of the operation by signalling
the event.
Collecting events and waiting until one of them is signaled already
sounds a lot like select(). Still, the problem remains
that one has to start the operation before one can wait for it. There
is no way around it on the Win32 level. The only chance for porting
Ocamlnet was to change the level of abstraction the user code
sees. Actually, it was possible to provide an emulation, but the price
is that the user code must no longer invoke the generic read/write
operations, but special wrappers that do the required impedance
transformation. So Unix.read and Unix.write
are forbidden when dealing with named pipes, and instead the special
wrapper functions Netsys_win32.pipe_read and
Netsys_win32.pipe_write have to be called. Ocamlnet
provides buffers for input and output, so that pipe_read
only reads from the input buffer (and raises EWOULDBLOCK
if the buffer is empty), and pipe_write only writes
to the output buffer (and raises EWOULDBLOCK if the
buffer is full). In addition to that, Ocamlnet arranges that
overlapped I/O operations are started in the background when data
needs to be pumped from the named pipe to the input buffer, or from
the output buffer to the named pipe. This way, the overlapped
operation is hidden from the user, and Ocamlnet can provide a view so
that an event signals when a pipe_read or a
pipe_write can actually process data.
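The buffering idea can be illustrated with a small self-contained model. This is only a sketch of the principle, not the Netsys_win32 implementation: the names toy_pipe, pump_in, close_writer, toy_pipe_read and the exception Would_block are made up for this illustration, the "event" is a plain boolean, and the background overlapped read is simulated by calling pump_in by hand.

```ocaml
(* Toy model of a buffered pipe endpoint. All names are hypothetical;
   this is not how Netsys_win32 is implemented internally. *)

exception Would_block            (* stands in for Unix.EWOULDBLOCK *)

type toy_pipe = {
  inbuf : Buffer.t;              (* data already pumped out of the "pipe" *)
  mutable rd_signaled : bool;    (* stands in for the Win32 read event *)
  mutable eof : bool;            (* set once the writer side is closed *)
}

let create () =
  { inbuf = Buffer.create 64; rd_signaled = false; eof = false }

(* Stands in for a completed overlapped read: the data arrives in the
   input buffer, and the read event is signaled. *)
let pump_in p data =
  Buffer.add_string p.inbuf data;
  if Buffer.length p.inbuf > 0 then p.rd_signaled <- true

(* End of file also counts as "readable". *)
let close_writer p =
  p.eof <- true;
  p.rd_signaled <- true

(* Non-blocking read that only ever touches the input buffer. *)
let toy_pipe_read p (dst : Bytes.t) pos len =
  let avail = Buffer.length p.inbuf in
  if avail = 0 then
    (if p.eof then 0 else raise Would_block)
  else begin
    let n = min len avail in
    Buffer.blit p.inbuf 0 dst pos n;
    (* keep the unread remainder in the buffer *)
    let rest = Buffer.sub p.inbuf n (avail - n) in
    Buffer.clear p.inbuf;
    Buffer.add_string p.inbuf rest;
    (* the event stays signaled only while a read can make progress *)
    p.rd_signaled <- Buffer.length p.inbuf > 0 || p.eof;
    n
  end
```

The point of the model is that the event reflects the state of the buffer, not the state of the OS-level pipe, which is what makes it usable in a select()-like loop.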
Here is a small subset of the named pipe API provided by Ocamlnet
in the module Netsys_win32:

    type w32_event    (* Win32 event objects *)
    type w32_pipe     (* A pipe endpoint *)
    type pipe_mode = Pipe_in | Pipe_out | Pipe_duplex

    val pipe_pair : pipe_mode -> w32_pipe * w32_pipe
      (* like socketpair *)
    val pipe_read : w32_pipe -> string -> int -> int -> int
      (* like Unix.read *)
    val pipe_write : w32_pipe -> string -> int -> int -> int
      (* like Unix.write *)
    val pipe_shutdown : w32_pipe -> unit
      (* like Unix.shutdown *)

    val pipe_rd_event : w32_pipe -> w32_event
    val pipe_wr_event : w32_pipe -> w32_event
      (* get the events notifying about read/write possibility *)

    val wsa_wait_for_multiple_events : w32_event array -> int -> int option
      (* wait for a number of events, or until a timer times out *)
For instance, this code reads from two named pipes p1 and p2 simultaneously, and outputs the data to stdout:

    let s = String.create 1024

    let try_read p =
      try
        let n = Netsys_win32.pipe_read p s 0 1024 in
        if n=0 then raise Exit;   (* deal somehow with eof *)
        print_string (String.sub s 0 n)
      with
        Unix.Unix_error(Unix.EWOULDBLOCK,_,_) -> ()

    let loop() =
      try
        let e1 = Netsys_win32.pipe_rd_event p1 in
        let e2 = Netsys_win32.pipe_rd_event p2 in
        while true do
          match Netsys_win32.wsa_wait_for_multiple_events [| e1; e2 |] (-1) with
            | None -> ()
            | Some _ ->
                try_read p1;   (* always try both for simplicity of the example *)
                try_read p2
        done
      with
        | Exit -> ()
Note that p1 and p2 have type Netsys_win32.w32_pipe, and not Unix.file_descr.
This example has the shape of the select() loop
outlined at the beginning of the article. There are still differences,
though, to the POSIX way of doing it: The file handle provided by the
OS is hidden by the Netsys_win32 layer, and cannot be
directly used by the program (because this could break the
abstraction). Also, one first has to create event objects (here by
calling pipe_rd_event) in order to set up waiting. Last
but not least the emulation itself is also not free of subtle
artefacts introduced by the Netsys_win32 layer. In
particular, there is no way of cancelling the overlapped I/O
operations performed under the hood of the emulation (one can only
close/disconnect the pipe to stop them). This can be an issue when the
file descriptor is passed on to other processes. (N.B. Windows Vista
promises to solve the cancellation issue, but I have not yet had a
chance to test it.)
Of course, this approach only works when the watched Win32 file
object implements overlapped I/O. If not, one can only read and write
synchronously, and Ocamlnet provides special helper threads for
dealing with this issue. This is discussed in more detail below.
First let's look at how to generalize select() so it can also
be backed by the Win32 call WSAWaitForMultipleEvents().
The select() call is reflected by the Ocaml standard
library as a function

    val Unix.select :
      file_descr list -> file_descr list -> file_descr list -> float ->
        file_descr list * file_descr list * file_descr list
This interface has a few disadvantages. First, in every round of
waiting one has to pass all descriptors to select().
This is time-consuming, and the reason for the bad reputation of
select() with regard to performance (although in reality
it is not as bad as some bloggers pretend). Second, there is no way to
cancel an already started select() from a different
thread. This is important for multi-threaded programs, because a
second thread may want to change the list of descriptors the first
thread is watching.
Ocamlnet now uses a different interface for polling descriptors, so-called pollsets:

    class type pollset =
    object
      method find : Unix.file_descr -> poll_req_events
      method add : Unix.file_descr -> poll_req_events -> unit
      method remove : Unix.file_descr -> unit
      method wait : float ->
                    ( Unix.file_descr * poll_req_events * poll_act_events ) list
      method dispose : unit -> unit
      method cancel_wait : bool -> unit
    end
These sets are used in this way: Descriptors may be added and removed
from the set, and for each descriptor one can specify which events to
watch for (reading or writing). When the set is ready, the user can
invoke wait to start waiting for the specified events.
The function returns the events that are actually signalled by the
OS. It is possible to cancel waiting at any time by calling
cancel_wait true.
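To make the add/remove/wait cycle concrete, here is a toy pollset that works on plain integers instead of Unix.file_descr. It is a sketch under strong simplifications: the types req and act, the class toy_pollset, and the inject method are all hypothetical, readiness is injected by hand instead of coming from the OS, and wait never blocks (the timeout is ignored).

```ocaml
(* Toy pollset over integer "descriptors". Hypothetical names throughout;
   readiness is injected by hand rather than reported by the OS. *)

type req = { want_rd : bool; want_wr : bool }   (* requested events *)
type act = { can_rd : bool; can_wr : bool }     (* actual events *)

class toy_pollset =
  object
    val table : (int, req) Hashtbl.t = Hashtbl.create 16
    val ready : (int, act) Hashtbl.t = Hashtbl.create 16

    method add fd r = Hashtbl.replace table fd r
    method remove fd = Hashtbl.remove table fd
    method find fd = Hashtbl.find table fd   (* raises Not_found *)

    (* Test hook: pretend the OS reported activity on fd. *)
    method inject fd a = Hashtbl.replace ready fd a

    (* Non-blocking stand-in for wait: return the members whose
       requested events intersect the injected actual events. *)
    method wait (_timeout : float) =
      Hashtbl.fold
        (fun fd r acc ->
           match Hashtbl.find_opt ready fd with
           | Some a when (r.want_rd && a.can_rd) || (r.want_wr && a.can_wr) ->
               (fd, r, a) :: acc
           | _ -> acc)
        table []
  end
```

Note that the set persists between calls to wait, so the descriptors do not have to be passed again in every round, which is the first of the two disadvantages of Unix.select mentioned above.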
I had not only the Win32 port in mind when designing the pollsets, but also POSIX-type OS. For example on Linux there is the epoll API that operates on a similar data structure, and that can easily back a pollset implementation.
The Win32 implementation of pollsets is done in two layers. The basic
class Netsys_pollset_win32.pollset already supports all
kinds of descriptors Ocamlnet needs, but is restricted to watching at
most 64 Win32 event objects (corresponding to 63 sockets, or 31 named
pipes). This restriction is lifted by
Netsys_pollset_win32.threaded_pollset. However, the
latter class requires that the program is multi-threaded.
Essentially, the implementation works by inspecting the descriptors to
be watched, and by looking up the required helper objects (like
calling Netsys_win32.pipe_rd_event to get the Win32 event
object reflecting the read status of a pipe). After that,
WSAWaitForMultipleEvents() is invoked to start waiting,
and when events happen, they are mapped back from the signaled event
objects to the connected file descriptors. The cancel_wait
feature is supported by always adding an
additional Win32 event object to the set of watched events which is
set to signaled state when cancel_wait is called.
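The mapping from signaled events back to descriptors, and the extra cancellation event, can be sketched as follows. This is a simplified model and not the code of Netsys_pollset_win32: events are mutable booleans, first_signaled is a non-blocking stand-in for WSAWaitForMultipleEvents(), and the names event, outcome and wait_once are invented for this illustration.

```ocaml
(* Toy model of "wait on an array of events, with one extra slot for
   cancellation". All names here are hypothetical. *)

type event = { mutable signaled : bool }

(* Non-blocking stand-in for WSAWaitForMultipleEvents():
   the index of the first signaled event, if any. *)
let first_signaled (evs : event array) : int option =
  let rec go i =
    if i >= Array.length evs then None
    else if evs.(i).signaled then Some i
    else go (i + 1)
  in
  go 0

type outcome =
  | Cancelled          (* the cancel event fired *)
  | Ready of int       (* this descriptor has an event *)
  | Nothing            (* nothing signaled *)

(* watched pairs each descriptor (an int here) with its event object.
   Slot 0 of the array actually waited on is reserved for cancel. *)
let wait_once ~(cancel : event) (watched : (int * event) array) : outcome =
  let evs = Array.append [| cancel |] (Array.map snd watched) in
  match first_signaled evs with
  | None -> Nothing
  | Some 0 ->
      cancel.signaled <- false;   (* reset so that the next wait works *)
      Cancelled
  | Some i ->
      (* map the event index back to the descriptor it belongs to *)
      Ready (fst watched.(i - 1))
```

In this model a second thread would cancel a running wait simply by setting cancel.signaled to true. The reserved slot is also one reason why fewer descriptors than event objects can be watched.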
Of course, this is only a rough sketch of the algorithm. It is quite
complicated which helper objects are actually needed, and how they
affect the central WSAWaitForMultipleEvents() call. This
depends very much on the type of the descriptors put into
the pollset, and it would go too far to fully present these details in
this article.
However, one thing should not remain "magic" to the reader: In the
above paragraphs, I pointed out that the representation of Win32
objects like named pipes is complex (e.g. it includes buffers,
OVERLAPPED structs, and Win32 event objects), and that an
opaque type like Netsys_win32.w32_pipe needs to hide the
details of the representation from the user. Also, I mentioned that
using the Unix.file_descr of the named pipe handle would
break the abstraction, and that the handle is made unavailable to user
code for this reason. However, pollsets nevertheless use file
descriptors for passing system objects around. How does this fit
together?
Ocamlnet does not give up on Unix.file_descr as the
central type for referencing system objects - switching to a different
type for this purpose would break tons of user code. Instead, a tricky
mechanism has been added allowing us to keep
Unix.file_descr but also to attach further management
objects to such descriptors. This is explained in detail below. The
crucial idea is that Ocamlnet introduces artificial descriptors that
are only used for identifying system objects but that cannot be used
for actually performing I/O. So the descriptor handed out to user
code for a named pipe is not the Win32 handle for the named pipe
(which would allow doing I/O and breaking abstractions), but it is
an additionally allocated handle that only exists for the purpose
of identifying the system object. This handle, now called proxy
descriptor, is the value passed to pollsets and other interfaces
assuming Unix.file_descr as the type for referencing
system objects.
For sockets everything is very easy. As mentioned, the pollset
implementation is based upon WSAWaitForMultipleEvents(),
which is actually a Winsock function. It supports sockets directly -
no tricky emulation layers are required.
Win32 distinguishes anonymous pipes as returned by
Unix.pipe from named pipes. Anonymous pipes do not
support overlapped I/O. As this kind of pipe is important for
starting subprocesses, Ocamlnet nevertheless tries to provide an
asynchronous API for them. Because only synchronous I/O is possible,
helper threads need to be created which implement buffers in much the
same way as it is done for overlapped I/O: The helper threads pump
data from the buffer to the pipe, or from the pipe to the buffer
(depending on the direction of I/O). The user code only accesses the
buffer in a non-blocking way, and Win32 event objects are used to
signal the state of the buffer (empty or full). The resulting API
looks very much like the API for named pipes, and it is also required
that the special read and write functions of
the API are called by user code instead of Unix.read and
Unix.write, and there are also proxy descriptors. As the
implementation is done by helper threads, there is the difficulty of how
to stop these threads when there is no more interest in watching the
descriptors. Unfortunately, this is not possible in the general case -
when the pipe "hangs" the thread will also hang, and there is no means
to interrupt it (there are no signals (software interrupts) in Win32,
and thread cancellation is a hot issue). As anonymous pipes are mostly
used for driving external processes this seems to be acceptable (there
is always the fallback solution to kill the process).
The Win32 consoles are supported in the same way as anonymous pipes.
Even processes can be waited for. Although there is no direct
data flow (neither read nor write make sense
in any way), processes are referenced by means of file handles. When
the handle is set to signaled state, this means that the process has
terminated. So process handles can be added to pollsets, and this
makes it easy to wait for the termination of a subprocess in parallel
to managing the I/O over the pipes that are connected with the
process.
For other types of file handles there is no good support yet (unless one creates the mentioned helper threads). Of course, adding support would be easy for all handles where Win32 allows overlapped I/O. However, this does not seem to be urgent.
Unix.file_descr
while having complex management objects for controlling asynchronous
I/O. For example, one can get a proxy descriptor for a named pipe by
calling:

    val pipe_descr : w32_pipe -> Unix.file_descr

The returned descriptor cannot be used for anything except for looking up the attached named pipe:

    val lookup_pipe : Unix.file_descr -> w32_pipe

(There is also a slightly more general lookup function that can be used for any type of Win32 object using proxy descriptors.)
The proxy descriptors are backed by real file handles (otherwise it
could happen that the next open() returns the same
handle, and the proxy descriptor would no longer be identifiable as
such), but a cheap kind of handle was chosen to avoid too much
resource consumption. There is a hidden global table that maps proxy
descriptors to the referenced complex objects, and by GC trickery it
is ensured that the table shrinks when proxy descriptors are freed by
GC runs (note that Unix.file_descr is a heap-allocated
value for Win32, so we can add finalisers).
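The table-based lookup can be modelled in a few lines. This is only an illustration of the idea: the names toy_pipe, proxy, toy_pipe_descr, toy_lookup_pipe and toy_close are hypothetical, an integer counter stands in for the cheap real handle, and the entry is removed by an explicit close instead of the finaliser-based cleanup described above.

```ocaml
(* Toy model of proxy descriptors. Hypothetical names; an int counter
   replaces the real handle, and cleanup is explicit instead of GC-driven. *)

type toy_pipe = { name : string }     (* stands in for w32_pipe *)
type proxy = Proxy of int             (* stands in for Unix.file_descr *)

(* The hidden global table: proxy id -> management object. *)
let table : (int, toy_pipe) Hashtbl.t = Hashtbl.create 16
let next_id = ref 1000

(* Allocate a fresh proxy for a pipe and register it. *)
let toy_pipe_descr (p : toy_pipe) : proxy =
  let id = !next_id in
  incr next_id;
  Hashtbl.replace table id p;
  Proxy id

(* The only thing a proxy is good for: finding the object again. *)
let toy_lookup_pipe (Proxy id) : toy_pipe =
  match Hashtbl.find_opt table id with
  | Some p -> p
  | None -> invalid_arg "toy_lookup_pipe: not a proxy descriptor"

(* Closing the proxy removes the table entry. *)
let toy_close (Proxy id) = Hashtbl.remove table id
```

Because every proxy gets a fresh id, a later allocation can never be mistaken for an existing proxy, which is the same property the real implementation obtains by backing proxies with real handles.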
Of course, user code has to close the proxy descriptors when they are no longer needed (but only when they were actually requested). This means they have the same "lifetime" as normal file descriptors which also need to be closed after use.
connect where a generic approach is hard to get right,
but is totally impractical for simple reading and writing. It would be
required to call different functions that have very similar
signatures, e.g. (read case) pipe_read for named pipes,
input_thread_read for objects managed by helper threads,
and of course the well-known Unix.recv for sockets and
Unix.read for normal files.
In the Netsys module a simple generic approach of handling reads and
writes is available. There is a function inspecting the kind of file
descriptor, and a set of generic functions for actually performing
read/write:

    type fd_style
      (* indicates the kind of descriptor (details omitted here) *)

    val get_fd_style : Unix.file_descr -> fd_style
      (* get the file descriptor style *)

    val gread : fd_style -> Unix.file_descr -> string -> int -> int -> int
      (* generic read: call the right implementation function depending
         on the fd style *)

    val blocking_gread : fd_style -> Unix.file_descr -> string -> int -> int -> int
      (* similar to gread, but it is blocked until at least one byte
         can be read *)

    val really_gread : fd_style -> Unix.file_descr -> string -> int -> int -> unit
      (* similar to gread, but it is blocked until exactly the passed
         number of bytes are read *)

    (* similar functions are available for writing, for shutting down,
       and for closing *)

If the descriptors are proxy descriptors, these functions automatically
look up the underlying complex management object and invoke the right
I/O function. If the descriptors are sockets, they call socket
functions like Unix.recv. Otherwise,
they fall back to Unix.read or Unix.write.
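The dispatch can be pictured with a toy version in which the style is stored inside the descriptor and all three backends read from an in-memory string. This is a sketch only: toy_style, toy_fd, the three toy backends, and the log used to record which backend ran are assumptions made for the illustration, not the Netsys definitions.

```ocaml
(* Toy model of style-based read dispatch. Hypothetical names; every
   backend reads from an in-memory string instead of doing real I/O. *)

type toy_style = Style_pipe | Style_socket | Style_file
type toy_fd = { style : toy_style; mutable pending : string }

(* Shared helper: move up to len pending bytes into dst at pos. *)
let take fd dst pos len =
  let n = min len (String.length fd.pending) in
  Bytes.blit_string fd.pending 0 dst pos n;
  fd.pending <- String.sub fd.pending n (String.length fd.pending - n);
  n

(* Records which backend was chosen (most recent first). *)
let log : string list ref = ref []

let toy_pipe_read fd dst pos len = log := "pipe_read" :: !log; take fd dst pos len
let toy_recv fd dst pos len = log := "recv" :: !log; take fd dst pos len
let toy_read fd dst pos len = log := "read" :: !log; take fd dst pos len

let toy_get_fd_style fd = fd.style

(* Generic read: pick the backend according to the style. *)
let toy_gread style fd dst pos len =
  match style with
  | Style_pipe -> toy_pipe_read fd dst pos len
  | Style_socket -> toy_recv fd dst pos len
  | Style_file -> toy_read fd dst pos len

(* Read exactly len bytes, or raise End_of_file. *)
let rec toy_really_gread style fd dst pos len =
  if len > 0 then begin
    let n = toy_gread style fd dst pos len in
    if n = 0 then raise End_of_file;
    toy_really_gread style fd dst (pos + n) (len - n)
  end
```

Passing the style as a separate argument, as the signatures above do, means the inspection of the descriptor happens once and not on every read.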
Large parts of Ocamlnet have been ported so they use this generic
layer instead of directly calling Unix.read or
Unix.write. For example, the class
Netchannels.input_descr wraps a netchannel object around
a file descriptor, and it has been changed so it can now also deal
with all kinds of descriptors supported by gread.
Http_client. The question is what can be supported on Win32.
Fortunately, the answer is - thanks to dealing with these details carefully - that almost everything works! Although not every module has been fully tested yet, the difficult modules could be ported, and there is now the conviction that the simpler ones are not in any way problematic.
The most difficult case was Netplex. Of course, there is no way
to support multi-processing as there is no fork()
equivalent in Win32. However, multi-threading works well. The
socketpairs connecting the containers with the controller have been
replaced by pairs of connected named pipes. For Unix Domain sockets
there is the possibility of using either named pipes, or Internet
sockets bound to localhost.
As Netplex uses SunRPC as base library, it was of course also possible to port this Ocamlnet feature. SunRPC can be used not only on sockets, but also on named pipes.
Another difficult beast was the Shell library for starting and managing external processes. It is now as easy to create complex pipelines of interconnected subprocesses for Win32 as it used to be for POSIX.
The Nethttpd web server library could also be verified to be working, even in conjunction with Netplex.
svn command on this URL, or click on it to view it with your web
browser - most of the discussed code lives in src/netsys).
The Win32 port of Ocamlnet requires the MinGW port of Ocaml. Also, the same set of base libraries is needed as for POSIX, especially PCRE. The simplest way to install that is to use GODI which also supports MinGW.