Ctypes

From the OCaml Labs wiki

Overview

OCaml-ctypes is a library for binding to C libraries from OCaml without writing or generating C code. Removing the need to write C "stub" functions reduces the likelihood of error and makes writing C extensions more straightforward. Ctypes lets you define the C interface in pure OCaml; the library then takes care of loading the C symbols and invoking the foreign function call.

Why Ctypes?

Programs frequently need to call functions written in another language, most commonly C. Although conceptually straightforward, writing bindings presents many opportunities to introduce subtle errors. The programmer must convert the arguments of the bound function from OCaml values to C values, pass them to the foreign function, and convert the result back to an OCaml value. Using an external tool for this process can harm program cohesiveness: there is no connection between the types used within the tool and the types of the resulting code, and tools introduce types and values into a program that are not apparent in its source code. Ctypes eliminates the boilerplate needed to bind foreign functions, not by generating it with an external tool, but by using the abstraction mechanisms of the language to parameterise over the common type structure. The result is type-safe, flexible, and tightly integrated into the host language.

How?

The central idea behind ctypes (writing bindings to C in OCaml rather than in C) has been used successfully in other languages. For example, James Bielman’s Common Foreign Function Interface for Common Lisp, Matthias Blume’s No-Longer-Foreign Function Interface for SML and Thomas Heller’s ctypes library for Python (from which this OCaml library draws both its name and its inspiration) are all based on this approach.

Our foreign function interface design elaborates on these principles by separating the description of the C foreign functions from how we link to that code. The core of ctypes is a set of combinators for describing the structure of C types: numeric types, arrays, pointers, structs, unions and functions. You can use these combinators to describe the types of the functions that you want to call, then bind directly to those functions, all without writing or generating any C. Ctypes makes passing OCaml functions to C as straightforward as passing first-order values by representing the foreign bindings as first-class module packages. This method, together with a staged, generic programming approach, aims to simplify library bindings and make the process more reliable and flexible. OCaml functors and GADTs are integral to the success of Ctypes.
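To make the idea of type-describing combinators concrete, here is a minimal sketch that uses only the standard library. It is a toy model, not the ctypes source: the typ GADT, the ptr wrapper and the describe and zero functions are all names assumed for illustration. It shows how a GADT index can tie each description of a C type to the OCaml type of the values it stands for.

```ocaml
(* Toy model only: these definitions are illustrative, not the ctypes source. *)

(* A phantom-typed address: the parameter records what is pointed to. *)
type 'a ptr = Ptr of nativeint

(* Each constructor describes a C type; the index is the matching OCaml type. *)
type _ typ =
  | Void : unit typ
  | Int : int typ
  | Char : char typ
  | Pointer : 'a typ -> 'a ptr typ
  | Arr : 'a typ * int -> 'a array typ

(* Combinators: small descriptions composed into larger ones. *)
let void = Void
let int = Int
let char = Char
let ptr t = Pointer t
let array n t = Arr (t, n)

(* One interpretation: describe the C type in words. *)
let rec describe : type a. a typ -> string = function
  | Void -> "void"
  | Int -> "int"
  | Char -> "char"
  | Pointer t -> "pointer to " ^ describe t
  | Arr (t, n) -> Printf.sprintf "array of %d %s" n (describe t)

(* Another interpretation: build a zeroed OCaml value of the indexed type. *)
let rec zero : type a. a typ -> a = function
  | Void -> ()
  | Int -> 0
  | Char -> '\000'
  | Pointer _ -> Ptr 0n
  | Arr (t, n) -> Array.make n (zero t)

let () =
  print_endline (describe (ptr (ptr char)));
  print_endline (describe (array 4 (ptr int)));
  (* The static type of xs, int array, follows from the description. *)
  let xs : int array = zero (array 3 int) in
  Printf.printf "%d elements\n" (Array.length xs)
```

The same description value feeds both interpretations, which is the property that lets a library of this kind reuse one type description for several purposes.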

Binding a C function using ctypes involves two steps:

  • First, construct an OCaml value that represents the type of the function
  • Second, use the type representation and the function name to resolve and bind the function

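For example, here's a binding to C's puts function, which prints a string followed by a newline to standard output and returns a non-negative integer on success (some implementations, including glibc, return the number of characters written):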

let puts = foreign "puts" (string @-> returning int)

After the call to foreign the bound function is available to OCaml immediately. Here's a call to puts from the interactive top level:

# puts "Hello, world";;
Hello, world
- : int = 13

Static and dynamic binding strategies

Although foreign is simple to use, there's quite a lot going on behind the scenes. The two arguments to foreign are used to dynamically construct an OCaml function value that wraps the C function: the name is used to resolve the code for the C function, and the type representation is used to construct a call frame appropriate to the C types involved and to the underlying platform. The dynamic nature of foreign, which makes it convenient for interactive use, also makes it unsuitable for some environments. There are three main drawbacks:

  • Binding functions dynamically involves a certain loss of safety: since C libraries typically don't maintain information about the types of the functions they contain, there's no way to check whether the type representation passed to foreign matches the actual type of the C function.
  • Dynamically constructing calls introduces a certain interpretative overhead. In mitigation, this overhead is much less than might be supposed, since much of the work can be done when the function is bound rather than when the call is made, and foreign has been used to bind C functions in performance-sensitive applications without problems.
  • The implementation of foreign uses a low-level library, libffi, to deal with calling conventions across platforms. While libffi is mature and widely supported, it's not appropriate for use in every environment. For example, introducing such a (relatively) large and complex library into Mirage would compromise many of the benefits of writing the rest of the system in OCaml.

Fear not! The foreign function is only one of a number of binding strategies, and OCaml's module system makes it easy to defer the choice of strategy when writing the bindings. For example, placing bindings for the expat XML parsing library in a functor (parameterised module) makes it possible to abstract over the linking strategy:

module Bindings(F : FOREIGN) =
struct
 let parser_create = F.foreign "XML_ParserCreate"
   (ptr void @-> returning xml_Parser)
 let parser_free = F.foreign "XML_ParserFree"
   (xml_Parser @-> returning void)
 let set_element_handler = F.foreign "XML_SetElementHandler"
   (xml_Parser @-> start_handler @-> end_handler @-> returning void)
 let parse = F.foreign "XML_Parse"
   (xml_Parser @-> string @-> int @-> int @-> returning int)
end

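Applying the functor to a stub-generating implementation of FOREIGN, rather than to the dynamic one, produces C and OCaml code that is compiled and linked statically. Generating code in this way eliminates the concerns associated with constructing calls dynamically: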

  • The C compiler checks the types of the generated calls against the C headers (the API), so the safety concerns associated with linking directly against the C library binaries (the ABI) don't apply
  • There's no interpretative overhead, since the generated code is (statically) compiled
  • The dependency on libffi disappears altogether
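The functor pattern described above can be sketched without ctypes at all. In the following toy model, the FOREIGN signature, the two strategies (Prototype and Placeholder) and the bound functions are simplified assumptions made for illustration; the real ctypes interfaces are richer. The point is only that a single Bindings functor yields different results depending on the strategy it is applied to.

```ocaml
(* Toy model only: names mirror the text, but this is not the ctypes API. *)
type _ typ = TInt : int typ | TString : string typ

type _ fn =
  | Returns : 'a typ -> 'a fn
  | Function : 'a typ * 'b fn -> ('a -> 'b) fn

let int = TInt
let string = TString
let returning t = Returns t
let ( @-> ) a b = Function (a, b)

(* A linking strategy decides what binding a name at a type produces. *)
module type FOREIGN = sig
  type 'a result
  val foreign : string -> ('a -> 'b) fn -> ('a -> 'b) result
end

(* The bindings are written once, against an abstract strategy. *)
module Bindings (F : FOREIGN) = struct
  let puts = F.foreign "puts" (string @-> returning int)
  let abs = F.foreign "abs" (int @-> returning int)
end

(* Strategy 1: generate text (here a C prototype) instead of a callable. *)
module Prototype : FOREIGN with type 'a result = string = struct
  type 'a result = string

  let c_type : type a. a typ -> string = function
    | TInt -> "int"
    | TString -> "const char *"

  let rec split : type a. a fn -> string list * string = function
    | Returns t -> ([], c_type t)
    | Function (arg, rest) ->
      let args, ret = split rest in
      (c_type arg :: args, ret)

  let foreign name fn =
    let args, ret = split fn in
    Printf.sprintf "%s %s(%s);" ret name (String.concat ", " args)
end

(* Strategy 2: produce a callable OCaml function. A real dynamic strategy
   would call into C; this placeholder only returns a default value. *)
module Placeholder : FOREIGN with type 'a result = 'a = struct
  type 'a result = 'a

  let default : type a. a typ -> a = function
    | TInt -> 0
    | TString -> ""

  let rec build : type a. a fn -> a = function
    | Returns t -> default t
    | Function (_, rest) -> fun _ -> build rest

  let foreign _name fn = build fn
end

module P = Bindings (Prototype)
module D = Bindings (Placeholder)

let () =
  print_endline P.puts;
  print_endline P.abs;
  (* Placeholder.abs ignores its argument and returns the default int. *)
  Printf.printf "%d\n" (D.abs (-5))
```

Because the result type is abstract in FOREIGN, the bindings module never needs to know whether it is producing generated text or callable functions.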

Switching between dynamic and static binding strategies is quite straightforward, even for code that was originally written without parameterisation. Bindings written using early releases of ctypes used the dynamic strategy exclusively, since dynamic binding was then the only option available. The commit logs for projects that switched over to static generation and linking (e.g. ocaml-lz4 and async-ssl) when it became available show that moving to the new approach involved only straightforward and localised changes.

Cstubs_structs Module

Ctypes is generally used to specify how to call C code using a DSL that is executed at runtime. This works well for functions but has some limitations when it comes to data types and macros. Cstubs_structs uses C itself to generate the ML definition of a struct, which Ctypes can then use at runtime. Because C is being used to generate the definition, it also gives access to other constructs that only exist at compile time, such as macros. The process has six steps:

  1. Write a stubs module that is a functor which defines the bindings.
  2. Write a module that uses the bindings module and outputs a C file.
  3. Compile the program from step 2 and execute it.
  4. Compile the C program generated in step 3.
  5. Run the C program from step 4, generating an ML module.
  6. Compile the module generated in step 5.

The functor created in step 1 can then be applied to the module generated in step 5.
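As a rough sketch of steps 1 and 2, the following program assumes that the ctypes package is installed (findlib name ctypes.stubs) and that the C headers time.h and limits.h are available. The module name Stubs, the output file name bindings_stubs_gen.c and the choice of struct tm and SHRT_MAX are hypothetical, chosen only for illustration.

```ocaml
(* Assumes the ctypes package (findlib name ctypes.stubs) is installed.
   Module and file names below are hypothetical. *)

(* Step 1: the bindings, as a functor over the struct-description interface. *)
module Stubs (S : Cstubs_structs.TYPE) = struct
  module Tm = struct
    type tm
    type t = tm Ctypes.structure
    let t : t S.typ = S.structure "tm"
    (* Field offsets and the struct size are determined by the C compiler. *)
    let tm_hour = S.(field t "tm_hour" int)
    let tm_year = S.(field t "tm_year" int)
    let () = S.seal t
  end

  (* A macro, whose value is visible only to the C compiler. *)
  let shrt_max = S.(constant "SHRT_MAX" short)
end

(* Step 2: write out the C program that will print the ML module. *)
let () =
  let oc = open_out "bindings_stubs_gen.c" in
  let fmt = Format.formatter_of_out_channel oc in
  (* Headers that declare struct tm and SHRT_MAX. *)
  Format.fprintf fmt "#include <time.h>@\n#include <limits.h>@\n";
  Cstubs_structs.write_c fmt (module Stubs);
  Format.pp_print_flush fmt ();
  close_out oc
```

Steps 3 to 6 are build steps rather than code: run this program, compile the C file it writes with the ctypes headers on the include path, run the resulting executable to print the ML module, and compile that module.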

Current Projects

  • Enguerrand Decorne is spending his OCL internship using Ctypes to bind OCaml TLS. He has been getting to grips with the internals of ocaml-tls and ocaml-ctypes, working on inverted C stubs, and has a long-term goal of replacing SSL in OpenBSD with OCaml TLS.

Related Pages

Repositories

Articles

Talks

Releases

0.7.0 - July 2016

Features

Bug fixes

0.6.2 - June 2016

Bug fixes

0.6.1 - June 2016

Bug fixes

0.6.0 - June 2016

Features

  • The Cstubs.FOREIGN interface has been extended with returning and @->, and some new types. See the pull request for details: https://github.com/ocamllabs/ocaml-ctypes/pull/389. NB: code that generates bindings using Cstubs may need to be updated to select returning and @-> from the bindings functor argument rather than from the Ctypes module. Code that needs to be updated will fail to compile with the new interface. The pull request shows how to update your code, if necessary.
  • The Cstubs module can now generate asynchronous bindings to C functions using the Lwt jobs framework. See the pull request for details: https://github.com/ocamllabs/ocaml-ctypes/pull/391
  • The Cstubs module now supports optionally returning errno alongside the return value of bound C functions. See the pull request for details: https://github.com/ocamllabs/ocaml-ctypes/pull/392
  • Cross-compilation support is improved: the configuration step no longer runs binaries on the host. See the pull request for details: https://github.com/ocamllabs/ocaml-ctypes/pull/383
  • The Unsigned.S interface has new of_int64 and to_int64 functions.

Compatibility

  • The deprecated *:* and +:+ functions have been removed. Use Ctypes.field instead.
  • OCaml 4.00.* is no longer supported. The earliest supported OCaml release is 4.01.0

0.5.0 - March 2016

  • Build and install *.cmt and *.cmti files
  • Expose time_t as an unsigned value
  • Expose larger interfaces for POSIX types known to be integer types
  • Add support for 1- and 2-byte unsigned integer typedefs
  • Add support for 1-byte and 2-byte integer typedefs
  • Add a Signed.Int module
  • Expose more information in the Uncoercible exception
  • allocate_n now defaults to zeroing its memory
  • Add public root management interface. NB: the interface is experimental and subject to change
  • Look through views to add fields to structs and unions
  • Support signed arithmetic operations for ssize_t
  • Add support for ptrdiff_t as a typedef for a signed integer type
  • Support intptr_t and uintptr_t as typedefs
  • Support coercions between object and function pointers
  • Add public funptr_of_raw_address function
  • Support static_funptr coercions
  • Add function pointers to the core type language. (See the Ctypes_static.static_funptr type, on which Foreign.funptr and Foreign.foreign are now based.)
  • Better support for functions returning void with inverted stubs
  • Add support for releasing runtime lock to Cstubs_inverted

Bug fixes

  • Fix: inconsistent use of caml_stat_* functions
  • Fix: a memory leak in ctypes_caml_roots_release

0.4.2 - February 2016

  • Fix a bug involving access to local roots while the runtime lock was not held

0.4.1 - April 2015

  • Fix placement of docstring titles
  • Add funptr's optional arguments to funptr_opt
  • Fix a typo in libffi detection code
  • Synchronize foreign.mli files (documentation)

0.4.0 - March 2015

  • Windows support (A. Hauptmann)
  • Xen support (Thomas Leonard)
  • Add the C99 bool type (Ramkumar Ramachandra)
  • Typedef support (Edwin Török)
  • Enum types.
  • Accessing C globals with foreign_value in generated stubs
  • Retrieving #define and enum constants from C
  • Releasing the runtime lock in C calls
  • Acquiring the runtime lock in callbacks
  • Passing 'bytes' values directly to C (Peter Zotov)
  • Custom printers in views (Thomas Braibant)
  • Optionally obtain struct and union layout from C
  • string_opt wraps char *, not void *.
  • Remove some poorly-supported POSIX types. Several of the types in the PosixTypes module are no longer available
  • Use nativeint to represent pointers
  • Support zero-argument callbacks
  • A function for retrieving field names (Thomas Braibant)
  • Better exception handling when using RTLD_NOLOAD (A. Hauptmann)
  • RTLD_LOCAL support for Dl.dlopen
  • Changed the #include path to $(ocamlfind query ctypes)
  • Renamed some internal modules to avoid name clashes

There are now two OPAM packages, ctypes and ctypes-foreign. Only the latter depends on libffi, so if your package uses only the ctypes stub generation backend then users of your library need not install libffi. If you use the dynamic (Foreign) backend then you should depend on both packages.

0.3.4 - December 2014

  • Fix printing for nullary function stubs

0.3.3 - November 2014

  • Respect pbyte_offset with cstubs

0.3.2 - 2014

  • Add bytes to the META "requires" field

0.3.1 - 2014

  • Support for 'bytes'
  • Avoid invalidated pointer access

0.3.0 - 2014

  • Support for passing OCaml strings directly to C (Peter Zotov)
  • Support for generating C stubs from names and type declarations
  • Support for turning OCaml modules into C libraries
  • Add a function string_from_ptr for creating a string from an address and length
  • Generate codes for libffi ABI specifications
  • Add raw_address_of_ptr, an inverse of ptr_of_raw_address
  • Add a function typ_of_bigarray_kind for converting Bigarray.kind values to Ctypes.typ values
  • Improved coercion support
  • Array has been renamed to CArray

0.2.3 - February 2014

  • Fix GC-related bug that shows up on OSX

0.2.2 - 2014

  • Don't install ctypes-foreign cmx files

0.2.1 - 2013

  • Bump META version

0.2.0 - 2013

  • Bigarray support
  • Give the user control over the lifetime of closures passed to C
  • Top level printing for C values and types
  • Basic coercion support
  • Remove returning_checking_errno; pass a flag to foreign instead
  • Add an optional argument to Foreign.foreign that ignores absent symbols (Daniel Bünzli)
  • More precise tests for whether types are 'passable'
  • Compulsory names for structure and union fields (*:* and +:+ are deprecated but still supported for now)
  • UInt32.of_int32, UInt32.to_int32, UInt64.of_int64 and UInt64.to_int64 functions
  • Finalisers for ctypes-allocated memory
  • Add a string_opt view (Rob Hoes)
  • Add the 'camlint' basic type
  • Complex number support
  • Abstract types now have names

0.1.1 - August 2013

  • Remove hard-coded alloc size

0.1.0 - 2013

  • Initial release