Berkeley DB's Java API
$Id: README,v 11.2 2003/03/06 00:42:16 mjc Exp $

Berkeley DB's Java API is now generated with SWIG
(http://www.swig.org). This document describes how SWIG is used -
what we trust it to do, what things we needed to work around.


Overview
========

SWIG is a tool that generates wrappers around native (C/C++) APIs for
various languages (mainly scripting languages), including Java.

By default, SWIG creates an API in the target language that exactly
replicates the native API (for example, each pointer type in the API
is wrapped as a distinct type in the language). Although this
simplifies the wrapper layer (type translation is trivial), it usually
doesn't result in a natural API in the target language.

A further constraint for Berkeley DB's Java API was backwards
compatibility. The original hand-coded Java API is in widespread use,
and included many design decisions about how native types should be
represented in Java. As an example, callback functions are
represented by Java interfaces that applications using Berkeley DB
can implement. The SWIG implementation was required to maintain
backwards compatibility for those applications.


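As a rough illustration of the callback-as-interface design described
above, the sketch below shows an application class implementing a
comparison callback. All names here (KeyCompare, CallbackDemoDb,
LengthFirstCompare) are hypothetical stand-ins, not the real
com.sleepycat.db interfaces, whose names and signatures may differ.

```java
// Hypothetical stand-in for a Berkeley DB callback interface; the real
// interface names and signatures in com.sleepycat.db may differ.
interface KeyCompare {
    // Returns <0, 0 or >0, like the C comparison callback it replaces.
    int compare(byte[] a, byte[] b);
}

// Hypothetical stand-in for a handle class that accepts the callback.
class CallbackDemoDb {
    private KeyCompare compare;

    // The application registers an object instead of a function pointer.
    void setCompare(KeyCompare compare) {
        this.compare = compare;
    }

    // In the real API native code would trigger the callback; in this
    // sketch it is invoked directly from Java.
    int compareKeys(byte[] a, byte[] b) {
        if (compare == null) {
            throw new IllegalStateException("no comparison callback set");
        }
        return compare.compare(a, b);
    }
}

// Application code: order keys by length first, then by unsigned bytes.
class LengthFirstCompare implements KeyCompare {
    public int compare(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return a.length < b.length ? -1 : 1;
        }
        for (int i = 0; i < a.length; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return 0;
    }
}
```

Because applications already contain classes like LengthFirstCompare,
a generated API has to keep accepting them unchanged.

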
Running SWIG
============

The simplest use of SWIG is to run it with a C include file as
input. SWIG parses the file and generates wrapper code for the target
language. For Java, this includes a Java class for each C struct and
a C source file containing the Java Native Interface (JNI) function
calls for each native method.

The s_swig shell script in db/dist runs SWIG, and then post-processes
each Java source file with the sed commands in
libdb_java/java-post.sed. The Java sources are placed in
java/src/com/sleepycat/db, and the native wrapper code is in a single
file, libdb_java/db_java_wrap.c.

The post-processing step modifies code in ways that are difficult with
SWIG (given my current level of knowledge). This includes changing
some access modifiers to hide some of the implementation methods,
selectively adding "throws" clauses to methods, and adding calls to
"initialize" methods in Db and DbEnv after they are constructed (more
below on what these calls do).

In addition to the source code generated by SWIG, some of the Java
classes are written by hand, and constants and code to fill statistics
structures are generated by the script dist/s_java. The native
statistics code is in libdb_java/java_stat_auto.c, and is compiled
into the db_java_wrap object file with a #include directive. This
allows most functions in that object to be static, which encourages
compiler inlining and reduces the number of symbols we export.


The Implementation
==================

For the reasons mentioned above, Berkeley DB requires a more
sophisticated mapping between the native API and Java, so additional
SWIG directives are added to the input. In particular:

* The general intention is for db.i to contain the full DB API (just
  like db.h). As much as possible, this file is kept Java independent
  so that it can be updated easily when the API changes. SWIG doesn't
  have any built-in rules for how to handle function pointers in a
  struct, so each DB method must be added in a SWIG "%extend" block
  which includes the method signature and a call to the method.

* SWIG's automatically generated function names happen to collide
  with Sleepycat's naming convention. For example, in a SWIG class
  called __db, a method called "open" would result in a wrapper
  function called "__db_open", which already exists in DB. This is
  another reason why making these functions static is important.

* The main Java support starts in db_java.i. This file includes all
  Java code that is explicitly inserted into the generated classes,
  and is responsible for defining object lifecycles (handling
  allocation and cleanup).

* Methods that need to be wrapped for special handling in Java code
  are renamed with a trailing zero (e.g., close becomes close0).
  This is invisible to applications.

* Most DB classes that are wrapped have method calls that imply the
  cleanup of any native resources associated with the Java object
  (for example, Db.close or DbTxn.abort). These methods are wrapped
  so that if the object is accessed after the native part has been
  destroyed, an exception is thrown rather than a trap that crashes
  the JVM.

* Db and DbEnv initialization is more complex: a global reference is
  stored in the corresponding struct so that native code can
  efficiently map back to Java code. In addition, if a Db is
  created without an environment (i.e., in a private environment),
  the initialization wraps the internal DbEnv to simplify handling
  of various Db methods that just call the corresponding DbEnv
  method (like err, errx, etc.). It is important that the global
  references are cleaned up before the DB and DB_ENV handles are
  closed, so the Java objects can be garbage collected.

* DbLock and DbLsn have no such cleanup methods. For these classes,
  a finalize method does the appropriate cleanup. No other classes
  have finalize methods (in particular, the Dbt class is now
  implemented entirely in Java, so no finalization is necessary).

* Overall initialization code, including the System.loadLibrary call,
  is in java_util.i. This includes looking up all class, field and
  method handles once so that execution is not slowed down by repeated
  runtime type queries.

* Exception handling is in java_except.i. The main non-obvious design
  choice was to create a db_ret_t type for methods that return an
  error code as an int in the C API, but return void in the Java API
  (and throw exceptions on error).

* The only other odd case with exceptions is DbMemoryException. It
  is thrown as normal when a call returns ENOMEM, but there is
  special handling for the case where a Dbt with DB_DBT_USERMEM is
  not big enough to hold a result: in this case, the Dbt handling
  code calls the method update_dbt on the exception that is about to
  be thrown to register the failed Dbt in the exception.

* Statistics handling is in java_stat.i. This mainly just hooks into
  the automatically generated code in java_stat_auto.c.

* Callbacks: the general approach is that Db and DbEnv maintain
  references to the objects that handle each callback, and have a
  helper method for each call. This is primarily to simplify the
  native code, and performs better than more complex native code.

* One difference with the new approach is that the implementation is
  more careful about calling DeleteLocalRef on objects created for
  callbacks. This is particularly important for callbacks like
  bt_compare, which may be called repeatedly from native code.
  Without the DeleteLocalRef calls, the Java objects that are
  created cannot be collected until the original call returns.

* Most of the rest of the code is in java_typemaps.i. A typemap is a
  rule describing how a native type is mapped onto a Java type for
  parameters and return values. These handle most of the complexity
  of creating exactly the Java API we want.

* One of the main areas of complexity is Dbt handling. The approach
  taken is to accept whatever data is passed in by the application,
  pass that to native code, and reflect any changes to the native
  DBT back into the Java object. In other words, the Dbt typemaps
  don't replicate DB's rules about whether Dbts will be modified or
  not - they just pass the data through.

* As noted above, when a Dbt is "released" (i.e., no longer needed
  in native code), one of the checks is whether a DbMemoryException
  is pending, and if so, whether this Dbt might be the cause. In
  that case, the Dbt is added to the exception via the "update_dbt"
  method.

* Constant handling has been simplified by making DbConstants an
  interface. This allows the Db class to inherit the constants, and
  most can be inlined by javac.

* The danger with inlined constants arises if applications are
  compiled against one version of db.jar, but run against another.
  This danger existed previously, but was partly ameliorated by a
  separation of constants into "case" and "non-case" constants (the
  non-case constants were arranged so they could not be inlined).
  The only complete solution to this problem is for applications to
  check the version returned by DbEnv.get_version* versus the
  Db.DB_VERSION* constants.


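The version check recommended in the last item can be sketched as
follows. This is a minimal illustration only: DemoConstants,
DemoHandle, DemoRuntime and VersionCheck are hypothetical stand-ins
for DbConstants, Db and the DbEnv.get_version* methods, and the
version numbers are arbitrary placeholders.

```java
// Hypothetical stand-in for DbConstants. Fields of an interface are
// implicitly public static final, so javac may inline their values
// into any class compiled against them.
interface DemoConstants {
    int DB_VERSION_MAJOR = 4; // placeholder value
    int DB_VERSION_MINOR = 2; // placeholder value
}

// Implementing the interface lets callers write
// DemoHandle.DB_VERSION_MAJOR, as Db does with DbConstants.
class DemoHandle implements DemoConstants {
}

// Stand-in for the version reported by the library loaded at run time.
class DemoRuntime {
    static int getVersionMajor() { return 4; }
    static int getVersionMinor() { return 2; }
}

class VersionCheck {
    // True when the compiled-in version equals the run-time version.
    static boolean matches(int compiledMajor, int compiledMinor,
                           int runtimeMajor, int runtimeMinor) {
        return compiledMajor == runtimeMajor
            && compiledMinor == runtimeMinor;
    }

    // Fails fast if the application was built against another version.
    static void require() {
        if (!matches(DemoHandle.DB_VERSION_MAJOR,
                     DemoHandle.DB_VERSION_MINOR,
                     DemoRuntime.getVersionMajor(),
                     DemoRuntime.getVersionMinor())) {
            throw new IllegalStateException(
                "compiled and run-time library versions differ");
        }
    }
}
```

An application would call something like VersionCheck.require() once
at startup, before opening any handles.

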
Application-visible changes
===========================

* The new API is around 5x faster for many operations.

* Some internal methods and constructors that were previously public
  have been hidden or removed.

* A few methods that were inconsistent have been cleaned up (e.g.,
  Db.close now returns void; it previously returned an int that was
  always zero). The synchronized attribute has been toggled on some
  methods - this is an attempt to prevent multi-threaded applications
  from shooting themselves in the foot by calling close() or similar
  methods concurrently from multiple threads.
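
The combination of a synchronized close() and a renamed close0 method
can be sketched roughly as below. DemoEnv, its nativeHandle field and
the choice of IllegalStateException are assumptions made for
illustration; the generated classes may be structured differently and
may throw a different exception type.

```java
// Hypothetical stand-in for a generated handle class.
class DemoEnv {
    // Non-zero while the native handle is valid (stands in for a C
    // pointer held by the real wrapper).
    private long nativeHandle = 1L;

    // Counts calls to close0 so the sketch can be checked.
    int nativeCloseCalls = 0;

    // Stand-in for the renamed native method; in the real API this
    // would be a JNI call that frees the native resources.
    private void close0() {
        nativeCloseCalls++;
    }

    // Public wrapper: synchronized, so two threads calling close()
    // concurrently cannot both reach close0.
    public synchronized void close() {
        checkOpen();
        close0();
        nativeHandle = 0L;
    }

    // Any other method checks the handle first, so use after close
    // raises an exception instead of touching freed native memory.
    public synchronized void sync() {
        checkOpen();
    }

    private void checkOpen() {
        if (nativeHandle == 0L) {
            throw new IllegalStateException("handle has been closed");
        }
    }
}
```

With this shape, a second close() or any later call on the same object
fails with an exception, and the native cleanup runs only once.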