mirror of git://gcc.gnu.org/git/gcc.git
| <!DOCTYPE article PUBLIC "-//Davenport//DTD DocBook V3.0//EN">
 | |
| <article>
 | |
| <artheader>
 | |
| <title>The Cygnus Native Interface for C++/Java Integration</title>
 | |
| <subtitle>Writing native Java methods in natural C++</subtitle>
 | |
| <authorgroup>
 | |
| <corpauthor>Cygnus Solutions</corpauthor>
 | |
| </authorgroup>
 | |
| <date>March, 2000</date>
 | |
| </artheader>
 | |
| 
 | |
| <abstract><para>
 | |
| This documents CNI, the Cygnus Native Interface,
 | |
| which is a convenient way to write Java native methods using C++.
 | |
| This is a more efficient, more convenient, but less portable
 | |
| alternative to the standard JNI (Java Native Interface).</para>
 | |
| </abstract>
 | |
| 
 | |
| <sect1><title>Basic Concepts</title>
 | |
| <para>
 | |
| In terms of language features, Java is mostly a subset
 | |
| of C++.  Java has a few important extensions, plus a powerful standard
 | |
| class library, but on the whole that does not change the basic similarity.
 | |
| Java is a hybrid object-oriented language, with a few native types,
 | |
| in addition to class types.  It is class-based, where a class may have
 | |
| static as well as per-object fields, and static as well as instance methods.
 | |
| Non-static methods may be virtual, and may be overloaded.  Overloading is
 | |
| resolved at compile time by matching the actual argument types against
 | |
| the parameter types.  Virtual methods are implemented using indirect calls
 | |
| through a dispatch table (virtual function table).  Objects are
 | |
| allocated on the heap, and initialized using a constructor method.
 | |
| Classes are organized in a package hierarchy.
 | |
| </para>
 | |
| <para>
 | |
| All of the listed attributes are also true of C++, though C++ has
 | |
| extra features (for example in C++ objects may be allocated not just
 | |
| on the heap, but also statically or in a local stack frame).  Because
 | |
| <acronym>gcj</acronym> uses the same compiler technology as
 | |
| <acronym>g++</acronym> (the GNU C++ compiler), it is possible
 | |
| to make the intersection of the two languages use the same
 | |
| <acronym>ABI</acronym> (object representation and calling conventions).
 | |
| The key idea in <acronym>CNI</acronym> is that Java objects are C++ objects,
 | |
| and all Java classes are C++ classes (but not the other way around).
 | |
| So the most important task in integrating Java and C++ is to
 | |
| remove gratuitous incompatibilities.
 | |
| </para>
 | |
| <para>
 | |
| You write CNI code as a regular C++ source file.  (You do have to use
 | |
| a Java/CNI-aware C++ compiler, specifically a recent version of G++.)</para>
 | |
| <para>
 | |
| You start with:
 | |
| <programlisting>
 | |
| #include <gcj/cni.h>
 | |
| </programlisting></para>
 | |
| 
 | |
| <para>
 | |
| You then include header files for the various Java classes you need
 | |
| to use:
 | |
| <programlisting>
 | |
| #include <java/lang/Character.h>
 | |
| #include <java/util/Date.h>
 | |
| #include <java/lang/IndexOutOfBoundsException.h>
 | |
| </programlisting></para>
 | |
| 
 | |
| <para>
 | |
| In general, <acronym>CNI</acronym> functions and macros start with the
 | |
| `<literal>Jv</literal>' prefix, for example the function
 | |
| `<literal>JvNewObjectArray</literal>'.  This convention is used to
 | |
| avoid conflicts with other libraries.
 | |
| Internal functions in <acronym>CNI</acronym> start with the prefix
 | |
| `<literal>_Jv_</literal>'.  You should not call these;
 | |
| if you find a need to, let us know and we will try to come up with an
 | |
| alternate solution.  (This manual lists <literal>_Jv_AllocBytes</literal>
 | |
| as an example;  <acronym>CNI</acronym> should instead provide
 | |
| a <literal>JvAllocBytes</literal> function.)</para>
 | |
| <para>
 | |
| The header files for Java classes are automatically generated by <command>gcjh</command>.
 | |
| </para>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Packages</title>
 | |
| <para>
 | |
| The only global names in Java are class names and package names.
 | |
| A <firstterm>package</firstterm> can contain zero or more classes, and
 | |
| also zero or more sub-packages.
 | |
| Every class belongs to either an unnamed package or a package that
 | |
| has a hierarchical and globally unique name.
 | |
| </para>
 | |
| <para>
 | |
| A Java package is mapped to a C++ <firstterm>namespace</firstterm>.
 | |
| The Java class <literal>java.lang.String</literal>
 | |
| is in the package <literal>java.lang</literal>, which is a sub-package
 | |
| of <literal>java</literal>.  The C++ equivalent is the
 | |
| class <literal>java::lang::String</literal>,
 | |
| which is in the namespace <literal>java::lang</literal>,
 | |
| which is in the namespace <literal>java</literal>.
 | |
| </para>
 | |
| <para>
 | |
| Here is how you could express this:
 | |
| <programlisting>
 | |
| // Declare the class(es), possibly in a header file:
 | |
| namespace java {
 | |
|   namespace lang {
 | |
|     class Object;
 | |
|     class String;
 | |
|     ...
 | |
|   }
 | |
| }
 | |
| 
 | |
| class java::lang::String : public java::lang::Object
 | |
| {
 | |
|   ...
 | |
| };
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| The <literal>gcjh</literal> tool automatically generates the
 | |
| necessary namespace declarations.</para>
 | |
| 
 | |
| <sect2><title>Nested classes as a substitute for namespaces</title>
 | |
| <para>
 | |
| G++ gained complete namespace support only recently, and
 | |
| <literal>libgcj</literal> was changed to use namespaces
 | |
| at the end of February 1999.  Releases before then used
 | |
| nested classes, which are roughly the C++ equivalent of Java nested classes.
 | |
| They provide similar (though less convenient) functionality.
 | |
| The old syntax is:
 | |
| <programlisting>
 | |
| class java {
 | |
|   class lang {
 | |
|     class Object;
 | |
|     class String;
 | |
|   };
 | |
| };
 | |
| </programlisting>
 | |
| The obvious difference is the use of <literal>class</literal> instead
 | |
| of <literal>namespace</literal>.  The more important difference is
 | |
| that all the members of a nested class have to be declared inside
 | |
| the parent class definition, while namespaces can be defined in
 | |
| multiple places in the source.  Namespaces are therefore more convenient,
 | |
| since they correspond more closely to how Java packages are defined.
 | |
| The main difference is in the declarations; the syntax for
 | |
| using a nested class is the same as with namespaces:
 | |
| <programlisting>
 | |
| class java::lang::String : public java::lang::Object
 | |
| { ... };
 | |
| </programlisting>
 | |
| Note that the generated code (including name mangling)
 | |
| using nested classes is the same as that using namespaces.</para>
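<para>
The difference between the two encodings can be tried out in plain C++,
without any <literal>libgcj</literal> headers.  The following is a minimal
sketch: the namespace <literal>demo</literal>, its classes, and the function
<literal>namespace_demo</literal> are hypothetical names chosen for
illustration.  It shows a namespace being reopened to add a second class,
which a nested-class encoding would not permit.
</para>

```cpp
#include <cassert>
#include <string>

// Hypothetical package "demo.util", first declared in one place...
namespace demo {
  namespace util {
    class Counter {
    public:
      int value = 0;
      void bump() { ++value; }
    };
  }
}

// ...and reopened later, perhaps in another header, to add a class.
// With the nested-class encoding this second block would be an error,
// because "class demo" can only be defined once.
namespace demo {
  namespace util {
    class Label {
    public:
      std::string text;
    };
  }
}

// Usage: both classes are reached through the same qualified prefix.
inline int namespace_demo() {
  demo::util::Counter c;
  c.bump();
  c.bump();
  demo::util::Label l;
  l.text = "count";
  assert(l.text == "count");
  return c.value;
}
```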
 | |
| </sect2>
 | |
| 
 | |
| <sect2><title>Leaving out package names</title>
 | |
| <para>
 | |
| Always writing out the fully-qualified class name is verbose,
 | |
| and it makes it harder to move a class to a different package.
 | |
| The Java <literal>package</literal> declaration specifies that the
 | |
| following class declarations are in the named package, without having
 | |
| to explicitly name the full package qualifiers.
 | |
| The <literal>package</literal> declaration can be followed by zero or
 | |
| more <literal>import</literal> declarations, which allow either
 | |
| a single class or all the classes in a package to be named by a simple
 | |
| identifier.  C++ provides something similar
 | |
| with the <literal>using</literal> declaration and directive.
 | |
| </para>
 | |
| <para>
 | |
| A Java simple-type-import declaration:
 | |
| <programlisting>
 | |
| import <replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable>;
 | |
| </programlisting>
 | |
| allows using <replaceable>TypeName</replaceable> as a shorthand for
 | |
| <literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
 | |
| The C++ (more-or-less) equivalent is a <literal>using</literal>-declaration:
 | |
| <programlisting>
 | |
| using <replaceable>PackageName</replaceable>::<replaceable>TypeName</replaceable>;
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| A Java import-on-demand declaration:
 | |
| <programlisting>
 | |
| import <replaceable>PackageName</replaceable>.*;
 | |
| </programlisting>
 | |
| allows using any <replaceable>TypeName</replaceable> in the package as a shorthand for
 | |
| <literal><replaceable>PackageName</replaceable>.<replaceable>TypeName</replaceable></literal>.
 | |
| The C++ (more-or-less) equivalent is a <literal>using</literal>-directive:
 | |
| <programlisting>
 | |
| using namespace <replaceable>PackageName</replaceable>;
 | |
| </programlisting>
 | |
| </para>
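<para>
The two forms can be compared side by side in ordinary C++.  This is a
small sketch only; the namespace <literal>shapes</literal> and everything
in it are hypothetical stand-ins for a Java package and its classes.
</para>

```cpp
#include <cassert>

// Hypothetical package "shapes" with two classes.
namespace shapes {
  struct Square { int side; int area() const { return side * side; } };
  struct Rect   { int w, h; int area() const { return w * h; } };
}

// Like "import shapes.Square;": only Square becomes a simple name here.
inline int square_area(int side) {
  using shapes::Square;
  Square s{side};
  return s.area();
}

// Like "import shapes.*;": every name in shapes becomes visible here.
inline int rect_area(int w, int h) {
  using namespace shapes;
  Rect r{w, h};
  return r.area();
}

// Usage: both helpers name their class without the shapes:: prefix.
inline int using_demo() {
  assert(square_area(3) == 9);
  return square_area(3) + rect_area(2, 5);
}
```

<para>
Placing the <literal>using</literal> forms inside a function keeps the
shorthand local, which is usually preferable in header files.
</para>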
 | |
| </sect2>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Primitive types</title>
 | |
| <para>
 | |
| Java provides eight <quote>primitive</quote> types:
 | |
| <literal>byte</literal>, <literal>short</literal>, <literal>int</literal>,
 | |
| <literal>long</literal>, <literal>float</literal>, <literal>double</literal>,
 | |
| <literal>char</literal>, and <literal>boolean</literal>.
 | |
| These are the same as the following C++ <literal>typedef</literal>s
 | |
| (which are defined by <literal>gcj/cni.h</literal>):
 | |
| <literal>jbyte</literal>, <literal>jshort</literal>, <literal>jint</literal>,
 | |
| <literal>jlong</literal>, <literal>jfloat</literal>,
 | |
| <literal>jdouble</literal>,
 | |
| <literal>jchar</literal>, and <literal>jboolean</literal>.
 | |
| You should use the C++ typenames
 | |
| (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>jint</literal>),
 | |
| and not the Java type names
 | |
| (<ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase> <literal>int</literal>),
 | |
| even if they are <quote>the same</quote>.
 | |
| This is because there is no guarantee that the C++ type
 | |
| <literal>int</literal> is a 32-bit type, but <literal>jint</literal>
 | |
| <emphasis>is</emphasis> guaranteed to be a 32-bit type.
 | |
| 
 | |
| <informaltable frame="all" colsep="1" rowsep="0">
 | |
| <tgroup cols="3">
 | |
| <thead>
 | |
| <row>
 | |
| <entry>Java type</entry>
 | |
| <entry>C/C++ typename</entry>
 | |
| <entry>Description</entry>
 | |
| </row>
 | |
| </thead>
 | |
| <tbody>
 | |
| <row>
 | |
| <entry>byte</entry>
 | |
| <entry>jbyte</entry>
 | |
| <entry>8-bit signed integer</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>short</entry>
 | |
| <entry>jshort</entry>
 | |
| <entry>16-bit signed integer</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>int</entry>
 | |
| <entry>jint</entry>
 | |
| <entry>32-bit signed integer</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>long</entry>
 | |
| <entry>jlong</entry>
 | |
| <entry>64-bit signed integer</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>float</entry>
 | |
| <entry>jfloat</entry>
 | |
| <entry>32-bit IEEE floating-point number</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>double</entry>
 | |
| <entry>jdouble</entry>
 | |
| <entry>64-bit IEEE floating-point number</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>char</entry>
 | |
| <entry>jchar</entry>
 | |
| <entry>16-bit Unicode character</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>boolean</entry>
 | |
| <entry>jboolean</entry>
 | |
| <entry>logical (Boolean) values</entry>
 | |
| </row>
 | |
| <row>
 | |
| <entry>void</entry>
 | |
| <entry>void</entry>
 | |
| <entry>no value</entry>
 | |
| </row>
 | |
| </tbody></tgroup>
 | |
| </informaltable>
 | |
| </para>
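<para>
The point about guaranteed widths can be checked at compile time.  The
sketch below does not include <literal>gcj/cni.h</literal>; the
<literal>model_</literal> typedefs are assumptions that stand in for the
real ones, built on <literal>&lt;cstdint&gt;</literal> purely for illustration.
</para>

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Stand-in typedefs for illustration only.  The real ones come from
// gcj/cni.h and need not be spelled in terms of <cstdint>.
typedef std::int8_t   model_jbyte;
typedef std::int16_t  model_jshort;
typedef std::int32_t  model_jint;
typedef std::int64_t  model_jlong;
typedef std::uint16_t model_jchar;   // Java char is an unsigned 16-bit unit

// Width of a type in bits.
template <class T>
constexpr int bit_width_of() {
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// These hold wherever the typedefs above exist.  No such promise is
// made for plain int, which is only required to be at least 16 bits.
static_assert(bit_width_of<model_jint>() == 32, "jint model is 32 bits");
static_assert(bit_width_of<model_jlong>() == 64, "jlong model is 64 bits");

// Usage: a value that needs all 32 bits fits in the model type.
inline model_jint largest_jint() {
  model_jint v = INT32_MAX;
  assert(v == 2147483647);
  return v;
}
```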
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
| <funcdef><function>JvPrimClass</function></funcdef>
 | |
| <paramdef><parameter>primtype</parameter></paramdef>
 | |
| </funcsynopsis>
 | |
| This is a macro whose argument should be the name of a primitive
 | |
| type, <ForeignPhrase><Abbrev>e.g.</Abbrev></ForeignPhrase>
 | |
| <literal>byte</literal>.
 | |
| The macro expands to a pointer to the <literal>Class</literal> object
 | |
| corresponding to the primitive type.
 | |
| <ForeignPhrase><Abbrev>E.g.</Abbrev></ForeignPhrase>,
 | |
| <literal>JvPrimClass(void)</literal>
 | |
| has the same value as the Java expression
 | |
| <literal>Void.TYPE</literal> (or <literal>void.class</literal>).
 | |
| </para>
 | |
| 
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Objects and Classes</title>
 | |
| <sect2><title>Classes</title>
 | |
| <para>
 | |
| All Java classes are derived from <literal>java.lang.Object</literal>.
 | |
| C++ does not have a unique <quote>root</quote> class, but we use
 | |
| a C++ <literal>java::lang::Object</literal> as the C++ version
 | |
| of the <literal>java.lang.Object</literal> Java class.  All
 | |
| other Java classes are mapped into corresponding C++ classes
 | |
| derived from <literal>java::lang::Object</literal>.</para>
 | |
| <para>
 | |
| Interface inheritance (the <quote><literal>implements</literal></quote>
 | |
| keyword) is currently not reflected in the C++ mapping.</para>
 | |
| </sect2>
 | |
| <sect2><title>Object references</title>
 | |
| <para>
 | |
| We implement a Java object reference as a pointer to the start
 | |
| of the referenced object.  It maps to a C++ pointer.
 | |
| (We cannot use C++ references for Java references, since
 | |
| once a C++ reference has been initialized, you cannot change it to
 | |
| point to another object.)
 | |
| The <literal>null</literal> Java reference maps to the <literal>NULL</literal>
 | |
| C++ pointer.
 | |
| </para>
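<para>
The reason for choosing pointers over C++ references can be seen in a few
lines of standard C++.  In this sketch <literal>Node</literal> is a
hypothetical stand-in for a Java class; no <literal>libgcj</literal> types
are involved.
</para>

```cpp
#include <cassert>

struct Node { int id; };   // hypothetical stand-in for a Java class

// A pointer can start out null and be pointed at another object later,
// which is what a Java reference variable needs.
inline bool pointer_can_be_reseated() {
  Node a{1}, b{2};
  Node *ref = nullptr;      // plays the role of Java's null
  assert(ref == nullptr);
  ref = &a;                 // refers to a
  ref = &b;                 // now refers to b; a is unchanged
  return ref == &b && a.id == 1;
}

// A C++ reference is bound once.  Assigning through it copies the
// right-hand object instead of rebinding the reference.
inline bool reference_stays_bound() {
  Node a{1}, b{2};
  Node &alias = a;
  alias = b;                // copies b into a
  return &alias == &a && a.id == 2;
}
```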
 | |
| <para>
 | |
| Note that in some Java implementations an object reference is implemented as
 | |
| a pointer to a two-word <quote>handle</quote>.  One word of the handle
 | |
| points to the fields of the object, while the other points
 | |
| to a method table.  Gcj does not use this extra indirection.
 | |
| </para>
 | |
| </sect2>
 | |
| <sect2><title>Object fields</title>
 | |
| <para>
 | |
| Each object contains an object header, followed by the instance
 | |
| fields of the class, in order.  The object header consists of
 | |
| a single pointer to a dispatch or virtual function table.
 | |
| (There may be extra fields <quote>in front of</quote> the object,
 | |
| for example for
 | |
| memory management, but this is invisible to the application, and
 | |
| the reference to the object points to the dispatch table pointer.)
 | |
| </para>
 | |
| <para>
 | |
| The fields are laid out in the same order, alignment, and size
 | |
| as in C++.  Specifically, 8-bit and 16-bit native types
 | |
| (<literal>byte</literal>, <literal>short</literal>, <literal>char</literal>,
 | |
| and <literal>boolean</literal>) are <emphasis>not</emphasis>
 | |
| widened to 32 bits.
 | |
| Note that the Java VM does extend 8-bit and 16-bit types to 32 bits
 | |
| when on the VM stack or temporary registers.</para>
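<para>
The effect of not widening small fields can be illustrated with two plain
C++ structs.  This is a sketch under stated assumptions: the struct names
are hypothetical, the object header is left out, a one-byte
<literal>boolean</literal> is assumed, and the sizes in the comments are
those of common ABIs rather than a guarantee of the C++ standard.
</para>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical instance fields of a Java class declared as
//   byte b; boolean z; short s; int i;
// The object header is left out to keep the arithmetic simple.
struct ModelFields {
  std::int8_t  b;
  std::uint8_t z;   // stand-in for a one-byte boolean
  std::int16_t s;
  std::int32_t i;
};

// The same fields if every small type were widened to 32 bits.
struct WidenedFields {
  std::int32_t b, z, s, i;
};

// Usage: compare the two layouts (8 versus 16 bytes on common ABIs).
inline bool packed_is_smaller() {
  assert(offsetof(ModelFields, b) == 0);
  return sizeof(ModelFields) < sizeof(WidenedFields);
}
```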
 | |
| <para>
 | |
| If you include the <literal>gcjh</literal>-generated header for a
 | |
| class, you can access fields of Java classes in the <quote>natural</quote>
 | |
| way.  Given the following Java class:
 | |
| <programlisting>
 | |
| public class Int
 | |
| {
 | |
|   public int i;
 | |
|   public Int (int i) { this.i = i; }
 | |
|   public static Int zero = new Int(0);
 | |
| }
 | |
| </programlisting>
 | |
| you can write:
 | |
| <programlisting>
 | |
| #include <gcj/cni.h>
 | |
| #include <Int.h>
 | |
| Int*
 | |
| mult (Int *p, jint k)
 | |
| {
 | |
|   if (k == 0)
 | |
|     return Int::zero;  // static member access.
 | |
|   return new Int(p->i * k);
 | |
| }
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| <acronym>CNI</acronym> does not strictly enforce the Java access
 | |
| specifiers, because Java permissions cannot be directly mapped
 | |
| into C++ permissions.  Private Java fields and methods are mapped
 | |
| to private C++ fields and methods, but other fields and methods
 | |
| are mapped to public fields and methods.
 | |
| </para>
 | |
| </sect2>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Arrays</title>
 | |
| <para>
 | |
| While in many ways Java is similar to C and C++,
 | |
| it is quite different in its treatment of arrays.
 | |
| C arrays are based on the idea of pointer arithmetic,
 | |
| which would be incompatible with Java's security requirements.
 | |
| Java arrays are true objects (array types inherit from
 | |
| <literal>java.lang.Object</literal>).  An array-valued variable
 | |
| is one that contains a reference (pointer) to an array object.
 | |
| </para>
 | |
| <para>
 | |
| Referencing a Java array in C++ code is done using the
 | |
| <literal>JArray</literal> template, which is defined as follows:
 | |
| <programlisting>
 | |
| class __JArray : public java::lang::Object
 | |
| {
 | |
| public:
 | |
|   int length;
 | |
| };
 | |
| 
 | |
| template<class T>
 | |
| class JArray : public __JArray
 | |
| {
 | |
|   T data[0];
 | |
| public:
 | |
|   T& operator[](jint i) { return data[i]; }
 | |
| };
 | |
| </programlisting></para>
 | |
| <para>
 | |
| <funcsynopsis> 
 | |
|    <funcdef>template<class T>  T *<function>elements</function></funcdef>
 | |
|    <paramdef>JArray<T> &<parameter>array</parameter></paramdef>
 | |
| </funcsynopsis>
 | |
|    This template function can be used to get a pointer to the
 | |
|    elements of the <parameter>array</parameter>.
 | |
|    For instance, you can fetch a pointer
 | |
|    to the integers that make up an <literal>int[]</literal> like so:
 | |
| <programlisting>
 | |
| extern jintArray foo;
 | |
| jint *intp = elements (foo);
 | |
| </programlisting>
 | |
| The name of this function may change in the future.</para>
 | |
| <para>
 | |
| There are a number of typedefs which correspond to typedefs from JNI.
 | |
| Each is the type of an array holding objects of the appropriate type:
 | |
| <programlisting>
 | |
| typedef __JArray *jarray;
 | |
| typedef JArray<jobject> *jobjectArray;
 | |
| typedef JArray<jboolean> *jbooleanArray;
 | |
| typedef JArray<jbyte> *jbyteArray;
 | |
| typedef JArray<jchar> *jcharArray;
 | |
| typedef JArray<jshort> *jshortArray;
 | |
| typedef JArray<jint> *jintArray;
 | |
| typedef JArray<jlong> *jlongArray;
 | |
| typedef JArray<jfloat> *jfloatArray;
 | |
| typedef JArray<jdouble> *jdoubleArray;
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
|  You can create an array of objects using this function:
 | |
| <funcsynopsis> 
 | |
|    <funcdef>jobjectArray <function>JvNewObjectArray</function></funcdef>
 | |
|    <paramdef>jint <parameter>length</parameter></paramdef>
 | |
|    <paramdef>jclass <parameter>klass</parameter></paramdef>
 | |
|    <paramdef>jobject <parameter>init</parameter></paramdef>
 | |
|    </funcsynopsis>
 | |
|    Here <parameter>klass</parameter> is the type of elements of the array;
 | |
|    <parameter>init</parameter> is the initial
 | |
|    value to be put into every slot in the array.
 | |
| </para>
 | |
| <para>
 | |
| For each primitive type there is a function which can be used
 | |
|    to create a new array holding that type.  The name of the function
 | |
|    is of the form
 | |
|    `<literal>JvNew<<replaceable>Type</replaceable>>Array</literal>',
 | |
|    where `<<replaceable>Type</replaceable>>' is the name of
 | |
|    the primitive type, with its initial letter in upper-case.  For
 | |
|    instance, `<literal>JvNewBooleanArray</literal>' can be used to create
 | |
|    a new array of booleans.
 | |
|    Each such function follows this example:
 | |
| <funcsynopsis>  
 | |
|    <funcdef>jbooleanArray <function>JvNewBooleanArray</function></funcdef> 
 | |
|    <paramdef>jint <parameter>length</parameter></paramdef>
 | |
| </funcsynopsis>
 | |
| </para>
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|    <funcdef>jsize <function>JvGetArrayLength</function></funcdef>
 | |
|    <paramdef>jarray <parameter>array</parameter></paramdef> 
 | |
|    </funcsynopsis>
 | |
|    Returns the length of <parameter>array</parameter>.</para>
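<para>
The usual pattern of reading the length, fetching the element pointer, and
then looping can be sketched without <literal>libgcj</literal>.  Everything
here is a model: <literal>ModelArray</literal> and
<literal>model_elements</literal> are hypothetical names, and a
<literal>std::vector</literal> replaces the inline element storage of the
real <literal>JArray</literal> so that the sketch stays portable.
</para>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a Java array object: a length plus the elements.
template <class T>
class ModelArray {
public:
  explicit ModelArray(int n) : length(n), data_(checked(n)) {}
  const int length;                 // like the array's length field
  T &operator[](int i) { return data_[static_cast<std::size_t>(i)]; }
  T *data() { return data_.data(); }
private:
  static std::size_t checked(int n) {
    assert(n >= 0);                 // Java rejects negative lengths too
    return static_cast<std::size_t>(n);
  }
  std::vector<T> data_;             // elements start out zeroed
};

// Counterpart of elements(): a raw pointer to the first element.
template <class T>
T *model_elements(ModelArray<T> &array) { return array.data(); }

// Usage: sum an int array through the raw element pointer.
inline long sum(ModelArray<int> &array) {
  int *p = model_elements(array);
  long total = 0;
  for (int i = 0; i < array.length; ++i)
    total += p[i];
  return total;
}
```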
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Methods</title>
 | |
| 
 | |
| <para>
 | |
| Java methods are mapped directly into C++ methods.
 | |
| The header files generated by <literal>gcjh</literal>
 | |
| include the appropriate method definitions.
 | |
| Basically, the generated methods have the same names and
 | |
| <quote>corresponding</quote> types as the Java methods,
 | |
| and are called in the natural manner.</para>
 | |
| 
 | |
| <sect2><title>Overloading</title>
 | |
| <para>
 | |
| Both Java and C++ provide method overloading, where multiple
 | |
| methods in a class have the same name, and the correct one is chosen
 | |
| (at compile time) depending on the argument types.
 | |
| The rules for choosing the correct method are (as expected) more complicated
 | |
| in C++ than in Java, but given a set of overloaded methods
 | |
| generated by <literal>gcjh</literal> the C++ compiler will choose
 | |
| the expected one.</para>
 | |
| <para>
 | |
| Common assemblers and linkers are not aware of C++ overloading,
 | |
| so the standard implementation strategy is to encode the
 | |
| parameter types of a method into its assembly-level name.
 | |
| This encoding is called <firstterm>mangling</firstterm>,
 | |
| and the encoded name is the <firstterm>mangled name</firstterm>.
 | |
| The same mechanism is used to implement Java overloading.
 | |
| For C++/Java interoperability, it is important that both the Java
 | |
| and C++ compilers use the <emphasis>same</emphasis> encoding scheme.
 | |
| </para>
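<para>
Compile-time overload selection is easy to observe in plain C++.  The
function <literal>describe</literal> below is a hypothetical example.  The
symbol names in the comment are those produced under the Itanium C++ ABI
used by current GCC releases, and are given only to show the idea; older
G++ releases used a different mangling scheme.
</para>

```cpp
#include <cassert>
#include <string>

// Three overloads sharing one source-level name.  Under the Itanium
// C++ ABI they become distinct linker symbols, for instance
// _Z8describei, _Z8described and _Z8describePKc.
inline std::string describe(int)          { return "int"; }
inline std::string describe(double)       { return "double"; }
inline std::string describe(const char *) { return "const char*"; }

// Usage: the static type of the argument picks the overload.
inline std::string overload_demo() {
  assert(describe(1) != describe(1.0));
  return describe(2.5f);   // float promotes to double
}
```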
 | |
| </sect2>
 | |
| 
 | |
| <sect2><title>Static methods</title>
 | |
| <para>
 | |
| Static Java methods are invoked in <acronym>CNI</acronym> using the standard
 | |
| C++ syntax, using the `<literal>::</literal>' operator rather
 | |
| than the `<literal>.</literal>' operator.  For example:
 | |
| </para>
 | |
| <programlisting>
 | |
| jint i = java::lang::Math::round((jfloat) 2.3);
 | |
| </programlisting>
 | |
| <para>
 | |
| A static native method is defined using standard C++ method
 | |
| definition syntax.  For example:
 | |
| <programlisting>
 | |
| #include <java/lang/Integer.h>
 | |
| java::lang::Integer*
 | |
| java::lang::Integer::getInteger(jstring str)
 | |
| {
 | |
|   ...
 | |
| }
 | |
| </programlisting></para>
 | |
| </sect2>
 | |
| 
 | |
| <sect2><title>Object Constructors</title>
 | |
| <para>
 | |
| Constructors are called implicitly as part of object allocation
 | |
| using the <literal>new</literal> operator.  For example:
 | |
| <programlisting> 
 | |
| java::lang::Integer *x = new java::lang::Integer(234);
 | |
| </programlisting> 
 | |
| </para>
 | |
| <para>
 | |
| Java does not allow a constructor to be a native method.
 | |
| Instead, define a private native method and have the
 | |
| constructor call it.
 | |
| </para>
 | |
| </sect2>
 | |
| 
 | |
| <sect2><title>Instance methods</title>
 | |
| <para>
 | |
| Virtual method dispatch is handled essentially the same way
 | |
| in C++ and Java: by an indirect call through a function pointer
 | |
| stored in a per-class virtual function table.  C++ is more
 | |
| complicated because it has to support multiple inheritance,
 | |
| but this does not affect Java classes.
 | |
| However, G++ has historically used a different calling convention
 | |
| that is not compatible with the one used by <acronym>gcj</acronym>.
 | |
| During 1999, G++ will switch to a new ABI that is compatible with
 | |
| <acronym>gcj</acronym>.  Some platforms (including Linux) have already
 | |
| changed.  On other platforms, you will have to pass
 | |
| the <literal>-fvtable-thunks</literal> flag to g++ when
 | |
| compiling <acronym>CNI</acronym> code.  Note that you must also compile
 | |
| your C++ source code with <literal>-fno-rtti</literal>.
 | |
| </para>
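<para>
What an indirect call through a per-class table amounts to can be spelled
out by hand.  The sketch below is a model, not the layout that either
compiler actually emits; all of the names in it are hypothetical.
</para>

```cpp
#include <cassert>

struct ModelObject;

// One table per class: here a single slot holding a function pointer.
struct ModelVTable {
  int (*sides)(const ModelObject *self);
};

// Every object starts with a pointer to its class's table.
struct ModelObject {
  const ModelVTable *vtable;
};

static int shape_sides(const ModelObject *)    { return 0; }
static int triangle_sides(const ModelObject *) { return 3; }

static const ModelVTable shape_vtable    = { shape_sides };
static const ModelVTable triangle_vtable = { triangle_sides };

// A "virtual call": load the table, load the slot, call through it.
inline int call_sides(const ModelObject *obj) {
  assert(obj != nullptr && obj->vtable != nullptr);
  return obj->vtable->sides(obj);
}
```

<para>
Because each object carries exactly one table pointer, a single-inheritance
hierarchy needs no adjustment of the object pointer at the call site.
</para>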
 | |
| <para>
 | |
| Calling a Java instance method in <acronym>CNI</acronym> is done
 | |
| using the standard C++ syntax.  For example:
 | |
| <programlisting>
 | |
|   java::lang::Number *x;
 | |
|   if (x->doubleValue() > 0.0) ...
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| Defining a Java native instance method is also done the natural way:
 | |
| <programlisting>
 | |
| #include <java/lang/Integer.h>
 | |
| jdouble
 | |
| java::lang::Integer::doubleValue()
 | |
| {
 | |
|   return (jdouble) value;
 | |
| }
 | |
| </programlisting>
 | |
| </para>
 | |
| </sect2>
 | |
| 
 | |
| <sect2><title>Interface method calls</title>
 | |
| <para>
 | |
| In Java you can call a method using an interface reference.
 | |
| This is not yet supported in <acronym>CNI</acronym>.</para>
 | |
| </sect2>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Object allocation</title>
 | |
| 
 | |
| <para>
 | |
| New Java objects are allocated using a
 | |
| <firstterm>class-instance-creation-expression</firstterm>:
 | |
| <programlisting>
 | |
| new <replaceable>Type</replaceable> ( <replaceable>arguments</replaceable> )
 | |
| </programlisting>
 | |
| The same syntax is used in C++.  The main difference is that
 | |
| C++ objects have to be explicitly deleted; in Java they are
 | |
| automatically deleted by the garbage collector.
 | |
| Using <acronym>CNI</acronym>, you can allocate a new object
 | |
| using standard C++ syntax.  The C++ compiler is smart enough to
 | |
| realize the class is a Java class, and hence it needs to allocate
 | |
| memory from the garbage collector.  If you have overloaded
 | |
| constructors, the compiler will choose the correct one
 | |
| using standard C++ overload resolution rules.  For example:
 | |
| <programlisting>
 | |
| java::util::Hashtable *ht = new java::util::Hashtable(120);
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef>void *<function>_Jv_AllocBytes</function></funcdef>
 | |
|   <paramdef>jsize <parameter>size</parameter></paramdef>
 | |
| </funcsynopsis>
 | |
|    Allocate <parameter>size</parameter> bytes.  This memory is not
 | |
|    scanned by the garbage collector.  However, it will be freed by
 | |
| the GC if no references to it are discovered.
 | |
| </para>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Interfaces</title>
 | |
| <para>
 | |
| A Java class can <firstterm>implement</firstterm> zero or more
 | |
| <firstterm>interfaces</firstterm>, in addition to inheriting from
 | |
| a single base class. 
 | |
| An interface is a collection of constants and method specifications;
 | |
| it is similar to the <firstterm>signatures</firstterm> available
 | |
| as a G++ extension.  An interface provides a subset of the
 | |
| functionality of C++ abstract virtual base classes, but they
 | |
| are currently implemented differently.
 | |
| CNI does not currently provide any support for interfaces,
 | |
| or calling methods from an interface pointer.
 | |
| This is partly because we are planning to re-do how
 | |
| interfaces are implemented in <acronym>gcj</acronym>.
 | |
| </para>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Strings</title>
 | |
| <para>
 | |
| <acronym>CNI</acronym> provides a number of utility functions for
 | |
| working with Java <literal>String</literal> objects.
 | |
| The names and interfaces are analogous to those of <acronym>JNI</acronym>.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef>jstring <function>JvNewString</function></funcdef>
 | |
|   <paramdef>const jchar *<parameter>chars</parameter></paramdef>
 | |
|   <paramdef>jsize <parameter>len</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|   Creates a new Java String object, where
 | |
|   <parameter>chars</parameter> are the contents, and
 | |
|   <parameter>len</parameter> is the number of characters.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
 | |
|   <paramdef>const char *<parameter>bytes</parameter></paramdef>
 | |
|   <paramdef>jsize <parameter>len</parameter></paramdef>
 | |
|  </funcsynopsis>
 | |
|   Creates a new Java String object, where <parameter>bytes</parameter>
 | |
|   are the Latin-1 encoded
 | |
|   characters, and <parameter>len</parameter> is the length of
 | |
|   <parameter>bytes</parameter>, in bytes.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef>jstring <function>JvNewStringLatin1</function></funcdef>
 | |
|   <paramdef>const char *<parameter>bytes</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|   Like the first JvNewStringLatin1, but computes <parameter>len</parameter>
 | |
|   using <literal>strlen</literal>.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef>jstring <function>JvNewStringUTF</function></funcdef>
 | |
|   <paramdef>const char *<parameter>bytes</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|    Creates a new Java String object, where <parameter>bytes</parameter> are
 | |
|    the UTF-8 encoded characters of the string, terminated by a null byte.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|    <funcdef>jchar *<function>JvGetStringChars</function></funcdef>
 | |
|   <paramdef>jstring <parameter>str</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|    Returns a pointer to the array of characters which make up a string.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|    <funcdef> int <function>JvGetStringUTFLength</function></funcdef>
 | |
|   <paramdef>jstring <parameter>str</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|    Returns the number of bytes required to encode the contents
 | |
|    of <parameter>str</parameter> as UTF-8.
 | |
| </para>
 | |
| 
 | |
| <para>
 | |
| <funcsynopsis>
 | |
|   <funcdef> jsize <function>JvGetStringUTFRegion</function></funcdef>
 | |
|   <paramdef>jstring <parameter>str</parameter></paramdef>
 | |
|   <paramdef>jsize <parameter>start</parameter></paramdef>
 | |
|   <paramdef>jsize <parameter>len</parameter></paramdef>
 | |
|   <paramdef>char *<parameter>buf</parameter></paramdef>
 | |
|   </funcsynopsis>
 | |
|   This puts the UTF-8 encoding of a region of the
 | |
|   string <parameter>str</parameter> into
 | |
|   the buffer <parameter>buf</parameter>.
 | |
|   The region of the string to fetch is specified by
 | |
|   <parameter>start</parameter> and <parameter>len</parameter>.
 | |
|    It is assumed that <parameter>buf</parameter> is big enough
 | |
|    to hold the result.  Note
 | |
|    that <parameter>buf</parameter> is <emphasis>not</emphasis> null-terminated.
 | |
| </para>
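<para>
Since the buffer is not null-terminated, the caller has to size it and
terminate it.  The following sketch models that pattern in standard C++.
The functions <literal>model_utf_length</literal>,
<literal>model_utf_region</literal> and <literal>to_utf8</literal> are
hypothetical stand-ins, not the <literal>libgcj</literal> implementations.
As a simplification they encode each 16-bit unit on its own, so surrogate
pairs are not combined and a zero character would end the resulting C string.
</para>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of UTF-8 bytes needed for one 16-bit code unit.
inline std::size_t utf8_size(char16_t c) {
  return c < 0x80 ? 1 : c < 0x800 ? 2 : 3;
}

// Model of a "UTF length" query over chars[0..len).
inline std::size_t model_utf_length(const char16_t *chars, std::size_t len) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < len; ++i)
    n += utf8_size(chars[i]);
  return n;
}

// Model of a "UTF region" call: writes the encoding of
// chars[start..start+len) into buf and returns the bytes written.
// It writes no terminating null byte.
inline std::size_t model_utf_region(const char16_t *chars, std::size_t start,
                                    std::size_t len, char *buf) {
  char *out = buf;
  for (std::size_t i = start; i < start + len; ++i) {
    unsigned c = chars[i];
    if (c < 0x80) {
      *out++ = static_cast<char>(c);
    } else if (c < 0x800) {
      *out++ = static_cast<char>(0xC0 | (c >> 6));
      *out++ = static_cast<char>(0x80 | (c & 0x3F));
    } else {
      *out++ = static_cast<char>(0xE0 | (c >> 12));
      *out++ = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
      *out++ = static_cast<char>(0x80 | (c & 0x3F));
    }
  }
  return static_cast<std::size_t>(out - buf);
}

// Caller pattern: size the buffer first, then terminate it yourself.
inline std::string to_utf8(const char16_t *chars, std::size_t len) {
  std::vector<char> buf(model_utf_length(chars, len) + 1);
  std::size_t n = model_utf_region(chars, 0, len, buf.data());
  assert(n + 1 == buf.size());
  buf[n] = '\0';            // the region call does not do this
  return std::string(buf.data());
}
```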
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Class Initialization</title>
 | |
| <para>
 | |
| Java requires that each class be automatically initialized at the time 
 | |
| of the first active use.  Initializing a class involves 
 | |
| initializing the static fields, running code in class initializer 
 | |
| methods, and initializing base classes.  There may also be 
 | |
| some implementation specific actions, such as allocating 
 | |
| <classname>String</classname> objects corresponding to string literals in
 | |
| the code.</para>
 | |
| <para>
 | |
| The Gcj compiler inserts calls to <literal>JvInitClass</literal> (actually
 | |
| <literal>_Jv_InitClass</literal>) at appropriate places to ensure that a
 | |
| class is initialized when required.  The C++ compiler does not
 | |
| insert these calls automatically; it is the programmer's
 | |
| responsibility to make sure classes are initialized.  However,
 | |
| this is fairly painless because of the conventions assumed by the Java
 | |
| system.</para>
 | |
| <para>
 | |
| First, <literal>libgcj</literal> will make sure a class is initialized
 | |
| before an instance of that class is created.  This is one
 | |
| of the responsibilities of the <literal>new</literal> operation.  This is
 | |
| taken care of both in Java code, and in C++ code.  (When the G++
 | |
| compiler sees a <literal>new</literal> of a Java class, it will call
 | |
| a routine in <literal>libgcj</literal> to allocate the object, and that
 | |
| routine will take care of initializing the class.)  It follows that you can
 | |
| access an instance field, or call an instance (non-static)
 | |
| method and be safe in the knowledge that the class and all
 | |
| of its base classes have been initialized.</para>
 | |
| <para>
 | |
| Invoking a static method is also safe.  This is because the
 | |
| Java compiler adds code to the start of a static method to make sure
 | |
| the class is initialized.  However, the C++ compiler does not
 | |
| add this extra code.  Hence, if you write a native static method
 | |
| using CNI, you are responsible for calling <literal>JvInitClass</literal>
 | |
| before doing anything else in the method (unless you are sure
 | |
| it is safe to leave it out).</para>
 | |
| <para>
 | |
| Accessing a static field also requires the class of the
 | |
| field to be initialized.  The Java compiler will generate code
 | |
| to call <literal>_Jv_InitClass</literal> before getting or setting the field.
 | |
| However, the C++ compiler will not generate this extra code,
 | |
| so it is your responsibility to make sure the class is
 | |
| initialized before you access a static field.</para>
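| <para>
| The discipline described above for static methods and static fields
| can be modelled in plain C++.  This is only an illustrative sketch:
| <literal>Counter</literal>, <literal>init_class</literal> and
| <literal>peek_next</literal> are hypothetical names, and
| <literal>init_class</literal> merely stands in for the
| <literal>JvInitClass</literal> call that real CNI code would make.
| </para>

```cpp
// Hypothetical stand-in for a Java class "Counter" with a static field and
// a static initializer.  In real CNI code the guard below would be a call
// to JvInitClass; a plain flag models it here.
struct Counter
{
  static bool initialized;   // models "has the class been initialized?"
  static int next;           // static field set by the class initializer

  // Stand-in for JvInitClass: runs the initializer once, then does nothing.
  static void init_class()
  {
    if (!initialized)
      {
        initialized = true;
        next = 100;          // what the Java static initializer would do
      }
  }

  // Models a static native method: initialize first, then touch statics.
  static int allocate()
  {
    init_class();
    return next++;
  }
};

bool Counter::initialized = false;
int Counter::next = 0;

// Models C++ code reading a static field directly: same guard, same reason.
int peek_next()
{
  Counter::init_class();
  return Counter::next;
}
```

| <para>
| Without the guard, the first caller in this model would see the
| uninitialized value of the field rather than the value assigned by the
| initializer.
| </para>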
 | |
| </sect1>
 | |
| <sect1><title>Exception Handling</title>
 | |
| <para>
 | |
| While C++ and Java share a common exception handling framework,
 | |
| things are not yet perfectly integrated.  The main issue is that the
 | |
| <quote>run-time type information</quote> facilities of the two
 | |
| languages are not integrated.</para>
 | |
| <para>
 | |
| Still, things work fairly well.  You can throw a Java exception from
 | |
| C++ using the ordinary <literal>throw</literal> construct, and this
 | |
| exception can be caught by Java code.  Similarly, you can catch an
 | |
| exception thrown from Java using the C++ <literal>catch</literal>
 | |
| construct.</para>
 | |
| <para>
 | |
| Note that currently you cannot mix C++ catches and Java catches in
 | |
| a single C++ translation unit.  We do intend to fix this eventually.
 | |
| </para>
 | |
| <para>
 | |
| Here is an example:
 | |
| <programlisting>
 | |
| if (i >= count)
 | |
|    throw new java::lang::IndexOutOfBoundsException();
 | |
| </programlisting>
 | |
| </para>
 | |
| <para>
 | |
| Normally, GNU C++ will automatically detect when you are writing C++
 | |
| code that uses Java exceptions, and handle them appropriately.
 | |
| However, if C++ code only needs to execute destructors when Java
 | |
| exceptions are thrown through it, GCC will guess incorrectly.  Sample
 | |
| problematic code:
 | |
| <programlisting>
 | |
|   struct S { ~S(); };
 | |
|   extern void bar();    // is implemented in Java and may throw exceptions
 | |
|   void foo()
 | |
|   {
 | |
|     S s;
 | |
|     bar();
 | |
|   }
 | |
| </programlisting>
 | |
| The usual effect of an incorrect guess is a link failure, complaining of
 | |
| a missing routine called <literal>__gxx_personality_v0</literal>.
 | |
| </para>
 | |
| <para>
 | |
| You can inform the compiler that Java exceptions are to be used in a
 | |
| translation unit, irrespective of what it might think, by writing
 | |
| <literal>#pragma GCC java_exceptions</literal> at the head of the
 | |
| file.  This <literal>#pragma</literal> must appear before any
 | |
| functions that throw or catch exceptions, or run destructors when
 | |
| exceptions are thrown through them.</para>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Synchronization</title>
 | |
| <para>
 | |
| Each Java object has an implicit monitor.
 | |
| The Java VM uses the instruction <literal>monitorenter</literal> to acquire
 | |
| and lock a monitor, and <literal>monitorexit</literal> to release it.
 | |
| The JNI has corresponding methods <literal>MonitorEnter</literal>
 | |
| and <literal>MonitorExit</literal>.  The corresponding CNI macros
 | |
| are <literal>JvMonitorEnter</literal> and <literal>JvMonitorExit</literal>.
 | |
| </para>
 | |
| <para>
 | |
| The Java source language does not provide direct access to these primitives.
 | |
| Instead, there is a <literal>synchronized</literal> statement that does an
 | |
| implicit <literal>monitorenter</literal> before entry to the block,
 | |
| and does a <literal>monitorexit</literal> on exit from the block.
 | |
| Note that the lock has to be released even if the block is abnormally
 | |
| terminated by an exception, which means there is an implicit
 | |
| <literal>try</literal>-<literal>finally</literal>.
 | |
| </para>
 | |
| <para>
 | |
| From C++, it makes sense to use a destructor to release a lock.
 | |
| CNI defines the following utility class.
 | |
| <programlisting>
 | |
| class JvSynchronize {
 | |
|   jobject obj;
 | |
| public:
 | |
|   JvSynchronize(jobject o) { obj = o; JvMonitorEnter(o); }
 | |
|   ~JvSynchronize() { JvMonitorExit(obj); }
 | |
| };
 | |
| </programlisting>
 | |
| The equivalent of Java's:
 | |
| <programlisting>
 | |
| synchronized (OBJ) { CODE; }
 | |
| </programlisting>
 | |
| can be simply expressed:
 | |
| <programlisting>
 | |
| { JvSynchronize dummy(OBJ); CODE; }
 | |
| </programlisting>
 | |
| </para>
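| <para>
| The point of releasing the lock in a destructor is that the release
| also happens when the block is left by an exception.  The following
| self-contained sketch shows this with hypothetical stand-ins:
| <literal>monitor_enter</literal> and <literal>monitor_exit</literal>
| only count nesting depth in place of <literal>JvMonitorEnter</literal>
| and <literal>JvMonitorExit</literal>, and
| <literal>ScopedMonitor</literal> has the same shape as
| <literal>JvSynchronize</literal>.
| </para>

```cpp
#include <stdexcept>

// Hypothetical stand-ins for JvMonitorEnter and JvMonitorExit.  They only
// count nesting depth so the effect of the guard is easy to observe.
static int monitor_depth = 0;
void monitor_enter() { ++monitor_depth; }
void monitor_exit() { --monitor_depth; }

// Same shape as JvSynchronize: acquire in the constructor, release in the
// destructor.
class ScopedMonitor
{
public:
  ScopedMonitor() { monitor_enter(); }
  ~ScopedMonitor() { monitor_exit(); }
  ScopedMonitor(const ScopedMonitor &) = delete;
  ScopedMonitor &operator=(const ScopedMonitor &) = delete;
};

// Body of a hypothetical synchronized native method.  Returns the depth
// seen inside the guarded block, or throws if asked to.
int guarded_work(bool fail)
{
  ScopedMonitor guard;
  if (fail)
    throw std::runtime_error("abnormal exit");
  return monitor_depth;
}

// Runs guarded_work and reports the depth after it finished, whether it
// returned normally or threw.
int depth_after(bool fail)
{
  try
    {
      guarded_work(fail);
    }
  catch (const std::runtime_error &)
    {
    }
  return monitor_depth;
}
```

| <para>
| In this model the depth is back to zero after both the normal and the
| exceptional exit, which is the behaviour the implicit
| <literal>try</literal>-<literal>finally</literal> gives a Java
| <literal>synchronized</literal> block.
| </para>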
 | |
| <para>
 | |
| Java also has methods with the <literal>synchronized</literal> attribute.
 | |
| This is equivalent to wrapping the entire method body in a
 | |
| <literal>synchronized</literal> statement.
 | |
| (Alternatively, an implementation could require the caller to do
 | |
| the synchronization.  This is not practical for a compiler, because
 | |
| each virtual method call would have to test at run-time if
 | |
| synchronization is needed.)  Since in <literal>gcj</literal>
 | |
| the <literal>synchronized</literal> attribute is handled by the
 | |
| method implementation, it is up to the programmer
 | |
| of a synchronized native method to handle the synchronization
 | |
| (in the C++ implementation of the method).
 | |
| In other words, you need to manually add <literal>JvSynchronize</literal>
 | |
| in a <literal>native synchronized</literal> method.</para>
 | |
| </sect1>
 | |
| 
 | |
| <sect1><title>Reflection</title>
 | |
| <para>The types <literal>jfieldID</literal> and <literal>jmethodID</literal>
 | |
| are as in JNI.</para>
 | |
| <para>
 | |
| The functions <literal>JvFromReflectedField</literal>,
 | |
| <literal>JvFromReflectedMethod</literal>,
 | |
| <literal>JvToReflectedField</literal>, and
 | |
| <literal>JvToReflectedMethod</literal> (as in Java 2 JNI)
 | |
| will be added shortly, as will other functions corresponding to JNI.</para>
 | |
| </sect1>
 | |
| <sect1><title>Using gcjh</title>
 | |
| <para>
 | |
|       The <command>gcjh</command> program is used to generate C++ header files from
 | |
|       Java class files.  By default, <command>gcjh</command> generates
 | |
|       a relatively straightforward C++ header file.  However, there
 | |
|       are a few caveats to its use, and a few options which can be
 | |
|       used to change how it operates:
 | |
| </para>
 | |
| <variablelist>
 | |
| <varlistentry>
 | |
| <term><literal>--classpath</literal> <replaceable>path</replaceable></term>
 | |
| <term><literal>--CLASSPATH</literal> <replaceable>path</replaceable></term>
 | |
| <term><literal>-I</literal> <replaceable>dir</replaceable></term>
 | |
| <listitem><para>
 | |
|         These options can be used to set the class path for gcjh.
 | |
|         Gcjh searches the class path the same way the compiler does;
 | |
| 	these options have their familiar meanings.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>
 | |
| <term><literal>-d <replaceable>directory</replaceable></literal></term>
 | |
| <listitem><para>
 | |
| Puts the generated <literal>.h</literal> files
 | |
| beneath <replaceable>directory</replaceable>.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>
 | |
| <term><literal>-o <replaceable>file</replaceable></literal></term>
 | |
| <listitem><para>
 | |
|         Sets the name of the <literal>.h</literal> file to be generated.
 | |
|         By default the <literal>.h</literal> file is named after the class.
 | |
|         This option only really makes sense if just a single class file
 | |
|         is specified.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>
 | |
| <term><literal>--verbose</literal></term>
 | |
| <listitem><para>
 | |
|         gcjh will print information to stderr as it works.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>
 | |
| <term><literal>-M</literal></term>
 | |
| <term><literal>-MM</literal></term>
 | |
| <term><literal>-MD</literal></term>
 | |
| <term><literal>-MMD</literal></term>
 | |
| <listitem><para>
 | |
|         These options can be used to generate dependency information
 | |
|         for the generated header file.  They work the same way as the
 | |
|         corresponding compiler options.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>
 | |
| <term><literal>-prepend <replaceable>text</replaceable></literal></term>
 | |
| <listitem><para>
 | |
| This causes the <replaceable>text</replaceable> to be put into the generated
 | |
|         header just after class declarations (but before declaration
 | |
|         of the current class).  This option should be used with caution.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry> 
 | |
| <term><literal>-friend <replaceable>text</replaceable></literal></term>
 | |
| <listitem><para>
 | |
| This causes the <replaceable>text</replaceable> to be put into the class
 | |
| declaration after a <literal>friend</literal> keyword.
 | |
| This can be used to declare some
 | |
|         other class or function to be a friend of this class.
 | |
|         This option should be used with caution.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry>  
 | |
| <term><literal>-add <replaceable>text</replaceable></literal></term>
 | |
| <listitem><para>
 | |
| The <replaceable>text</replaceable> is inserted into the class declaration.
 | |
| This option should be used with caution.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| 
 | |
| <varlistentry> 
 | |
| <term><literal>-append <replaceable>text</replaceable></literal></term>
 | |
| <listitem><para>
 | |
| The <replaceable>text</replaceable> is inserted into the header file
 | |
| after the class declaration.  One use for this is to generate
 | |
| inline functions.  This option should be used with caution.</para>
 | |
| </listitem>
 | |
| </varlistentry>
 | |
| </variablelist>
 | |
| <para>
 | |
| All arguments not beginning with a <literal>-</literal> are treated
 | |
| as the names of classes for which headers should be generated.</para>
 | |
| <para>
 | |
| gcjh will generate all the required namespace declarations and
 | |
| <literal>#include</literal> directives for the header file.
 | |
| In some situations, gcjh will generate simple inline member
 | |
| functions.  Note that, while gcjh puts <literal>#pragma
 | |
| interface</literal> in the generated header file, you should
 | |
| <emphasis>not</emphasis> put <literal>#pragma implementation</literal>
 | |
| into your C++ source file.  If you do, duplicate definitions of
 | |
| inline functions will sometimes be created, leading to link-time
 | |
| errors.
 | |
| </para>
 | |
| <para>
 | |
| There are a few cases where gcjh will fail to work properly:</para>
 | |
| <para>
 | |
| gcjh assumes that all the methods and fields of a class have ASCII
 | |
| names.  The C++ compiler cannot correctly handle non-ASCII
 | |
| identifiers.  gcjh does not currently diagnose this problem.</para>
 | |
| <para>
 | |
| gcjh also cannot fully handle classes where a field and a method have
 | |
| the same name.  If the field is static, an error will result.
 | |
| Otherwise, the field will be renamed in the generated header; <literal>__</literal>
 | |
| will be appended to the field name.</para>
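| <para>
| The renaming is needed because C++ does not allow a data member and a
| member function of the same class to share a name.  As a simplified
| illustration (the class <literal>Gauge</literal> is hypothetical, and a
| real generated header contains considerably more than this), the
| effect on the C++ side looks roughly as follows:
| </para>

```cpp
// Hypothetical Java source:
//   class Gauge { int level; int level() { return level; } }
//
// C++ does not allow a data member and a member function of one class to
// share a name, so the instance field is declared with "__" appended.
class Gauge
{
public:
  int level();   // the method keeps its Java name
  int level__;   // the field "level", renamed
};

// C++ code then refers to the field by its renamed form.
int Gauge::level()
{
  return level__;
}
```

| <para>
| Any C++ code that uses the field must therefore be written against the
| renamed member, not the name that appears in the Java source.
| </para>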
 | |
| <para>
 | |
| Eventually we hope to change the C++ compiler so that these
 | |
| restrictions can be lifted.</para>
 | |
| </sect1>
 | |
| 
 | |
| </article>
 |