Skip navigation links

Package com.goldencode.p2j.persist.pl

Classes required by user defined, server-side, database functions implemented in Java using the PL/Java framework.

See: Description

Package com.goldencode.p2j.persist.pl Description

Classes required by user defined, server-side, database functions implemented in Java using the PL/Java framework.

Author
Eric Faulhaber
Date
March 13, 2007
Access Control
CONFIDENTIAL

Contents

Introduction
Supporting Technology - PL/Java
Installation
Uninstallation
Implementation
Forwarding API
Function Resolution and Data Types
Packaging - p2jpl.jar
Installing p2jpl.jar
Uninstalling p2jpl.jar

Introduction

This package contains classes which support the creation and execution of user defined functions, implemented in Java, within a database backend server process.  The current implementation will work within a PostgreSQL backend process using PL/Java.  However, it is anticipated that this implementation can be adapted to work within other database servers which support Java as a User Defined Function (UDF) language.  Specifically, both DB2 and Oracle support Java in this capacity.

Supporting Technology - PL/Java

The technology which makes the current implementation of this package possible is PL/Java.  PL/Java is a framework, implemented in Java and C (supported through JNI on Linux and Windows), which integrates with the UDF architecture within PostgreSQL to permit the definition and installation of Java UDFs in PostgreSQL.  PL/Java is not a core part of the PostgreSQL codebase, but rather is a separate, open source project, released under the Berkley/BSD license.  It is worth noting that this project has a very small membership and all evidence suggests that it is actually maintained by a single developer.  While this poses some continuity and support risk for the project, this risk is mitigated by its open source nature.

PL/Java works by running Java UDFs in a separate JVM which is started within the context of the backend database server process.  The version of PL/Java (1.3.0) currently used by P2J starts a new JVM instance for each active database connection.  While this carries with it some overhead, this overhead is amortized over the life of the connection, which should survive multiple application-level connections, because of the connection pooling used by P2J.  Note that the roadmap for PL/Java calls for a future version which shares a single JVM across multiple connections, which would further amortize this overhead.

Installation

To install PL/Java, it is necessary to make modifications to the PostgreSQL start-up script and configuration, and to "bootstrap" PL/Java within each database instance for which it will be used.  This "bootstrap" installation needs to be performed only once per database instance.  During this phase, PL/Java installs its own schema (sqlj), used to support its internal operations, and defines its own set of UDFs as an API which we use to implement additional P2J-specific, Java UDFs.

Unpack PL/Java

Unpack the PL/Java binary distribution to a well-known location, such as /usr/share/pljava (the remaining instructions assume this directory is used).  The distribution file used by P2J at the time of writing is pljava-i686-pc-linux-gnu-pg8.1-1.3.0.tar.gz, and it is part of the P2J project (in the p2j/pl directory).

PostgreSQL Launch/Config Changes

  1. Copy the pljava.so file into PostgreSQL's lib subdirectory (e.g., /usr/local/pgsql/lib if PostgreSQL is installed in /usr/local/pgsql).
  2. Modify the shell configuration (e.g., .bashrc if using bash) for the postgres user, to ensure the following environment variables are set for the postmaster process:
    1. CLASSPATH=
      • this is used to ensure the JVM launched by the PostgreSQL backend does not inherit a default CLASSPATH setting for the machine, but rather relies solely upon the classpath set by the p2jpl.jar installer
    2. LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/i386:$JAVA_HOME/jre/lib/i386/server
      • this is used to ensure that the PostgreSQL backend process can find the libraries necessary to launch JVM instances
      • note that we use the HotSpot server compiler rather than the client compiler;  this is intended to increase performance and assumes that a JVM, once launched, is long-lived, due to the connection pooling noted above
  3. Add the following settings to the postgresql.conf file (customarily located in the CUSTOMIZED OPTIONS section):
    • custom_variable_classes = 'pljava'
    • pljava.classpath = '/usr/share/pljava/pljava.jar'
  4. Restart the backend process

Bootstrap PL/Java

The following must be executed against each database which will use Java UDFs:
  1. cd /usr/share/pljava
  2. psql -U <super_user> -f install.sql <database_name>, where:
    • super_user is a user with rights to modify the target database
    • database_name is the target database

Uninstallation

The following must be executed against each database into which PL/Java was installed:
  1. cd /usr/share/pljava
  2. psql -U <super_user> -f uninstall.sql <database_name>, where:
    • super_user is a user with rights to modify the target database
    • database_name is the target database
  3. Back out the changes made in the PostgreSQL Launch/Config Changes section
  4. Restart the backend process

Implementation

For each Java method which must be supported as a database server function, a user defined function ("UDF") is created, using the SQL create function statement.  This registers our function with the database server backend.  For example:

create function myFunc(int, text, date)
returns boolean as
'java.lang.Boolean
    com.goldencode.p2j.persist.pl.Functions.myFuncImpl(
       java.lang.Integer,
       java.lang.String,
       java.sql.Date)'
language java immutable


registers a database function myFunc for the currently connected database.  This function accepts SQL integer, text, and date values as its parameters, and returns a boolean value to the backend.  The myFunc database function is mapped to a static method myFuncImpl in the com.goldencode.p2j.persist.pl.Functions class.  The method accepts parameters of type java.lang.Integer, java.lang.String, and java.sql.Date and returns a value of type java.lang.Boolean.  PL/Java manages the type conversions from SQL types to J2SE wrappers on input, and from J2SE wrapper to SQL type on return.

[Note:  due to a parser defect in the 1.3.0 version of PL/Java, the current P2J implementation actually embeds an equals (=) symbol within the Java method signature of the create function statement, between the return type and the function name.  This is a temporary workaround for the parser defect, which has been reported to the PL/Java developer.  This symbol should be removed when a fixed version of PL/Java becomes available.]

Forwarding API

Generally, the classes within this package do not define the implementations of the worker methods themselves.  Rather, they provide methods which define a thin, forwarding API layer.  The typical method implementation in this API converts parameters from J2SE wrapper types to P2J wrapper data types, delegates to the actual worker method in a separate P2J class, and converts the return value from a P2J wrapper type to the corresponding J2SE wrapper type.  The myFuncImpl() static method therefore might be implemented as follows:

package com.goldencode.p2j.persist.pl;

import java.sql.*;

public final class Functions
{
    public static final Boolean myFuncImpl(Integer i, String s, Date d)
    {
       return SomeWorkerClass.myFuncWorker(new integer(i),
                                           new character(s),
                                           new date(d)).toJavaType();
    }
    ...
}

This API creates an additional layer of code to maintain, and the type conversions performed in every method are duplicative.  While it would have been preferable in theory to map the create function statements directly to the backing worker methods in the P2J runtime, this was not possible in most cases, for several reasons:
Therefore, a thin, forwarding API layer was deemed an appropriate implementation.

Function Resolution and Data Types

Special care must be taken when defining functions to ensure the PostgreSQL backend is able to resolve the proper function implementation.  In particular, certain functions must be polymorphic to support a mixture of integral and arbitrary precision numeric data, particularly since it will not necessarily be known until runtime which data type will be used for a particular query.  If a function whose signature matches the parameter data types provided at runtime cannot be found, an error will occur and the query will fail.

Fortunately, the backend will attempt to coerce a runtime datum to the necessary type, if a function signature is available whose data type is compatible with, but wider than that of a particular runtime argument.  Thus, it is not necessary to define exhaustively every possible combination of parameter data types for each function, and in fact doing so would create ambiguity problems during function resolution.  The following data type mappings have been chosen to minimize ambiguity and to enable function resolution:

SQL Input Type
UDF Type (also SQL)
J2SE Wrapper Type
P2J Wrapper Type
integer
integer
java.lang.Integer
com.goldencode.p2j.util.integer
integer
numeric
double precision
double precision
java.lang.Double
com.goldencode.p2j.util.decimal
text
varchar
text
java.lang.String
com.goldencode.p2j.util.character
date
date
java.sql.Date
com.goldencode.p2j.util.date
boolean
boolean
java.lang.Boolean
com.goldencode.p2j.util.logical
bytea
bytea
byte[]
com.goldencode.p2j.util.raw

PostgreSQL Implementation Note:  due to complications handling timezone offsets, the PostgreSQL JDBC driver binds a prepared statement date parameter as the string representation of that date.  This implementation requires that some additional care be taken when defining functions.  Namely, it is not advisable to create polymorphic UDFs which accept a date parameter in one case and a text parameter in another case, because the text version always will be resolved in preference to the date version.  Clearly, this can cause unintended consequences and errors.  A workaround for this problem is to refactor the polymorphic UDF into separately named UDFs.  An example in P2J is the polymorphic com.goldencode.p2j.persist.pl.Functions.toInt() method.  One version accepts a java.lang.String parameter, another accepts a java.sql.Date parameter.  However, the UDFs are defined as toInt(text) and julianDayInt(date), respectively, to avoid the resolution conflict.

Packaging - p2jpl.jar

All of the code necessary to support the UDFs defined for P2J is stored in the p2jpl.jar file, which is built automatically with ant jar (or any ant targets dependent upon P2J's build.xml JAR target).  This jar file contains the subset of the full P2J codebase which is necessary to support the functionality required by the Progress-specific operator and built-in function implementations used within where clauses.  This jar file is used only by the JVMs launched by the PostgreSQL backend process;  the P2J server does not use this file.  Consequently, it is important to remember that despite the fact that the codebase is common, there is no communication or state-sharing between the PL/Java-enabled functions and the active P2J server, except insofar as the former are used to restrict the contents of database query result sets requested by the latter.

Installing p2jpl.jar

PL/Java installs Java as a trusted language in PostgreSQL, which means that no special privileges are required to invoke P2J UDFs.  It also means that PL/Java does not access p2jpl.jar from the file system (and that P2J UDFs also cannot access the file system;  an untrusted mode also is available, but the trusted mode suffices for our purposes).  Instead, PL/Java serializes jar files to a byte array and stores these within a database table in its sqlj schema.  At runtime, it uses a custom ClassLoader implementation to deserialize these byte arrays back into classes and load them into the JVM.

Consequently, some additional steps are required to install jar files for PL/Java's use.  We use the PL/Java API to install (and uninstall) p2jpl.jar, and to set the classpath to allow the JVM to find our classes.  A special deployment descriptor (p2j/pl/p2jpl.ddr) is embedded within the jar file, which executes SQL statements to create all of our UDFs on install or to drop all of our UDFs on uninstall.

The following statements must be executed against each target database which is to support P2J server-side functions.  This is typically done from within psql.
  1. select sqlj.install_jar('file:///<project_path>/p2j/build/lib/p2jpl.jar', 'p2j', true), where:
    • <project_path> is the directory in which the P2J project is installed
    • the second parameter, 'p2j', is used in future statements as a handle for the serialized p2jpl.jar
    • the third parameter, true, indicates that the p2jpl.jar contains a deployment descriptor which should be executed upon installation
  2. select sqlj.set_classpath('public', 'p2j')

Uninstalling p2jpl.jar

The following statement uninstalls the p2jpl.jar, and executes its deployment descriptor to drop all of our UDFs:
Skip navigation links
Copyright (c) 2004-2017, Golden Code Development Corporation.
ALL RIGHTS RESERVED. Use is subject to license terms.