Project

General

Profile

Adding Support For a New Database

FWD supports multiple RDBMS back ends as an application's primary databases. The FWD runtime environment manages the differences in database implementations, such that the choice of a back end can be made at deployment time, without the need to re-convert the application.

At the time of this writing, these are:

  • PostgreSQL, currently the most thoroughly supported and tested environment;
  • SQL Server, supported with some limitations, less well tested;
  • H2 (pure Java database), supported primarily for testing and development purposes.

This document describes what needs to be considered when adding support for a new database implementation to FWD, and what needs to be done to implement that support.

Database Independence

Database Implementation Differences

The FWD runtime environment uses the Java Database Connectivity (JDBC) API to access connected databases. JDBC provides a way to connect to and access databases using Structured Query Language (SQL).

Although there are (numerous) standards to make SQL uniform and regular across database implementations, in practice it is not. Different database vendors support various SQL standards to varying degrees, and often offer vendor-specific extensions to the language. They may implement SQL syntax differently where the standards are vague on a particular feature (and even sometimes when they are not). The reasons for these differences are historical, as standards often lagged well after products went to market which implemented key features went to market. The reasons are not necessarily relevant to this discussion.

The result is that a number of SQL "dialects" have evolved over time, where different syntax may be used to accomplish the same purpose. For example, PostgreSQL uses the LIMIT phrase to cap the number of records returned by a query, while SQL Server uses the TOP phrase to accomplish the same goal. One database implementation may support the use of expressions in index definitions, while another may require the use of computed columns to achieve the same effect.

Dialect-Neutral Converted Code

4GL code which performs data access is converted to a SQL-dialect neutral result. Regardless of which database is configured to run as the back end, an application's converted code is the same. This database independence is achieved by providing a SQL-dialect neutral data access API to converted code.

On the query side, FWD provides a class hierarchy that maps to 4GL data access language statements (e.g., FIND, FOR, etc.). Data Model Objects (DMOs), which represent database tables, are passed by converted code to these query objects, along with other configuration, such as sort specifications and filtering constraints. Record filtering is defined by a 4GL WHERE clause, which FWD converts to a filter expression in a dialect-neutral, object-like query language (FWD Query Language, or FQL). FQL is modeled largely on Hibernate's (www.hibernate.org) query language HQL.

Once a FWD query object is configured with all this information, it executes the query against the backing database. At this point, the FWD runtime assembles a full FQL query statement from the FQL WHERE clause and all of the other information provided. The runtime, which is aware of the specific type of database configured as the back end, then transform that FQL statement into a dialect-aware SQL statement.

Prerequisite Database Features

FWD requires a database to support certain features. If a database does not support all of these features, then the effort to add support that database will be greater, at best. At worst, compatibility, reliability, and performance may be affected adversely when using that database.

Key features are:

  • a solid, performant JDBC driver;
  • modern SQL support:
    • subqueries/subselects;
    • joins (inner and outer);
  • control over locale and encoding;
  • User Defined Functions (UDFs) support (SQL and a performant PL);
  • sorting/ordering customization:
    • fine-grained control over text data collation, preferably database- or cluster-wide;
    • ability to sort NULLs high;
  • multi-column indices;
  • expressions or computed columns in indices;
  • data types used by FWD, including BLOB/CLOB.

Implementing Support For a New Database

Adding support for a new database involves changes within FWD's runtime environment and within code that will be loaded into the back end database. In no particular order, the following are needed:

  • Implement a concrete com.goldencode.persist.dialect.Dialect subclass.
  • Plan data type mappings.
  • Ensure DDL generation works for the new dialect.
  • Ensure generated/converted indices are appropriate for the dialect (i.e., some dialects may need to wrap text columns in upper/rtrim functions; some may accomplish the same goal with computed columns; others may offer data types which preclude the need for these).
  • Implement a set of UDFs in SQL and/or a performant procedure language (PL) native to the database.
  • Ensure all target locales and character encodings are supported by the database.
  • Determine how custom text collation can be achieved and implement it.
  • Determine issues, invent workarounds or document limitations.
  • Test. Test. Test.

The com.goldencode.persist.dialect.Dialect Abstract Class

This abstract class provides an API used by the FWD runtime to determine how all dialect-specific aspects of a database are managed when using JDBC to interact with a database. All abstract methods of this class must be implemented by a concrete subclass, to provide information about how a particular database implementation handles certain features in a dialect-specific way. The implementations of these subclasses affect the way the FWD runtime uses SQL syntax to express queries, creates, updates, deletes, etc. for the associated database.

This class may be expanded to handle additional syntax and idioms, as the list of FWD's supported databases grows.

Examples of concrete Dialect implementations in FWD:

P2JPostgreSQLDialect
MariaDbDialect
P2JSQLServer2008Dialect
P2JH2Dialect

User Defined Functions

It is possible in 4GL code to embed built-in functions in the WHERE clause of a data access statement. These are converted in one of two ways:

  • if the function DOES NOT reference the buffer which is the target of the data access statement (i.e., the "current" buffer), the built-in function (or the sub-expression enclosing it) is factored out of the WHERE clause and becomes a query substitution parameter;
  • if the function DOES reference the current buffer, the function is converted inline, as part of the WHERE clause.

Consider this example of a data access language statement with an embedded, built-in function in its WHERE clause:

DEFINE VARIABLE myVar AS INTEGER INIT 75.
FIND FIRST customer WHERE STRING(myVar) = "75".

In this case, the STRING function does not reference the current buffer (i.e., customer), so STRING(myVAR) is factored out of the WHERE clause and becomes a substitution parameter:

new FindQuery(customer, "? = '75'", null, new Object[] { myVar }, ...).first();

The expression myVar is evaluated once, and the result of its evaluation is passed into the FindQuery as a query substitution parameter when the query is executed.

Now consider this example:

FIND FIRST customer WHERE STRING(customer.idNum) = "75".

Assuming customer.idNum represents an integer field, this case will convert differently:

new FindQuery(customer, "toString(customer.idNum) = '75'", ...).first();

In this case, the customer.idNum must be converted to a string for each record against which the WHERE clause criterion is evaluated. This must occur on the database server. The 4GL STRING built-in function is converted to reference a User Defined Function (UDF) named toString, which is assumed to be defined in the database.

When adding support for a new database, a facility must be present to define UDFs, such as toString. Most modern RDBMS' have some form of CREATE FUNCTION statement which permits the definition of UDFs.

There are numerous 4GL built-in functions which must be ported as UDFs to each new database supported by FWD. Some UDF implementations are relatively straightforward, and can be expressed using simple SQL syntax, which is commonly supported by database vendors for the creation of functions. Other UDFs are more challenging to implement, and require more advanced features, like control flow capabilities, the definition of variables, a library of useful built-in functions upon which to build logic, etc. Ideally, the database supports a procedure language (PL) which offers these capabilities.

Some database implementations permit the use of lower level languages, like, C, C++, and Java to define UDFs; however, these often are problematic in terms of deployment. At minimum, they complicate deployment, and at worst, they are disallowed by many platforms (e.g., cloud offerings) because of the security risks posed by such languages.

Locales and Character Encodings

Custom Text Collation

Limitations and Workarounds

Testing