Project

General

Profile

Non-Standard SAVE CACHE Implementation

SAVE CACHE is a 4GL language statement which caches the schema of a database to file. This statement is implemented differently in FWD. Instead of caching a schema to a file, this statement is repurposed by FWD to act as a notification to the FWD runtime that a database's schema has changed. The FWD runtime uses this notification to update its internal state, such that the new or updated database schema can be used when creating, reading, updating, and deleting records, using converted, dynamic, 4GL database features.

The implementation is intended to pick up both live updates to a mutable database, as well as updates made to a mutable database while the FWD server is offline. For example, a column may be dropped from a table, or an index added. For schema changes made while the FWD server is live, a SAVE CACHE statement must be executed after the schema change is committed. For changes made while the server is offline, an implicit SAVE CACHE operation will be executed when the server next starts.

In both cases, the change must be legal, according to the requirements described below. If not, a runtime error condition will be raised, and FWD's runtime state will not correctly reflect the change (i.e., a correct data model object (DMO) for that table will not exist in memory), and CRUD operations can not be run against the affected table.

Please note the following characteristics of the FWD implementation of SAVE CACHE:

  • The CURRENT and COMPLETE options do not distinguish behavior in this implementation; they are interchangeable. Regardless of which option is specified, the entire database schema is scanned.
  • The logical database name is needed to identify the database whose schema is to be scanned, but any statement parameters after the database name are ignored.

This work was done in #6371.

Background

The FWD implementation of SAVE CACHE is intended for limited use cases. Specifically, certain databases can be flagged as mutable in runtime configuration. These are databases which applications may use to provide users with user-defined data storage. At the time of this writing, only databases configured as mutable will work with this feature. If SAVE CACHE is executed for non-mutable databases, it is simply a no-op. For more information, see Configuration below.

It is the application's or an administrator's responsibility to edit the schema of a mutable database. At this time, using 4GL code to edit the 4GL database metadata (e.g. _file, _field) is not supported. A database might be edited while the application is running, in response to a user's action, through scripting, or in general through some means external to the converted code of the application. How the mutable database's schema is updated is outside the of scope of FWD's SAVE CACHE implementation. The SAVE CACHE implementation is limited to detecting and reacting to those changes, in order to enable the converted application to be used to dynamically access the mutable database's schema.

A mutable database's schema can be modified outside of the FWD runtime, or by executing DDL using the runtime at prescribed times while the application is running. However the edits are made, it is important that edits to the schema not be made while the application is actively using the database! The SAVE CACHE statement is used by the application to notify FWD that a scan of the database is needed. The scan provides the FWD runtime with the most up-to-date representation of the database. Such a scan is also done of all mutable databases at server startup, to ensure any changes made while the FWD server was offline are picked up.

Upon notification that a mutable database's schema has changed, FWD initiates a scan of that database using JDBC. JDBC's metadata API is used to gather information about all tables and indices of the mutable database, including information about every column in each table. Since JDBC metadata is not aware of many 4GL schema constructs (e.g., format phrases, validation expressions and messages, UI-related labels, etc.), certain legacy information present in a 4GL schema, which normally would be made available in FWD through a static conversion, is not available through this feature. Furthermore, the database schema is expected to follow certain conventions, in order for its tables, columns, and indices to work with the FWD runtime. If those conventions are not followed, the execution of the SAVE CACHE statement will result in ERROR conditions being raised at runtime. These must be resolved before the database can be used with actual CRUD operations.

Syntax

A fragment in ABL like:

&if defined(fwd-version) > 0 &then
save cache complete MyModifyableDbName to save_cache_dummy.txt.
&endif

Will (as of current, 2024-03-29) convert to:
SchemaCheck.run(new character("MyModifyableDbName"));

Mutable Database Requirements

A mutable database is expected to conform with certain requirements, in order to work with the FWD runtime. These are the same requirements applied during a static conversion. These include:

  • the presence of a surrogate primary key;
  • any UDFs (User Defined Functions) supplied by FWD must be loaded;
  • adherence to well-known data type mappings;
  • use of SQL DEFAULT constraints to specify initial field values;
  • the existence of an ID generator sequence;
  • the existence of a meta_user table, optionally containing records;
  • the surrogate primary key must not be used in indices, except at the end of non-unique indices.

Surrogate Primary Key

FWD expects every table to have a surrogate primary key; that is, a unique primary key with no business meaning. This key is used by FWD for internal identification purposes and is exposed to converted 4GL code as a record's recid/rowid value. In FWD, this is an 8-byte integer, and its value for each record is unique across the database. The default name of this column is recid, though it can be configured to any name which is known to be unique among all fields across the schemata used by an application. recid was chosen as a default name, since it is a reserved keyword in the 4GL, but not in standard SQL; thus, it will not conflict with the names of any existing field names in a 4GL schema. This column must be defined as the primary key of a table; thus, it is expected to be the only component of a unique, primary constraint/index.

An error will be raised during the execution of the SAVE CACHE statement if this column does not exist in every table; or if it is not defined as an 8-byte integer; or if it is not the only component of a unique constraint/index.

UDFs

4GL queries can embed 4GL builtin functions in their WHERE clauses. When FWD executes the converted forms of these queries, the builtin functions are represented by user-defined functions (UDFs). UDFs are implemented in SQL or in a SQL procedure language supported by the database dialect in use. The standard UDFs representing the 4GL builtin functions must be installed in a database in order for these queries to work. The UDF implementation scripts are stored in the deployed p2j.jar file.

For additional detail, please refer to Native UDFs for PostgreSQL.

UDFs can be loaded into a database in several ways. The recommended and generally easiest way is to use a FWD utility called the ScriptRunner. Here is an example for MariaDB:

java -classpath <path to p2j.jar> com.goldencode.p2j.persist.deploy.ScriptRunner
                                  <JDBC URL>
                                  <username>
                                  <password>
                                  udf.install

udf.install is the command which tells the ScriptRunner to install the UDFs.

Another form of the command for PostgreSQL:

java -classpath <path to p2j.jar> com.goldencode.p2j.persist.deploy.ScriptRunner
                                  <JDBC URL>
                                  <username>
                                  <password>
                                  udf.install.search_path

The udf.install.search_path tells the ScriptRunner to first install the UDFs, then set the database's search path. In PostgreSQL, the UDFs are installed in a separate schema from the main database. This schema is named udf. The search path is set to allow the UDFs to be resolved in query statements without the udf schema qualifier.

If the ScriptRunner approach is not suitable for your deployment, some additional UDF installation options follow...

Here is an example on how to load the UDFs in an Ant target, for postgres persistence:

    <macrodef name="fwd.load.udfs">
        <attribute name="db.name"/>
        <sequential>
            <unzip src="${application.dir}/server/lib/p2j.jar" dest="${temp.dir}">
                <patternset>
                    <include name="udf/postgresql/udfs.sql"/>
                    <include name="udf/postgresql/words-udfs-sql.sql"/>
                </patternset>
            </unzip>
            <pg.db.psql.file db.name="@{db.name}" user="fwd_admin" file="${temp.dir}/udf/postgresql/udfs.sql" verbose="true"/>
            <pg.db.psql.file db.name="@{db.name}" user="fwd_admin" file="${temp.dir}/udf/postgresql/words-udfs-sql.sql" verbose="true"/>
            <delete includeemptydirs="true">
                <fileset dir="${temp.dir}" includes="udf/postgresql/udfs.sql,udf/postgresql/words-udfs-sql.sql"/>
            </delete>
        </sequential>
    </macrodef>

Here is an example on how to load the UDFs in a bash shell script, for postgres persistence:

#!/usr/bin/env bash
mkdir udf-tmp
cd udf-tmp
cp -p ${P@J_HOME}/lib/p2j.jar .
unzip p2j.jar udf/postgresql/udfs.sql udf/postgresql/words-udfs-sql.sql
PAGER= PGPASSFILE=~/.pgpass psql "sslmode=require host=localhost user=postgres dbname=somedatabase" -c '\i ./udf/postgresql/udfs.sql'
PAGER= PGPASSFILE=~/.pgpass psql "sslmode=require host=localhost user=postgres dbname=somedatabase" -c '\i ./udf/postgresql/words-udfs-sql.sql'
cd ..
rm -rf udf-tmp
#EOF

Here is an example result (as of current, 2024-03-29), for postgres persistence:

PAGER= PGPASSFILE=~/.pgpass psql "sslmode=require host=localhost user=postgres dbname=somedatabase" -c '\df udf.*'

returns (abbreviated):

 Schema |             Name             |      Result data type       |                                Argument data types                                 | Type
--------+------------------------------+-----------------------------+------------------------------------------------------------------------------------+------
 udf    | addinterval                  | date                        | initial date, amount bigint, unit text                                             | func
 udf    | addinterval                  | date                        | initial date, amount integer, unit text                                            | func
 udf    | addinterval                  | timestamp without time zone | initial timestamp without time zone, amount bigint, unit text                      | func
 udf    | addinterval                  | timestamp without time zone | initial timestamp without time zone, amount integer, unit text                     | func
 udf    | begins                       | boolean                     | a text, b text                                                                     | func
 udf    | begins                       | boolean                     | a text, b text, cs boolean                                                         | func
 udf    | caps                         | text                        | t text                                                                             | func
 udf    | checkerror                   | boolean                     | init boolean, result boolean                                                       | func
 udf    | concat                       | text                        | a text, b text                                                                     | func
 udf    | datespan                     | bigint                      | d1 date, d2 date                                                                   | func
 udf    | divide                       | numeric                     | x bigint, y bigint                                                                 | func
...
 udf    | tostring                     | text                        | v numeric, fmt text                                                                | func
 udf    | tostring                     | text                        | v text                                                                             | func
 udf    | tostring                     | text                        | v text, fmt text                                                                   | func
 udf    | tostring_                    | text                        | v text, fmt text                                                                   | func
 udf    | totimestring                 | text                        | v bigint, fmt text, tz integer                                                     | func
 udf    | trimws                       | text                        | s text                                                                             | func
 udf    | trimws                       | text                        | s text, trimchars text                                                             | func
 udf    | words                        | text[]                      | s text                                                                             | func
 udf    | words                        | text[]                      | s text, toupper boolean                                                            | func
 udf    | words                        | text[]                      | s text, toupper boolean, forcaseinsensitive boolean                                | func
(442 rows)

Minimum of One Column (in Addition to the Primary Key Column)

A table must have at least one column that is not the surrogate primary key column.

Data Type Mappings

Permitted data type mappings vary by database dialect. If the schema check performed by the FWD SAVE CACHE implementation discovers data types in use other than those listed here, an error condition is raised and the database will not be usable by the FWD runtime.

In cases when a database type maps to multiple 4GL types, it is possible to request a specific type. This is done by adding a comment to the database column, describing the required type mapping. Example of a valid mapping of text field into comhandle (instead of the default character) in PostgreSQL:

COMMENT ON COLUMN example_table.example_column IS 'Type: comhandle';

If no comment is specified, column type will be mapped to a default 4GL type, marked in bold in the following tables.

Supported annotations
Annotation Notes
type options available for individual types are listed in the next table
case-sensitive TRUE, FALSE, 1, 0 are valid options
TIMEZONE TODO
EXTENT TODO
PostgreSQL Dialect
SQL Type(s) 4GL Type(s) Notes
integer, int4 integer, recid
bigint, int8 int64, handle, object, rowid See also Surrogate Primary Key
numeric(50, X) decimal X is scale, max 10
boolean logical
text character, clob, comhandle
oid blob
date date
timestamp datetime
timestamp with time zone datetimetz
bytea raw
MariaDB Default (Strict) Dialect
SQL Type(s) 4GL Type(s) Notes
integer integer, recid
bigint int64, handle, object, rowid See also Surrogate Primary Key
decimal(50, 10) decimal
boolean logical
text character, clob, comhandle
blob blob
date date
datetime(3) datetime
timestamp(3) datetimetz
varbinary(#) raw

Note: the above mappings are placeholders; the MariaDB strict dialect implementation is not currently correct.

MariaDB Lenient Dialect
SQL Type(s) 4GL Type(s) Notes
integer integer, recid
bigint int64, handle, object, rowid See also Surrogate Primary Key
decimal(50, 10) decimal
boolean logical
text character, comhandle
varchar(#) character
mediumtext clob
blob blob
date date
datetime(3) datetime
timestamp(3) datetimetz
varbinary(#) raw

Initial Field Values

Initial values for newly created records are specified using the SQL DEFAULT constraint. For example (PostgreSQL dialect):

CREATE TABLE example (
   recid bigint not null,
   int_test integer default 10,
   txt_test_a text default 'hello',
   txt_test_b text,
   txt_test_c text default '',
   ...
)

When a converted SAVE CACHE statement is executed, the DEFAULT constraints detected in the database's schema are written into the generated DMO for a table, as its initial property values. This is the equivalent to how the INITIAL keyword is treated when statically converting a schema (*.df) file.

As a consequence, in the example above, txt_test_b will get an initial value of null (in most database flavors), while txt_test_c will get an initial value of 'empty string'. The latter must be used if you need to have compatibility with your OpenEdge runtime business logic. You need to apply this to all OpenEdge data types that behave differently in your FWD database flavor.

When FWD creates a new DMO instance in response to a converted 4GL CREATE statement, those initial values are set into the DMO's properties in memory. They may be overwritten by subsequent assignments from business logic. When it is time to flush the new DMO to the database, a SQL INSERT statement specifying the values for all columns of the table is executed. Thus, the DEFAULT constraints are never actually triggered at the database level, since no column is allowed to default. They are managed only within FWD.

Note that the specification of DEFAULT constraints for a mutable database is a departure from the DDL generated for statically converted (i.e., non-mutable) databases. There are no such constraints specified for the latter, and it is recommended that such constraints not be added post-conversion. Doing so could throw the statically converted DMOs out of sync with the actual database schema.

This discrepancy of DEFAULT constraint use for a mutable database, but not for a non-mutable database is actually meant to keep the FWD runtime consistent with the backing database's schema.

Consider that for a non-mutable database, the generation of DMOs and the corresponding DDL is driven from a converted schema (*.df) file. The DDL is generated without DEFAULT constraints, so the database schema has none. The FWD runtime is responsible for setting all initial values in new records. If such constraints were added to the schema after conversion, they could potentially conflict with the initial values used by the FWD runtime.

For a mutable database, there is no *.df file to specify the initial values; these instead are drawn from the DEFAULT constraints read during a scan of the mutable database's schema. In this case, the DEFAULT constraints are the authoritative source of the initial value information saved in the generated DMOs.

When specifying DEFAULT constraints for a mutable database, take care to use the appropriate value for the database dialect in use. For example, SQL databases may not accept yes as a default value for a boolean column, but may instead expect true or t or 1.

A current limitation of this approach is that 4GL builtin functions, such as today, now, and time are not supported properly. These would need to be mapped to the equivalent SQL functions (to the extent they exist) for each supported database dialect. This is necessary to allow a CREATE TABLE statement containing such a DEFAULT constraint to execute properly for each dialect, while allowing the SAVE CACHE statement implementation to recognize those dialect-specific functions and map them to the equivalent 4GL builtin function.

EXTENT types

Fields defined with an extent property (example_column as int extent 5.) are also supported. Fields generated in a separate table (normalized mode) are detected automatically and do not require further input.

Fields generated as columns in the same table (denormalized mode) are detected automatically for extents >= 3. So, example_column as int extent 3. (generated as example_column_1, example_column_2, example_column_3 in the database) will also be detected automatically. For extent 2 it is expected that the first column will have a clarifying comment:

COMMENT ON COLUMN example_table.example_column_1 IS 'Extent: 2';

Reasoning: it would be a rare case when an extent 2 is used and we wanted to avoid confusion with unrelated columns with a number suffix. Having 3 columns in a row is the least pattern that can reliably detected.

Further more, if there is a need to use columns with numbered suffixes and they are not a part of a generated extent field, the first column can be annotated with extent 0:

COMMENT ON COLUMN example_table.example_column_1 IS 'Extent: 0';

This will cause this column (and any other column with higher suffix) to not be handled as an extent type.

ID Generator Sequence

The mutable database must have a sequence named p2j_id_generator_sequence. The sequence:

  • should not cycle;
  • should have an interval of 1;
  • should have a starting value of:
    • 0, if there is no data in the database; or
    • the value of the highest primary key value of all records in the database.

This sequence will be used by the FWD runtime to get the next surrogate primary key when creating new records.

The database role which is used for all normal CRUD access to the database must have permission to get the current and next value of the sequence.

Migrated _User Metadata Table

If the application running under OE relied upon a _User metadata table for its security implementation, each migrated database used by FWD must contain the migrated form of this table, named meta_user. A mutable database is no exception. This table is expected to exist when a user connects to the database; at minimum the FWD runtime will check the number of records that exist in the table (which may be 0). The schema of the meta_user table may be copied exactly from one of the application's primary, migrated databases. It is up to the database administrator to create the user records needed (if any), presumably by copying them from a migrated, primary, application database.

If the requirements above are not met, the schema check performed by the SAVE CACHE implementation will raise an error condition, and the database will not be usable with the FWD runtime.

Configuration

Databases whose schemata may be edited outside of normal conversion, and which are to be used with the SAVE CACHE feature must be configured as mutable for FWD to recognize them. A database will only be scanned at runtime in response to the converted SAVE CACHE statement if it is configured as mutable. This distinguishes database with an editable schema from normal operational databases.

A database marked mutable will be scanned when it is first loaded, after the FWD server's startup. A dynamically generated pair of DMO interface and implementation classes will be created in the FWD server for every valid table discovered in a mutable database. Regular databases must thus NOT be marked mutable, to prevent duplicate sets of DMO interfaces and implementation classes for tables in those databases, which already were created during static conversion. At best, this would waste memory, but it may also cause operational problems.

Conversion

To mark a database schema as mutable for conversion purposes, the mutable attribute was introduced to the namespace element in the schema section of the cfg/p2j.cfg.xml file. The mutable attribute is false by default, so it only should be added and set to true for a database which has an editable schema.

For example, the following configuration marks a database schema named extension, which uses the minimal name conversion mode and is loaded by default, as mutable:

   <schema>

      ...

      <namespace
         name="extension" 
         mutable="true" 
         conversion="minimal" 
         default="true">
         <parameter name="ddl-dialects" value="mariadblenient,postgresql" />
      </namespace>

      ...

   </schema>

Note that a schema export file (e.g., extension.df) is NOT needed in the conversion project for a mutable database. The schema portion of conversion will behave as if such a file was present, but empty (i.e., defined no database resources). FWD assumes that all tables, indices, sequences, etc. are defined external to the conversion process, and that these resources follow the requirements outlined in this document.

Runtime

The FWD server must be aware that a database is mutable, if it is to be used with the converted SAVE CACHE statement. The SAVE CACHE statement for a non-mutable database is a no-op. This information must be added to the database's runtime configuration in the directory. To mark a database to work with SAVE CACHE, add a boolean mutable flag to the database's p2j container in the directory, as follows:

        ...
        <node class="container" name="database">
          <node class="container" name="<database_name>">
            <node class="container" name="p2j">
              ...
              <node class="boolean" name="mutable">
                <node-attribute name="value" value="true"/>
              </node>
            </node>
            ...

Current Limitations

  • Built-in database functions to represent 4GL built-ins like today, now, time for the purpose of specifying initial field values as SQL DEFAULT constraints are not well supported at this time.
  • TBD...