Data Definition Language (DDL)¶
During M0 part of the conversion process, the .df
files with the schema for the permanent tables are used to generate the DMO files - the DMO interface, the Hibernate mapping file and the DMO implementation class. At the end of the M0 part and using the information in both the .df
files and in the generated DMOs, there will be generated index, table and dialect specific DDL (for each of the dialects configured in the p2j.cfg.xml
file). This DDL can be used to create an empty database in which the data can be imported or identify and extract DDL for permanent tables added to the 4GL schema after the first production release of the converted application.
Near the end of M0 part, the DDL is generated by executing the schema/generate_ddl
rule set and using the settings in the p2j.cfg.xml
file. At the very of M0 part, the sequence DDL are appended to output file by schema/generate_seq_ddl
. Also, the generate_ddl
and generate_seq_ddl
rule sets can be executed from command line at any time, after at least one M0 conversion run was finished successfully. For more information about which commands can be used to generate the DDL, please see the Data Import section of the Data Migration chapter in the FWD Conversion Handbook.
Currently, only postgresql
and h2
dialects are supported for permanent tables and DDL can be generated only for these dialects. To generate DDL for one or both dialects, each namespace referring a physical database defined in the p2j.cfg.xml
file must be configured, to specify the required dialects; please see the Global Configuration section of the Project Setup chapter in the FWD Conversion Handbook for more details on how to configure the p2j.cfg.xml
file.
As noted above, the DDL generation is done using as input the generated DMOs and definition data in the .df
files. This is done in 3 steps: generate index and identity sequence DDL, generate table DDL and generate other dialect-specific DDL; most of the index and table DDL is generated using Hibernate's schema and index export tools. From this point on, we will detail each step explaining how DDL is generated for permanent databases, using the postgresql
and h2
dialects. Note that no DDL is generated for the temporary tables as they are created at runtime, when the temporary table is first used.
Generating Index DDL¶
The index data is collected from the ADD INDEX
clauses in 4GL schema file. The converted index name is built by appending the idx__
prefix to its converted, compatible SQL name. If the schema hints file doesn't have a drop
annotation for this index, it will be processed and DDL will be emitted to the schema_index_{schema_name}_{dialect}.sql
file in the $P2J_HOME/ddl/
folder. For each ADD INDEX
clause, two SQL statements will be emitted: a DROP
statement to remove the index (in case if it already exists) followed by a CREATE INDEX
statement to create the actual index. The generated DDL will always group the statements by each table: first there will be a group with the DROP
statements which will remove all indexes created for this table, followed by a group of CREATE INDEX
statements to create the indexes for that table.
Although the ADD INDEX
clause supports several parameters in 4GL, during DDL generation only the UNIQUE
parameter is of concern, to mark an index unique or not. If an index is determined to not be unique then the id
column will be automatically added to the index, as the last column in the list. The PRIMARY
parameter is not needed for index DDL generation as the converted tables will have their primary key set on the id
column of the converted table. Any UNIQUE
PRIMARY
index defined in the 4GL schema will be converted to an UNIQUE
index.
In the 4GL schema file, the index components are defined using the INDEX FIELD
command, following to the ADD INDEX
clause. Each INDEX FIELD
will have as parameter the 4GL field name followed by the direction of the sort: ASC
for ascending and DESC
for descending. By default, if no ASC
or DESC
parameter is present, all index fields are sorted ascending.
PostgreSQL Dialect¶
For each defined 4GL index, following DDL statements are emitted, to drop and then create the index:
DROP INDEX <index_name>; CREATE [UNIQUE] INDEX <index_name> ON <table_name> ({ [ <column_name> | <formula> ] [ASC | DESC] },...);
In the above statements, the index_name
, table_name
and column_name
always represent the converted name of the index, table and column. If the column is a character column, a formula will be emitted instead: this is needed to comply with the 4GL case-sensitivity and white space trimming rules. When a character column is encountered, the formula will be emitted using one of the following rules:
- if this is a case-sensitive character column, the column's formula will be set to:
RTRIM(<column_name>, E' \t\n\r')
- if this is a case-insensitive character column, the column's formula will be set to:
UPPER(RTRIM(<column_name>, E' \t\n\r'))
H2 Dialect¶
For each defined 4GL index, following DDL statements are emitted, to first drop and than create the index:
DROP INDEX <index_name>; CREATE [UNIQUE] INDEX <index_name> ON <table_name> ({ <column_name> [ASC | DESC] },...);
In the above statements, the index_name
, table_name
and column_name
always represent the converted name of the index, table and column 4GL names. If the column is a character column, it will always be prefixed with the computed column string, the two underlines, as in __<column_name>
.
When converting an index, the only concern are the character columns which appear in the index definition. These will be collected and, when table DDL is generated, a special computed column will be emitted. With this in mind, whenever character columns are found to be part of the index, the index field will be altered so that it refers the computed column and not the original column. This is done by adding a special prefix to the column name: __
(two underlines), the computed column identifier; the formula for this column will be set with ALTER TABLE
statements, appended at the end of the table DDL file. Please see the next section, Generating Table DDL, about how ALTER TABLE
statement looks.
Identity Sequence DDL¶
After the index DDL is generated, at the end of the file following two statements will be emitted:
DROP SEQUENCE p2j_id_generator_sequence; CREATE SEQUENCE p2j_id_generator_sequence START WITH <start_value>;
These statements create a generator sequence which will be used to generate the ID value when a new record is added to a table. The name for this sequence is the reserved p2j_id_generator_sequence
. When the DDL is generated, if no start value is given for the ID generator sequence, it will default to 10000. When first importing the data, the value for this sequence will be set to the ID for the last inserted record (the maximum ID value, from all records in all tables). Note that at the time of this writing the, these two statements are emitted the same for both both H2 and PostgreSQL dialects, as they don't have dialect-specific structure.
Generating Table DDL¶
The table data is collected from the ADD TABLE
entries in the 4GL schema file. If the schema hints file doesn't have a drop
annotation for this table, it will be processed and DDL will be emitted to the schema_table_{schema_name}_{dialect}.sql
file in the $P2J_HOME/ddl/
folder.
For each ADD TABLE
clause, following SQL statements will be emitted: one or more DROP
statements to remove the table and any extent tables (in case they already exist) and CREATE TABLE
statement(s) to create the main table and any associated extent tables. For each table, the SQL table name is built by converting the 4GL table name to the compatible SQL format; in case of tables associated with extent fields, their name will be built from the SQL table name followed by the __<extent_size>
suffix. Each extent table will always contain all the extent fields of the same size, from a certain table.
From each ADD TABLE
definition, only the 4GL table name is needed during DDL generation. The table columns are converted using as source the ADD FIELD
definition which follow the ADD TABLE
definition. For each field, only the AS data-type
, CASE-SENSITIVE
,DECIMALS
, EXTENT
, INITIAL
and MANDATORY
clauses are used, to determine the SQL data type, to mark the case sensitivity state for character columns, the number of decimal digits for decimal columns, the extent size (for extent fields), the default value and the not-null state of this field. The impact of these clauses is explained in the next table:
ADD FIELD Clause | DDL Equivalent | Details |
---|---|---|
AS data-type |
data_type |
The converted SQL data type, based on the H2 dialect. See the ... TODO: book for more details on how the 4GL data type is converted to their SQL data type, based on the dialect. |
CASE-SENSITIVE |
n/a | Marks a certain character column to be case sensitive. Affects only the DMO implementation class - when a new DMO record is created, all case sensitive fields are marked accordingly. |
DECIMALS |
numeric(<size>, <decimals>) |
For all decimal type fields, the DECIMALS clause provides the number of the decimal digits. In DDL, it is emitted as the second parameter for the numeric data type. |
EXTENT |
extent table | All extent fields with the same size are grouped and emitted in the same extent table. See bellow for more details. |
INITIAL |
n/a | The column's default value is not emitted at DDL, but at the DMO class implementation. Whenever a new DMO is created, all of its fields are initialized to the default values, based on the INITIAL clauses, if they exist. |
MANDATORY |
NOT NULL |
Optional clause, marks a certain field as mandatory. |
Table 1. TODO:
The generated DDL structure of a table without extent fields and without indexed character fields looks like this:
CREATE TABLE <table_name> { id <id_type> NOT NULL, <column_name> <data_type> [NOT NULL], ... PRIMARY KEY (id) }
Here, the table_name
is the converted table name, data_type
is the SQL data type and column_name
is the converted column name. Each converted table will have as primary key the id
column, its name being also a reserved name. The id_type
represents the dialect-specific type of the ID field. For postgresql
dialect, this type is int8
and for h2
dialect, this type is bigint
.
In cases when there are indexed character fields, the
For the DROP TABLE
statement, there is a dialect difference: the IF EXISTS
clause is added only for h2
dialect, while postgresql
doesn't support this clause:
DROP TABLE <table_name> [IF EXISTS];
When a 4GL table contains extent fields, all extent fields of the same size are grouped together in a single table. For each generated extent table, beside the extent table DDL, it is needed a foreign key constraint (to link the child extent table to its parent table) and an explicit index on the parent__id
column; note that the primary key for the extent tables always contains both the parent__id
and the list__index
columns, in this order. Also, in 4GL, the extent columns can never be part of an index. Following are the DDL statements emitted for this extent table:
ALTER TABLE <extent_table> DROP CONSTRAINT <constraint_name>; DROP TABLE <extent_table> [IF EXISTS]; CREATE TABLE <extent_table> { parent__id <id_type> NOT NULL, <column_name> <data_type> [NOT NULL], ... list__index <idx_type> NOT NULL, PRIMARY KEY (parent__id, list__index) } CREATE INDEX <extent_table>_fkey ON <extent_table> (parent__id); ALTER TABLE <extent_table> ADD CONSTRAINT <constraint_name> FOREIGN KEY (parent__id) REFERENCES <table_name>;
where:
- the
IF EXISTS
clause for theDROP TABLE
statement ish2
specific (is not emitted forpostgresql
). id_type
represents the dialect-specific type of the ID field. Forpostgresql
dialect, this type isint8
and forh2
dialect, this type isbigint
.idx_type
represents the dialect-specific type of thelist__index
field. Forpostgresql
dialect, this type isint4
and forh2
dialect, this type isinteger
.extent_table
is the converted name of the extent table, which has the<table_name>__<extent_size>
format. Here, thetable_name
is the converted 4GL table name andextent_size
is the size of the extent fields which are grouped in this table.parent__id
is a special field which holds the ID of the parent record to which the data for the extent fields belongs.list__index
is the position of this row in the extent list; it can take values only from 0 toextent_size - 1
- the
parent__id
andlist__index
columns will always form the primary key for any extent table. - an explicit index is created on the
parent__id
column. constraint_name
is the name of the constraint which links the child extent table to its parent table. On DDL generation, the name will have a form like FKC9CB32BFB4620D8.column_name
is the converted name of the 4GL extent field anddata_type
is the dialect-specific data type of the 4GL extent field. The generated DDL for a column in this table follows the same rules as the non-extent columns.
At the end of the file the DDL for creation of sequences are appended. Because sequences are database dependent they are described below, for each dialect.
PostgreSQL Dialect¶
The DDL statements emitted for sequences look like this:
DROP SEQUENCE IF EXISTS <seq_name>; CREATE SEQUENCE <seq_name> INCREMENT <inc> [ MINVALUE <min> ] [ MAXVALUE <max> ] START <init> [CYCLE] CACHE 1;
where the CYCLE
is only emitted for cycling sequences, and MINVALUE
and MAXVALUE
if they are defined into the export file.
Except the ID data type presented above, there are no other dialect specific DDL emitted for PostgreSQL, related to table creation.
H2 Dialect¶
The DDL statements emitted for sequences look like this:
DROP SEQUENCE IF EXISTS <seq_name>; CREATE SEQUENCE <seq_name> START WITH <init> INCREMENT BY <inc> CACHE 1;
The H2 support for sequences is rather limited, only linear (non cycled) sequences can be handled and there is no support for bounds. These 4GL features will be handled by special code in FWD.
Indexed Character Columns¶
As H2 doesn't allow formulas in the index definition, the only solution is to add a copy of the character column which will have the correct formula set and will be used by the index, so that the index behaves as expected. For each character column collected during index DDL generation, another special column will be added to the table, with its name built using the original column name to which the special computed column prefix, __
(two underlines), is added. So, each definition of an indexed character column will be followed by the definition of its associated computed column:
__<column_name> varchar [NOT NULL]
Before Hibernate loads the mapping file for each DMO, the conversion runtime checks if computed columns are needed and inserts them into the mapping file for this DMO. Note that the new computed columns will not have their formulas set at the CREATE TABLE
statement, as the Hibernate mapping file doesn't support formulas to be added to the column's definition. The solution is to alter the column definition: after the DDL for all tables was generated, ALTER TABLE
statements will be emitted to set the correct formula to the computed columns, following these rules:
- if this is a case-sensitive character column, the column's formula will be set to:
RTRIM(<column_name>, ' \t\n\r')
- if this is a case-insensitive character column, the column's formula will be set to:
UPPER(RTRIM(<column_name>, ' \t\n\r'))
The syntax of the ALTER TABLE
statement which sets the computed column's formula is this:
ALTER TABLE <table_name> ALTER COLUMN __<column_name> <data_type> as <formula>;
where table_name
and column_name
are the converted 4GL table and column names and data_type
is the SQL data type. Also, formula
is one of the two expressions presented above.
As the extent columns can't be part of an index, computed columns will never be generated for extent character columns.
Other DDL¶
PostgreSQL Dialect¶
At the time of this writing, PostgreSQL requires DDL to be manually maintained only for the server-side database functions. The file which contains this DDL is located at $P2J_HOME/pl/p2jpl.ddr
and contains definitions for all the methods defined in the Functions
, Operators
and ErrorHandler
classes in the com.goldencode.p2j.persist.pl
package.
The p2jpl.ddr
file is a special file following the structure required by the PL/Java module for the PostgreSQL server and is an implementation of the SQLJDeploymentDescriptor, used by p2jpl.jar
. Basically, this file contains two sections, one to install the server-side functions (which contains CREATE FUNCTION
statements) and one to remove the server-side functions (which contains DROP FUNCTION
statements). Please see the PL/Java documentation for more details about the structure of this file.
H2 Dialect¶
For H2 dialect, some other statements need to be emitted in the table DDL file. These statements are related to the database collation and server-side database functions.
Database Collation¶
For H2, the database files are created when the database is first accessed and there is no external command which creates the database and sets its collation. So, the following statement needs to be executed before any table is created (note that only en_US_FWD
collation is supported at this time):
set collation en_US_FWD;
For convenience, after table DDL generation, the conversion rules read the H2 table schema file and, after the drop table section, before any table is created, the SET COLLATION
statement is emitted.
Server-Side Database Functions¶
For H2 dialect, the server-side database functions need to be explicitly defined using CREATE ALIAS
statements. They are emitted automatically during DDL generation and is needed because, even if the p2jpl.jar
is set in the Java class path when the H2 server is started, the H2 server doesn't know which functions to load - they need to be created at the same time with the rest of the tables. Definitions are emitted for all the methods defined in the Functions
, Operators
and ErrorHandler
classes in the com.goldencode.p2j.persist.pl
package. All the server-side functions are defined using the following syntax:
CREATE ALIAS IF NOT EXISTS <alias> FOR “<class_name>.<method_name>(<data_type>,...)”;
where:
alias
is the server-side function aliasclass_name
is the package-prefixed Java class name which defines the methodmethod_name
is the Java method namedata_type
is the list of parameter data types
DDL with Foreign-Keys Enabled¶
When the foreign-keys
global parameter of the p2j.cfg.xml
file is set to true, foreign key constraints will be emitted for all the natural joins encountered throughout the application code. When found, a special column will be emitted in the child table, which will have the role to refer the record ID in the parent table. This special column will have the name built from the parent table's converted name to which the __fk
suffix is added and will always be emitted at the end of the column list of the CREATE TABLE
statement. Also, this column will be indexed and a the foreign key constraint will be created for it, as in:
ALTER TABLE <child_table> DROP CONSTRAINT <constraint_name>; CREATE TABLE <child_table> { id <id_type>, ... <foreign_column> <id_type>, PRIMARY KEY (id) } CREATE INDEX <index_name> ON <child_table> (<foreign_column>); ALTER TABLE <child_table> ADD CONSTRANT <constraint_name> FOREIGN KEY (<foreign_column>) REFERENCES <parent_table>;
where:
child_table
is the name of the child table, determined from the natural join by the conversion rules.parent_table
is the name of the parent table, determined from the natural join by the conversion rules.foreign_column
is the name of the new foreign key column, with its name built from the parent table name to which the__fk
suffix is added, as in<parent_table>__fk.
id_type
is the dialect-specific data type used for ID columns,int8
forpostgresql
dialect andbigint
forh2
dialect.index_name
is the name of the index on the foreign key column. This has a name built from the child table name and an unique ID, likeidx_<child_table>_<unique_ID>
constraint_name
is the name of the constraint which links the child extent table to its parent table. On DDL generation, the name will have a form likeFKC9CB32BFB4620D8
.
Note this feature is disabled by default and is not recommended to be used in a production environment. This is because it may cause deadlocks at runtime, depending upon the order in which an application updates or deletes records in tables which participate in foreign key relations.
© 2004-2017 Golden Code Development Corporation. ALL RIGHTS RESERVED.