Project

General

Profile

Support #4080

AWS Aurora support

Added by Greg Shah about 5 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
Start date:
Due date:
% Done:

0%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

Related issues

Related to Database - Support #5183: database collation WIP

History

#1 Updated by Greg Shah about 5 years ago

A customer has asked this question:

We want to move our customers to AWS. When moving to AWS it would be great to leverage the strength of AWS Aurora (speed and less maintenance). Do you know if this is possible with FWD?

Aurora does not seem to support/allow PL/Java, but i'm not sure about this.

This is the list of extensions that are supported already:

Supported Extensions in Aurora for PostgreSQL Compatibility Mode (and details on how you make a PostgreSQL extension)

I don't see PL/Java there (as of this writing), but there are other similar extensions such as plv8 which runs Javascript code. plv8 would need a Javascript VM/engine to execute udfs so I think technically it is possible to also support PL/Java. But how to get this enabled and/or convince Amazon to support PL/Java is an unknown.

I suspect this is probably the only thing holding us back. We could make changes to support a different conversion output that did not rely upon such UDFs being executed in WHERE clauses. Doing such a thing would cause a major performance hit, so it is not clear that it would be a good idea.

Making a connection from Java is a little different from normal PostgreSQL, but it would not be a big deal.

We certainly have interest in the PostgreSQL compatibility mode for Aurora. The promised 3x performance improvement is quite interesting as is the highly managed nature of it. The automatic replication is also really interesting.

Some links on Aurora:

Amazon Aurora with PostgreSQL Compatibility (this is the original availability announcement)
Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases (this is how they got their 3x performance improvement)

#2 Updated by Greg Shah over 4 years ago

Some thoughts from Eric:

In regard to pljava, as long as this can be selected/enabled for an Aurora instance, we should be OK. The current configuration we implement (the classpath) is just for enabling the extension itself. The p2jpl.jar that we load is handled through SQL.

We also need to consider the custom locale requirement. This is likely only for the customers using 8859-1 (and similar) in the 4GL. If a custom locale is not possible, then I suspect we might need to "fall back" to UTF-8. Customers might have to accept this as a "limitation" when using Aurora. This may mean differences in collation. If this causes real problems in the application, 4GL code changes might be needed for resolution.

Eric did ask if the differences in connection preclude the ability to use something like C3PO for connection pooling.

Assuming we can address the above, then we would need to plan for the following work:

  • Code changes to support Aurora as an alternative database. I don't yet know the extent of this work.
  • Any updates needed to move to a later PostgreSQL release. It seems unlikely that Amazon would implement support for a new extension on older releases.
  • Regression testing to confirm the reliability and functional correctness of FWD with Aurora.
  • Documentation/process for implementing FWD using Aurora.

I also need more details on the high availability aspects to assess the implications of using FWD in such an environment.

#3 Updated by Eric Faulhaber over 4 years ago

Greg Shah wrote:

Some thoughts from Eric:

In regard to pljava, as long as this can be selected/enabled for an Aurora instance, we should be OK. The current configuration we implement (the classpath) is just for enabling the extension itself. The p2jpl.jar that we load is handled through SQL.

To clarify, these are the options we set in postgresql.conf for PL/Java currently:

#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

# Add settings for extensions here

pljava.libjvm_location = '/usr/lib/jvm/default-java/jre/lib/amd64/server/libjvm.so'
pljava.classpath = '/usr/share/postgresql/9.5/pljava/pljava-1.5.0.jar'
pljava.vmoptions = '-Xmx16m -Djava.awt.headless=true'

#4 Updated by Greg Shah over 4 years ago

I spoke with Amazon Principal Database Engineer Jim M. We discussed a range of items which I am documenting here.

1. PostgreSQL Compatibility

Although we had many questions about the features, SQL dialect differences, JDBC driver differences, connections, transactions, locking... the answer here is actually simple and positive. Aurora is fully compatible with PostgreSQL. In fact, it is just the PostgreSQL engine with the storage and caching layers replaced by a cloud native layer. All the upper layers of PostgreSQL are unchanged. This means that from an application perspective an Aurora instance looks exactly like PostgreSQL. You use the same code, same SQL, the same JDBC driver that you would for the corresponding version of PostgreSQL. Aurora supports a range of PostgreSQL versions that map directly to the same versions in the PostgreSQL distribution. There is no difference in how you make a connection, you use the same URLs and approach. Any connection pooling facility that you use (e.g. C3PO) will continue to work the same as with PostgreSQL. All transaction processing and locking is the same as well. There is no difference at the application level.

Even tools like pgdump will work. Database import will work as normal. They support CITEXT and all other common extensions/well supported contrib modules. There are differences in its reliability and scalability/performance profiles but not how it works from an application level.

The statement from Amazon: "if it doesn't drop in and work, its a bug on our part".

2. Locale Options

There is no low level access to whatever OS is running underneath an Aurora instance. This means that the admin has no ability to add custom locales to the system before a cluster is initialized. This means that only pre-existing standard locales can be used to initialize a cluster. There is no other choice and Amazon will not be providing any mechanism to change this. The default locale is UTF-8, default timezone is UTC. The standard list of locales is available. I guess this means the same list we would normally see on Linux.

The implication is that the sorting/collation may deviate from the 4GL depending on the locale. For 8859-1 we would expect there to be differences. Possible solutions to be investigated:

3. PL/Java

This is a "hard no". PL/Java will never be allowed/provided as an extension available in Aurora (or Amazon RDS). The issue isn't related to the virtual machine design of Java. The problem is in the security review/standard which requires a level of sandboxing which Java does not provide. The example they gave was that one can write user defined functions in Java directly access the local platform/filesystem on which Aurora is running. This is not allowed. In contrast the Javascript extension is inherently designed as a sandboxed environment because it is an engine that was designed for use in a web browser.

Our only solution is to rewrite our user defined functions in a supported language (which meets their security requirements).

  • SQL
    • Anything that can be done directly in SQL can be implemented as a create function as ... language sql
    • As noted above for writing operators, this is very fast and is implemented as inlined SQL at parse time.
    • This should be our first choice if the functionality is simple.
    • I don't know what this may do to the query plan, but it probably can't be worse than calling to an external interpreter or VM.
  • PGSQL
    • It is more work to implement this versus JS.
    • This is built-in to every PostgreSQL install. This makes it simple to implement.
    • It would also port to Oracle quite easily because it is close to their procedure language (PL/SQL).
    • It is an interpreted language which is made up of multiple prepared statements, but it also does some amount of caching of prepared statements if possible.
    • We would have to test the performance compared with JS.
  • Javascript
    • The Java code will more naturally convert to JS than to any of the other supported languages.
    • Amazon did not recommend PLV8 because it is hard to setup/support (in a standard PostgreSQL install). Starting in v12, PLV8 is no longer built in the standard PostgreSQL repos.
    • I noted that this is not as big a deal for us because we already deal with this same issue for PL/Java (we must compile, install and setup/configure it manually).
    • On the other hand, if we could get rid of that requirement it would be a big win because it currently makes PostgreSQL more effort to get started.

#6 Updated by Greg Shah over 3 years ago

Also available in: Atom PDF