WebSpeed Preprocessing

Summary

The Progress WebSpeed product provides an application server environment which can be used as a business logic and database access back end for web applications. WebSpeed is a facility that allows requests to a web server to be serviced by Progress 4GL code (which can and normally does use the Progress database).

WebSpeed supports 4 types of programs:

  1. Static HTML. This is pure, valid HTML that has no embedded 4GL source code.
  2. Embedded 4GL (E4GL). This is HTML with embedded 4GL source code in sections delineated by statement (multi-line arbitrary 4GL code) start/end escape sequences or by expression (a single 4GL expression) start/end escape sequences.
  3. CGI Wrapper. This is hand-coded Progress source code that emits HTML.
  4. HTML Mapping. This is static HTML combined with a separate hard coded 4GL procedure that uses a configuration file containing field offsets to map data to/from the static HTML. The merging and enhancement of the static HTML is done at runtime by the WebSpeed code in cooperation with the coded 4GL procedure.

The FWD conversion front end provides an implementation of a WebSpeed Preprocessor that duplicates the results provided in the Progress 4GL. This preprocessor handles cases 1 and 2 above. Cases 3 and 4 do not have any embedded 4GL to preprocess, so they do not need to run through this parser. For this reason, the WebSpeed Preprocessor is equivalent to an embedded 4GL preprocessor (E4GL preprocessor), which is the functionality documented in this chapter.

At first glance, case 1 (static HTML with no E4GL) seems like it can be ignored, since turning it into 4GL is costly from a runtime efficiency perspective. However, when run through this preprocessor, the output of such files is still a 4GL procedure (or include file). Since this procedure or include file can itself be included in other E4GL programs, it is necessary to convert all static HTML and E4GL code into 4GL source code.

Interestingly, the 4GL source code result of this preprocessor is the same as if one had coded a CGI Wrapper (case 3 above) instead of using static HTML or E4GL. There is no difference between the 3 cases at runtime; the only difference is whether this WebSpeed preprocessor is used first.

HTML Mapping (case 4) is not addressed by this preprocessor since the HTML involved in that solution is processed at runtime.

It is believed that the current WebSpeed support in the conversion front end is complete and correct through OpenEdge v10.1B.

However, it is important to note these caveats:

  1. This code has been tested, but not as thoroughly tested as other parts of the conversion front end.
  2. The WebSpeed documentation is inconsistent and appears to be missing important details and content.

As a result, the FWD WebSpeed preprocessor may still need enhancement to be 100% compatible and 100% complete. There are no known features that are unsupported except for the character set issues documented in the Unsupported Features section below.

Approach

The E4GL preprocessor parses an HTML file that has embedded 4GL (E4GL) source code and acts like a filter that converts the result on output into a Progress 4GL program. Any pure HTML is output as 4GL source code that duplicates this HTML via writes to a stream. This is exactly how the Progress E4GL preprocessor operates. In WebSpeed, the resulting 4GL code is compiled and executed using the Progress 4GL runtime in appserver mode. The HTML output which is written to a stream is sent back to the web browser by the web server to which the HTTP request was made.

The E4GL preprocessor uses a specially built lexer and parser to provide this filtering function. The lexer and the parser do not implement a general purpose HTML filter. Instead, they are "dumbed down" from the pure HTML specification to a subset that is supported by the very limited parsing that WebSpeed supports. That parsing ignores certain features of HTML encoding that are valid (just passing those features through to its output). For that reason it is feasible for the FWD E4GL preprocessor to similarly ignore those features.

The lexer reads an HTML file and provides a stream of tokens that can be readily parsed. The central design point of this lexer is for all possible input to be matched and passed through on output. For example, all whitespace (with one exception, see below) will be matched and passed through. The minimum number of tokens needed by the parser is generated.

Most HTML constructs are not directly matched. Instead, this only matches those constructs needed to unambiguously parse embedded 4GL in an HTML page and translate that input into a valid 4GL program. No 4GL constructs are matched here as the parser is not aware of any 4GL syntax. Since the parser is only a filter (preprocessor), it only needs to match on a small subset of HTML syntax such that it can make its state transitions and so forth. Generally, there is a top-level rule or artificial token type for every critical parser decision. Many of these are single character matches but some more complicated multi-character constructs are matched as needed. Every possible ASCII and extended ASCII character can be matched by itself. For any character that does not match a unique top-level rule or artificial token type, the character can be returned as part of whitespace or as a special UNKNOWN token. There are some places in the parser that match on whitespace but the content (token text) is not really important. Anything that is matched as UNKNOWN is simply passed through by the parser grammar rules, though on output the parser may do some transformations as needed to represent 4GL source code (e.g. unsafe character escaping).

Whitespace processing takes into account the platform-specific newline string and maintains the newline counter as appropriate. In the case of a platform's newline of "\r\n", the '\r' character immediately preceding a '\n' character will be dropped. This allows the parser to match on a simple '\n' in that case and on output the full platform-specific newline will be inserted as needed. This also allows the '\r' to be dropped from output if run on a system where the newline string is "\n".
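
The newline handling described above can be illustrated with a minimal Python sketch. This is not the FWD lexer code; the function names are hypothetical and exist only for illustration.

```python
def normalize_newlines(text):
    # Drop each CR that immediately precedes an LF; a lone CR is left untouched.
    return text.replace("\r\n", "\n")


def count_newlines(text):
    # After normalization, every line break is a single LF, so counting is trivial.
    return normalize_newlines(text).count("\n")


def emit(text, platform_newline):
    # On output, each LF is expanded to the platform-specific newline string.
    return normalize_newlines(text).replace("\n", platform_newline)
```

With this approach the same input produces "\r\n" line endings when `platform_newline` is "\r\n" and plain "\n" line endings otherwise, regardless of which convention the input file used.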

The filtering that happens can be customized to match the output of a specific E4GL implementation. At this time both WebSpeed and Blue Diamond implementations are supported.

Supported Features

Statement Escape Sequences

Statement escape sequences allow arbitrary blocks of code to be embedded into HTML. This allows block processing, multiple lines of code and much more logical complexity than can be inserted in a single expression. The following escape sequences are supported (the ...4GL... indicates where the embedded 4GL code will be found):

<SCRIPT LANGUAGE="SpeedScript"> ...4GL... </SCRIPT>
<SCRIPT LANGUAGE="WebSpeed4GL"> ...4GL... </SCRIPT>
<SCRIPT LANGUAGE="Progress"> ...4GL... </SCRIPT>
<!--WSS ...4GL... -->
<!--WS4GL ...4GL... -->
<?WS> ...4GL... </?WS>
<SERVER> ...4GL... </SERVER>
<% ...4GL... %>

HTML entities that appear within blocks of Progress 4GL statements are decoded into the matching characters.
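
As a rough illustration, the following Python sketch pulls the 4GL out of two of the statement escape forms listed above and decodes HTML entities in the result. It is only a sketch under stated assumptions: the real preprocessor uses a purpose-built lexer and parser rather than regular expressions, the case-insensitive matching and the use of Python's standard entity table are assumptions, and `extract_statements` is a hypothetical name.

```python
import html
import re

# Two of the statement escape forms from the list above; the remaining forms
# would follow the same pattern. Case-insensitive matching is an assumption.
STATEMENT_FORMS = [
    re.compile(r'<SCRIPT\s+LANGUAGE="SpeedScript">(.*?)</SCRIPT>',
               re.IGNORECASE | re.DOTALL),
    re.compile(r'<!--WSS(.*?)-->', re.DOTALL),
]


def extract_statements(page):
    # Returns the decoded 4GL blocks, grouped by escape form (not in page order).
    blocks = []
    for form in STATEMENT_FORMS:
        for match in form.finditer(page):
            # html.unescape turns entities such as &lt; back into characters.
            blocks.append(html.unescape(match.group(1)).strip())
    return blocks
```

For example, a block written as `<!--WSS IF a &lt; b THEN x = 1. -->` yields the 4GL text `IF a < b THEN x = 1.`.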

Expression Escape Sequences

For a single expression that returns a result that can be input inline into the HTML, expression escape sequences are useful. The following expression escape sequences are supported (the ...4GL... indicates where the embedded 4GL expression will be found):

<!--WSE ...4GL... -->
` ...4GL... `
{= ...4GL... =}
<%= ...4GL... %>

HTML entities that appear within 4GL expressions are decoded into the matching characters.
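
To show how expression escapes divide a line into literal HTML and 4GL expressions, here is a small Python sketch. It assumes a simple non-greedy regular expression is sufficient, which will not hold for every input the real parser accepts. `split_expressions` and the "html"/"expr" labels are hypothetical.

```python
import re

# One alternative per expression escape form listed above.
EXPRESSION = re.compile(
    r"`(.*?)`|\{=(.*?)=\}|<%=(.*?)%>|<!--WSE(.*?)-->", re.DOTALL
)


def split_expressions(line):
    # Split a line into ("html", text) and ("expr", 4GL expression) parts.
    parts = []
    pos = 0
    for match in EXPRESSION.finditer(line):
        if match.start() > pos:
            parts.append(("html", line[pos:match.start()]))
        # Exactly one group is set, depending on which escape form matched.
        body = next(g for g in match.groups() if g is not None)
        parts.append(("expr", body.strip()))
        pos = match.end()
    if pos < len(line):
        parts.append(("html", line[pos:]))
    return parts
```

A preprocessor built this way would emit the "html" parts as string literals and the "expr" parts as expressions whose values are written to the output stream.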

4GL Preprocessor Names

The list of 4GL preprocessor names that are provided for WebSpeed compatibility can be found in the Preprocessor chapter under the section on { } Preprocessor Name Reference.

META and WSMETA Elements

The E4GL preprocessor parses META and/or WSMETA elements with attributes of NAME="wsoptions" or HTTP-EQUIV="content-type"; in both cases the CONTENT attribute is read with the following options being honored:

  • include
  • web-object
  • keep-meta-content-type
  • content-type
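
The following Python sketch shows one way the NAME="wsoptions" case could be read. It assumes the CONTENT attribute holds a comma-separated option list, which is not stated above, and it does not cover the HTTP-EQUIV="content-type" case. The set of honored names simply mirrors the list above; `WsOptionsParser` and `read_wsoptions` are hypothetical names.

```python
from html.parser import HTMLParser

# Mirrors the option list above.
HONORED = {"include", "web-object", "keep-meta-content-type", "content-type"}


class WsOptionsParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.options = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ("meta", "wsmeta"):
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "wsoptions":
            return
        # Assumption: options are separated by commas inside CONTENT.
        for option in (attr.get("content") or "").split(","):
            option = option.strip().lower()
            if option in HONORED:
                self.options.append(option)


def read_wsoptions(page):
    parser = WsOptionsParser()
    parser.feed(page)
    parser.close()
    return parser.options
```

Options outside the honored set are silently skipped in this sketch; how the real preprocessor treats them is not covered here.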

URL Decoding

4GL expressions used in HTML URLs that have the %XX character encoding for spaces and other special characters are decoded into the matching character (using the 2 hexadecimal digits that follow the percent sign).
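
A minimal Python sketch of the %XX decoding follows. It assumes a single-byte character set, so each pair of hexadecimal digits maps to exactly one character, and it leaves any percent sign that is not followed by two hexadecimal digits unchanged. `decode_percent` is a hypothetical name.

```python
import re

# A percent sign followed by exactly two hexadecimal digits.
PERCENT = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_percent(text):
    # Replace each %XX with the character whose code is the hex value XX.
    return PERCENT.sub(lambda m: chr(int(m.group(1), 16)), text)
```

For example, `cust.name%20%3D%20%22Smith%22` decodes to `cust.name = "Smith"`.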

WebSpeed and Blue Diamond Implementation Compatibility

In addition to WebSpeed (from Progress Software Corporation) there is an open source WebSpeed replacement called Blue Diamond. The FWD implementation is compatible with both implementations. FWD defaults to WebSpeed, but the conversion project can be configured to select the implementation with which the output must be compatible. See the Project Setup chapter of the FWD Conversion Handbook.

Unrecognized Text

Everything other than the above constructs is treated as raw HTML. That raw HTML is read line by line. Each line is output with a prefix and suffix. Effectively, the prefix is this:

PUT Stream WebStream UNFORMATTED '

The suffix is this:

~n'.

There are many variations on this depending on what is being output and in which order. In particular, the actual text used is different depending on the compatibility helper that is in use, WebSpeed or Blue Diamond.

Using this prefix and suffix approach, the raw HTML is converted into 4GL strings and any characters that cannot be safely represented in 4GL source code are escaped.
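
The prefix and suffix approach can be sketched in Python as follows. The prefix and suffix strings are the ones shown above. The set of escaped characters (tilde, single quote and opening brace) is an assumption chosen for illustration and is probably not the complete list, and `wrap_html_line` is a hypothetical name.

```python
PREFIX = "PUT Stream WebStream UNFORMATTED '"
SUFFIX = "~n'."

# Assumed escapes: tilde is the 4GL escape character, the single quote would
# end the string literal, and an opening brace would start a preprocessor
# reference. A single translate pass avoids escaping a tilde twice.
ESCAPES = str.maketrans({"~": "~~", "'": "~'", "{": "~{"})


def wrap_html_line(line):
    # Turn one line of raw HTML into one 4GL PUT statement.
    return PREFIX + line.translate(ESCAPES) + SUFFIX
```

Applied to the line `<p>It's</p>`, this sketch produces `PUT Stream WebStream UNFORMATTED '<p>It~'s</p>~n'.`, which is the shape of output the WebSpeed-compatible mode is described as producing for raw HTML.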

Unsupported Features

UNICODE and Multi-Byte Character Sets

The FWD E4GL preprocessor only supports single byte character sets.


© 2004-2017 Golden Code Development Corporation. ALL RIGHTS RESERVED.