Project

General

Profile

Feature #1920

Feature #1606: implement persistent procedures/super procedures

implement persistent procedures

Added by Constantin Asofiei over 11 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Start date:
Due date:
% Done:

100%

Estimated time:
300.00 h
billable:
No
vendor_id:
GCD

ca_test20121207a.zip (367 KB) Constantin Asofiei, 12/07/2012 05:33 AM

ca_test20121221a.zip (473 KB) Constantin Asofiei, 12/21/2012 12:35 PM

ca_upd20130116a.zip (627 KB) Constantin Asofiei, 01/16/2013 08:47 AM

ca_upd20130117a.zip (739 KB) Constantin Asofiei, 01/17/2013 06:39 AM

ca_upd20130118b.zip (746 KB) Constantin Asofiei, 01/18/2013 08:37 AM

dynamic_function_xref.xml Magnifier (3.94 KB) Greg Shah, 01/19/2013 04:14 PM

annotations.xml Magnifier (26.3 KB) Constantin Asofiei, 01/21/2013 09:35 AM

ca_upd20130121a.zip (776 KB) Constantin Asofiei, 01/21/2013 12:20 PM

ecf_upd20130121a.zip - ProxyFactory with new isProxyClass method (6.7 KB) Eric Faulhaber, 01/21/2013 12:56 PM

ca_upd20130122e.zip (772 KB) Constantin Asofiei, 01/22/2013 11:10 AM

ca_upd20130123c.zip (819 KB) Constantin Asofiei, 01/23/2013 09:41 AM

ca_upd20130123d.zip (820 KB) Constantin Asofiei, 01/23/2013 11:09 AM

ca_upd20130123e.zip (909 KB) Constantin Asofiei, 01/23/2013 02:35 PM

ca_upd20130124a.zip - testcases for handles, super and persistent procedures (24.3 KB) Constantin Asofiei, 01/24/2013 06:07 AM

ca_upd20130124b.zip (897 KB) Constantin Asofiei, 01/24/2013 06:16 AM

ca_upd20130125a.zip (914 KB) Constantin Asofiei, 01/25/2013 03:09 AM

ca_upd20130126a.zip (890 KB) Constantin Asofiei, 01/26/2013 04:50 AM

ca_upd20130130d.zip - fix for handle extent (10.2 KB) Constantin Asofiei, 01/30/2013 11:39 AM

ca_upd20130130e.zip - handle extent testcase (373 Bytes) Constantin Asofiei, 01/30/2013 11:39 AM

ca_upd20130131a.zip (10.4 KB) Constantin Asofiei, 01/31/2013 04:50 AM

ca_upd20130131d.zip (10.8 KB) Constantin Asofiei, 01/31/2013 10:49 AM

ca_upd20130201c.zip - testcases for subscript attr and method chains (725 Bytes) Constantin Asofiei, 02/01/2013 08:23 AM

ca_upd20130226a.zip (4.51 KB) Constantin Asofiei, 02/26/2013 05:44 AM

ca_upd20130226b.zip (26.7 KB) Constantin Asofiei, 02/26/2013 05:55 AM

ca_upd20130226c.zip (22.4 KB) Constantin Asofiei, 02/26/2013 08:58 AM

ca_upd20130306f.zip (16.2 KB) Constantin Asofiei, 03/06/2013 11:35 AM

ca_upd20130306g.zip (16.8 KB) Constantin Asofiei, 03/06/2013 12:54 PM

ca_upd20130322a.zip (16.9 KB) Constantin Asofiei, 03/22/2013 05:31 AM

ca_upd20130322f.zip (17 KB) Constantin Asofiei, 03/22/2013 07:54 AM

ca_upd20130322h.zip (17 KB) Constantin Asofiei, 03/22/2013 09:37 AM

om_upd20130404a.zip (17.7 KB) Ovidiu Maxiniuc, 04/04/2013 01:00 PM

om_upd20130405a.zip (17.7 KB) Ovidiu Maxiniuc, 04/05/2013 04:56 AM


Related issues

Related to Base Language - Feature #1634: implement full native library (.so or DLL) support Closed 01/14/2013 06/06/2013
Related to Base Language - Bug #1968: implement MAP TO actual-name IN handle clause for function definitions Closed 08/06/2013 08/06/2013
Related to Base Language - Bug #1969: buffer names in get-signature Closed 07/22/2013 07/24/2013
Related to Base Language - Feature #1970: improve ControlFlowOps method resolution by reflection WIP
Related to Base Language - Bug #1971: emit worker versions for functions to fix source-procedure handle Closed 08/07/2013 08/08/2013
Related to Base Language - Feature #1972: provide full-support for DYNAMIC-FUNCTION set to function parameters Closed 08/06/2013 08/06/2013
Related to Base Language - Feature #1984: make methods to act as "read-only" attributes New
Related to Base Language - Bug #2105: BlockManager does not return some datatypes correctly New 03/29/2013
Related to Base Language - Feature #2123: map the converted method/attribute names to their legacy name New
Related to Base Language - Bug #2182: HANDLE attribute doesn't convert properly in "h:handle:<attr/method>" cases New

History

#1 Updated by Constantin Asofiei over 11 years ago

Implement the following 4GL features:

RUN [ PERSISTENT [SET handle]] [IN handle]
DELETE PROCEDURE/OBJECT statement
DYNAMIC FUNCTION ... { IN handle }
FUNCTION ... IN handle

SESSION handle:
  • first-procedure
  • last-procedure
THIS-PROCEDURE/TARGET-PROCEDURE/SOURCE-PROCEDURE handle:
  • persistent
  • type
  • next-sibling
  • prev-sibling
  • private-data
  • internal-entries
  • get-signature()

#2 Updated by Constantin Asofiei over 11 years ago

  • Estimated time set to 60.00

#3 Updated by Constantin Asofiei over 11 years ago

I think the VALID-HANDLE function should be implemented too (it checks if a handle is valid), because after a DELETE OBJECT/PROCEDURE call, the target handle becomes "invalid" (is not set to unknown, but the proc handle can still be used in some cases).

#4 Updated by Greg Shah over 11 years ago

Agreed. You mean that the handle.isValid() should be overridden in phandle?

#5 Updated by Constantin Asofiei over 11 years ago

0. a procedure handle will expose public internal procedures, functions and also triggers.

1. the standard procedure handles
THIS-PROCEDURE always returns a handle to external procedure to which the currently executing code belongs to.

SOURCE-PROCEDURE returns the handle for the external procedure which has invoked the statement which resulted in the invocation of the current func/proc/ext proc. The search is done using these steps:
- go down the call-stack and find the first entry in the super-procedure/function call chain
- go further down the call stack and find the first func/proc/ext proc block. This is because the function must return the handle for the external procedure to which the statement which resulted in the first invocation belongs to. I.e. if we have a chain of ExtProc(RUN A) -> RUN SUPER -> RUN SUPER, SOURCE-PROCEDURE will return the handle for the ExtProc, when called from A or its super "implementations".

TARGET-PROCEDURE returns the handle for the external procedure which was targeted by the statement which resulted in the proc/func invocation. The search is done using these steps:
- go down the call-stack and find the first entry in the super-procedure/function call chain
- go down the call-stack and find the originating statement which invoked this func/proc (this is because if i.e. RUN proc1 in A is called, proc1 might not be in A, but in B (as proc1 is a super procedure with its body in B), and although proc1 belongs to B, this will always return A because A was the target of this statement).

2. persistent procedure handle list
The SESSION handle keeps a list all persistent procedures. This list can be traversed using the SESSIN:last-procedure, SESSION:first-procedure and phandle:prev-sibling/next-sibling pointers. When a persistent procedure is deleted, these pointers are updated accordingly.
By deleting a persistent procedure which (when executed) has created other persistent procedure will not affect the "child" persistent procedures - these will still be available.
This is pretty straightforward, I couldn't find any peculiarities.

3. VALID-HANDLE
Returns 'no' for the following cases:
- the handle was not initilized
- the handle was deleted with DELETE PROCEDURE or DELETE OBJECT
Returns 'yes' for all other cases.
For a deleted proc handle, it will:
- show error if attributes are accessed
- invoking functions or procedures defined by this handle will not work.
- invoking functions defined "IN SUPER" or "IN handle" will work, as long as the function definition can be found in the super procedure list (or the handle is valid).
- invoking procedures defined "IN SUPER" will not work
This suggests that phandle should override handle.isValid().

4. phandle attributes and methods
a. INTERNAL-ENTRIES attribute - read only
Returns the names of all PUBLIC defined functions and internal procedures, in this order: internal procedures, functions, IN SUPER procedures (in order of THE DEFINITION, not lexicographic)
- PRIVATE members are not visible
- prototypes (such as IN SUPER or IN handle functions/procs) appear in this list
- FORWARD functions without implementations do not appear
- IN SUPER procedures are put last.

b. NAME and FILE-NAME attributes - read only
Returns the 4GL program name for this handle.
Both work the same.

c. PERSISTENT attribute - read only
Returns a logical value, depending on whether the current procedure is in the sessions' persistent procedure list or not.

d. TYPE attribute - read only
Always returns PROCEURE for procedure handles.

e. PRIVATE-DATA attribute - writable
User-accessible, can be set and can contain any kind of text.

f. GET-SIGNATURE method
- the signature for ... IN SUPER and ... IN handle procedures and functions can be accessed
- FORWARD functions without definition can not be seen
- if the target func/proc is not found, returns an empty string
- PRIVATE members can not be accessed
- the format is: TYPE,return type,param1,param2,param3 where:
$ TYPE is either PROCEDURE or FUNCTION
$ return type is the 4GL name of the return data type
$ param# is one of: INPUT varname TYPE, OUTPUT varname TYPE or INPUT-OUTPUT varname TYPE. Varname is either the original 4GL variable name or "arg#", if this is a header definition (i.e. an IN SUPER or IN handle) which had its parameter set as i.e. function f1 returns int (input int) in super.
$ can not access PRIVATE members

5. DELETE PROCEDURE and DELETE OBJECT statements
- after the handle is deleted, is marked as invalid, but not set to unknown
- VALID-HANDLE will return false for such handles
- accessing attributes for deleted handles returns in error
- if invoked with the THIS-, SOURCE-, or TARGET-PROCEDURE handle set as parameter, will show different runtime errors.
- an already deleted handle can't be deleted on a second time (runtime error).

6. RUN ... IN handle statement
- the procedure name can be an aribtrary expression
- runtime errors will appear if:
$ not all parameters are specified
$ parameters can not be cast from i.e. character to integer
$ the parameter type is not compatible (i.e. OUTPUT specified for an INPUT parameter)
$ the target procedure is a PROCEDURE ... IN SUPER procedure and it can't be found in the super procedures list
$ the target procedure does not exist or is not a procedure

7. DYNAMIC-FUNCTION and invoking a FUNCTION ... IN handle defined function
Both cases must convert the same. By specifying a FUNCTION ... IN handle "header" definition for the function, this will only enforce compile-time restrictions on the function invocation code (i.e. all passed parameters must have the correct data type and mode). On invocation, each case requires a reference to the handle and, as reflection will be used to invoke the function, both must convert to something like: ControlFlowOps.dynamicFunction(handle, fname, params...).
For DYNAMIC-FUNCTION case, the parameters are not checked on compile time, but on runtime, following similar rules to how the RUN ... IN handle works.

#6 Updated by Constantin Asofiei over 11 years ago

A. Implementation problems:
1. still need to find out how the exposed TRIGGERs work in 4GL, but I'll leave this last.

2. All BlockManager.externalProcedure, internalProcedure and *Function APIs will receive as first parameter a reference to the instance associated with the external procedure (i.e. the instance of the Java class which corresponds to the external procedure). This will help identifying the SOURCE-, TARGET- and THIS-PROCEDURE handles.

3. the TARGET-PROCEDURE handle must be the target of the RUN statement or function invocation call... this will be a little tricky, as we will need access to the handle references which the RUN ... IN handle or DYNAMIC-FUNCTION implementations receive. We could add some other block type to the TransactionManager's block stack to which we will save the "target" handle reference, but not yet sure how this can work.

4. as INTERNAL-ENTRIES shows also procedures defined ... IN SUPER, we need a way to save its definition in the converted code where this statement was encountered. We can do this in a couple of ways:
- keep a constant field to which these are added
- on conversion, emit an body for these cases (with a throws NotImplemented in it) and prefix its name with a "__super__" prefix. This will help us identify all super procedure definitions and report them.

5. INTERNAL-ENTRIES returns the list of procs/functions using the order as they were defined. In Java, Class.getMethods returns the methods in a non-deterministic order... this will pose us problems if there is code which relies on this (i.e. if the code is something like "for every procedure in h:internal-entries do: run this entry" and the code expects to run the procedures each time in the same order...). The only way I see to solve this is to emit for each Java "ext proc" class a constant field with all function/procedure definitions, even if it causes redundancy.
Another way would be to replace all h:INTERNAL-ENTRIES calls with the function/proc list determined at conversion time, but... we can't do this as the RUN statement which persists the procedure allows dynamic names, thus we can't know at conversion time the "type" of the persisted handle.

6. GET-SIGNATURE method returns the parameter's name too. As we convert each parameter name to java-style name, we have no way to set this. Also, if the parameter is a name of a IN handle or IN SUPER proc or function, this will return its signature, even if there is no real body in the current external procedure. Together with the INTERNAL-ENTRIES behavior, this suggests that we need to have the definitions for such func/proc cases emitted in the Java class.

7. phandle attributes and methods.
Compile-time errors appear only if the attribute is not a known procedure or widget handle attribute. Else, trying to access the attribute before the handle is initialized will not be compile time error, but runtime error. The problem here is that in 4GL a handle can be be set at runtime to anything "handle-compatible", so it can be either a widget handle, a procedure handle or even assigned the session handle.
My concern is that we will not be able to emit hard-coded types for a handle (i.e. always phandle if the handle is used in a procedure-handle related statement). Instead, all work must be done using our "handle" type and cast to phandle as necessary... This will help us sort the handle attribute and methods to procedure specific, session specific, etc. Another way would be to add accessors for all known handle-specific attributes and methods, let them throw a "not implemented" error in handle.java and implement them in phandle.java.

8. the dynamic-function and RUN ... IN handle statements process the parameters in a similar way (i.e. their type is not checked at compile time, but at runtime). What I've found was that automatic casting is done from i.e. char to integer and back, so we will need special attention when passing the input parameters or passing them back for output parameters.

B. Summary of implementation steps:
1. we need a SESSION handle implementation in P2J, something like "shandle", which will always keep a single instance per each context. Here, we can keep the persistent procedure list and the session-wide super procedure list.
2. ControlFlowOps.invokeMapped APIs have versions which accept the handle reference. We need to expand these APIs to implement the DYNAMIC-FUNCTION, "RUN ... IN handle" and explicit function calls for functions defined as "FUNCTION ... IN handle". Although we need a way to distinguish between function and procedure, as both functions and procedures convert the same way as Java methods, and we don't know if this is a proc or func until we ran it. But, we can add ControlFlowOps.invokeFunction APIs to handle function calls.
3. the phandle class needs to be enhanced to support all persistent-procedure related attributes and methods.
4. DELETE PROCEDURE will convert to handle.delete. phandle will have its actuall implementation.
5. phandle should override handle.isValid(), to implement how VALID-HANDLE works.
6. we need APIs which return the TARGET-,SOURCE- and THIS-PROCEDURE handles. I sugget to add them to BlockManager.

#7 Updated by Greg Shah over 11 years ago

Some thoughts on the 4GL behavior:

0. a procedure handle will expose public internal procedures, functions and also triggers.

If I understand correctly, calls to these entry points are all done indirectly (e.g. RUN x IN handle) AND with runtime resolution of the target code. I believe this should mean that the complexities of the implementation can be done with reflection AND can be hidden in our runtime.

1. the standard procedure handles

THIS-PROCEDURE is pretty clear.

However, even after re-reading your descriptions for SOURCE-PROCEDURE and TARGET-PROCEDURE, the behavior is not fully clear. Please provide some 4GL code examples to show this better.

The one thing I do hope we can do here is to leverage the BlockManager code to maintain our own analog to the 4GL call stack and super procedure chain. This would allow us to have a simpler walk instead of using Java reflection. Use of reflection is difficult because there are additional Java layers involved in the call stack that must be ignored because they have no relation to the original 4GL implementation that is being duplicated.

3. VALID-HANDLE
...
- invoking functions defined "IN SUPER" or "IN handle" will work, as long as the function definition can be found in the super procedure list (or the handle is valid).

I'm not sure I understand this one. Can you show a 4GL example?

#8 Updated by Greg Shah over 11 years ago

Some thoughts on the implementation:

This suggests that phandle should override handle.isValid().

Yes. The behavior also suggests that there may be an additional state of "deleted". Unless I misunderstood, it seems like the handle can be unknown (already implemented in the handle class), it can be valid, it can be invalid AND for a proc handle, it can be deleted, which is like being "partially valid". Did I misunderstand?

PRIVATE-DATA attribute - writable
User-accessible, can be set and can contain any kind of text.

We need to figure where this should live. It isn't just procedure handles that can support this. Widgets also allow this attribute.

2. All BlockManager.externalProcedure, internalProcedure and *Function APIs will receive as first parameter a reference to the instance associated with the external procedure (i.e. the instance of the Java class which corresponds to the external procedure). This will help identifying the SOURCE-, TARGET- and THIS-PROCEDURE handles.

Do you mean "ConvertedClassName.this"?

I wonder if we can dynamically obtain this via reflection. If it works, the runtime cost of this well justified by the fact that the features it is used for are rarely needed for most code.

Or if we can't do this, then an alternative might be to pass this extra parameter only to the externalProcedure() call and inside there to push the "this" reference onto an "external procedure stack". The idea is that all invocations of internal procedures and functions from that point until the next external procedure is pushed, will share that same reference.

The intention here is to minimize the impact on the generated code.

3. the TARGET-PROCEDURE handle must be the target of the RUN statement or function invocation call... this will be a little tricky, as we will need access to the handle references which the RUN ... IN handle or DYNAMIC-FUNCTION implementations receive.

For these cases, it seems perfectly reasonable to pass the handle referenced in IN clause to the ControlFlowOps method. Is that sufficient?

. as INTERNAL-ENTRIES shows also procedures defined ... IN SUPER, we need a way to save its definition in the converted code where this statement was encountered. We can do this in a couple of ways:
- keep a constant field to which these are added
- on conversion, emit an body for these cases (with a throws NotImplemented in it) and prefix its name with a "__super__" prefix. This will help us identify all super procedure definitions and report them.

5. INTERNAL-ENTRIES returns the list of procs/functions using the order as they were defined. In Java, Class.getMethods returns the methods in a non-deterministic order... this will pose us problems if there is code which relies on this (i.e. if the code is something like "for every procedure in h:internal-entries do: run this entry" and the code expects to run the procedures each time in the same order...). The only way I see to solve this is to emit for each Java "ext proc" class a constant field with all function/procedure definitions, even if it causes redundancy.
Another way would be to replace all h:INTERNAL-ENTRIES calls with the function/proc list determined at conversion time, but... we can't do this as the RUN statement which persists the procedure allows dynamic names, thus we can't know at conversion time the "type" of the persisted handle.

6. GET-SIGNATURE method returns the parameter's name too. As we convert each parameter name to java-style name, we have no way to set this. Also, if the parameter is a name of a IN handle or IN SUPER proc or function, this will return its signature, even if there is no real body in the current external procedure. Together with the INTERNAL-ENTRIES behavior, this suggests that we need to have the definitions for such func/proc cases emitted in the Java class.

The above suggested approaches are all problematic. At a minimum, these approaches make the converted code more messy.

But I think the solution to all of these cases is the same. I think we should enhance the name_map.xml to add the necessary data at conversion time and to augment the services of SourceNameMapper to allow the necessary runtime lookup of that data. I think this approach solves all the problems: non-local definitions, ordering of internal entries, legacy names/parms and so forth in the signatures...

If my idea works, then the result will be very clean indeed.

My concern is that we will not be able to emit hard-coded types for a handle (i.e. always phandle if the handle is used in a procedure-handle related statement). Instead, all work must be done using our "handle" type and cast to phandle as necessary.

Yes. I think this is exactly how we already do it. It works well and handles the stupid (but possible) case of re-using a handle for different resource types.

Another way would be to add accessors for all known handle-specific attributes and methods, let them throw a "not implemented" error in handle.java and implement them in phandle.java.

Not this way.

Please note that we may need a special "partially working" referent that is used inside a phandle that is in "deleted" state. I think we can't get rid of the referent inside the phandle, but I don't think we can allow full usage either. Maybe we have to wrapper it with something that gives the proper "partial" usage and which errors in the right ways.

7. DYNAMIC-FUNCTION and invoking a FUNCTION ... IN handle defined function
Both cases must convert the same. By specifying a FUNCTION ... IN handle "header" definition for the function, this will only enforce compile-time restrictions on the function invocation code (i.e. all passed parameters must have the correct data type and mode). On invocation, each case requires a reference to the handle and, as reflection will be used to invoke the function, both must convert to something like: ControlFlowOps.dynamicFunction(handle, fname, params...).

You are right that DYNAMIC-FUNCTION and FUNCTION IN should be implemented with the same Java backing. Let's call this ControlFlowOps.invokeFunction(). Otherwise, people in the future looking at the API may miss the fact that it does both jobs. That name is also very consistent with the current API in that class.

1. we need a SESSION handle implementation in P2J, something like "shandle", which will always keep a single instance per each context. Here, we can keep the persistent procedure list and the session-wide super procedure list.

We have been placing related functionality in various places in our runtime, rather than duplicate the 4GL SESSION handle itself. For example, some of the SESSION attributes already exist in EnvironmentOps.

In regard to the procedure handle list and chaining, I prefer a new class called ProcedureManager. That would probably also be the place where we might manage the super procedure chain too, as it seems quite closely related.

We need APIs which return the TARGET-,SOURCE- and THIS-PROCEDURE handles. I sugget to add them to BlockManager.

I think this is best in the ProcedureManager. I agree that the BlockManager will be very integral to the maintenance of the chains. But I prefer to move that management out into this new class. I also expect that things like DELETE PROCEDURE would be implemented there.

#9 Updated by Constantin Asofiei over 11 years ago

Answers for #1920-7

If I understand correctly, calls to these entry points are all done indirectly (e.g. RUN x IN handle) AND with runtime resolution of the target code. I believe this should mean that the complexities of the implementation can be done with reflection AND can be hidden in our runtime.

Correct.

However, even after re-reading your descriptions for SOURCE-PROCEDURE and TARGET-PROCEDURE, the behavior is not fully clear. Please provide some 4GL code examples to show this better.

We have the following programs:
t1.p

procedure proc0.
  message "t1: source=" source-procedure:name "target=" target-procedure:name.
end.

t2.p
procedure proc0.
  message "t2: source=" source-procedure:name "target=" target-procedure:name.
  run super.
end.

t3.p
procedure proc0.
   message "t3: source=" source-procedure:name "target=" target-procedure:name.
   run super.
end.

and a runner program t0.p:
def var h1 as handle.
def var h2 as handle.
def var h3 as handle.

run t1.p persist set h1.
run t2.p persist set h2.
run t3.p persist set h3.

h3:add-super-procedure(h2).
h2:add-super-procedure(h1).

run proc0 in h3.

This will result in the following messages:
t3: source= t0.p target= t3.p
t2: source= t0.p target= t3.p
t1: source= t0.p target= t3.p

Here, t1 is a super procedure of t2 and t2 is a super procedure of t3. In all cases, the SOURCE-PROCEDURE is t0.p (as this is where the original RUN statement occurred) and the TARGET-PROCEDURE is t3.p (as this was the target of the original RUN statement).
If we add a t4.p program which doesn't have a proc0 internal procedure, and modify t0.p so that t3 is a super procedure of t4:
def var h1 as handle.
def var h2 as handle.
def var h3 as handle.
def var h4 as handle.

run t1.p persist set h1.
run t2.p persist set h2.
run t3.p persist set h3.
run t4.p persist set h4.

h4:add-super-procedure(h3).
h3:add-super-procedure(h2).
h2:add-super-procedure(h1).

run proc0 in h4.

attempting to run proc0 in h4 will result this:
t3: source= t0.p target= t4.p
t2: source= t0.p target= t4.p
t1: source= t0.p target= t4.p

The SOURCE-PROCEDURE remains the same (as the original RUN statement is executed from t0.p), but the TARGET-PROCEDURE is changed. This does not refer the location where the proc0's body is found (i.e. in t3); instead, it refers the target of the original RUN statement.
The same applies for functions.

The one thing I do hope we can do here is to leverage the BlockManager code to maintain our own analog to the 4GL call stack and super procedure chain. This would allow us to have a simpler walk instead of using Java reflection. Use of reflection is difficult because there are additional Java layers involved in the call stack that must be ignored because they have no relation to the original 4GL implementation that is being duplicated.

From my testing, I'm almost sure my first impression that the super-procedure chain is taken from the call stack is wrong. I say this because in 4GL a super procedure must be explicitly registered with the current procedure or the session... I'll add more details to #1921 after I can see things more clear.

I'm not sure I understand this one. Can you show a 4GL example?

This is a real peculiarity. Even if the procedure handle was deleted and VALID-HANDLE returns false, the following test will succeed in running f5. If we try to run another function which is not "IN SUPER", it will not work (the behavior is only for "IN SUPER" functions):
t1.p

function f5 returns int in super.
procedure p0 in super.
end.

t0.p
def var h as handle.

function f5 returns int.
   message "superf5".
   return 0.
end.
procedure p0.
   message "superp0".
end.

run t1.p persist set h.
delete procedure h.
message valid-handle(h).
message h.

session:add-super-procedure(this-procedure).
dynamic-function("f5" in h).
run p0 in h.
message "end".

after running t0.p will result in this output:
no
1000
superf5
Invalid or inappropriate handle value given to RUN...IN statement. Procedure 't0':17. (2128)

If we attempt to run anything else from handle h (i.e. access a attribute, method or run an internal procedure or a non-super function), will result in a runtime error. Also, an ERROR condition will be raised when such an attempt is noticed.

#10 Updated by Constantin Asofiei over 11 years ago

Answers for entry #1920-8

Yes. The behavior also suggests that there may be an additional state of "deleted". Unless I misunderstood, it seems like the handle can be unknown (already implemented in the handle class), it can be valid, it can be invalid AND for a proc handle, it can be deleted, which is like being "partially valid". Did I misunderstand?

You understood correctly, with a small note: once deleted, a phandle is invalid (VALID-HANDLE will return false for it), but IN SUPER functions can still be invoked using the handle.

We need to figure where this should live. It isn't just procedure handles that can support this. Widgets also allow this attribute.

I see that the GenericWidget class has accessors for the PRIVATE-DATA attribute (although not implemented). Are they emitted during conversion ?
My problem is considering a generic handle var can at a certain point refer a widget handle, I'm inclined to add this attribute to our handle class...

Do you mean "ConvertedClassName.this"?

Yes

I wonder if we can dynamically obtain this via reflection. If it works, the runtime cost of this well justified by the fact that the features it is used for are rarely needed for most code.

Hmm... now that you mention it, we can do it. How does the bytecode keep the outer class: it generates a field name "this$<count>" where <count> is the depth of the inner classes... so, the inner class of a class has a "this$0" field, the inner class of the inner class has a "this$1" field, as in:

public class Test
{
   class Inner1
   {
      /* has a this$0 field refering to the Test.this */
      class Inner2
      {
         /* has a this$1 field refering to the Inner1.this */
         class Inner3
         {
            /* has a this$2 field refering to the Inner2.this */
         }
      }
   }
}

It will take some computing effort to go up the stack and keep resolving this$# fields until this$0 field is found, but it can be done. And I think we can optimize it too (i.e. we need to check the Block instance from the stack only when we reach the i.e. external procedure block).

For these cases, it seems perfectly reasonable to pass the handle referenced in IN clause to the ControlFlowOps method. Is that sufficient?

At this time, I think this is sufficient.

The above suggested approaches are all problematic. At a minimum, these approaches make the converted code more messy.

But I think the solution to all of these cases is the same. I think we should enhance the name_map.xml to add the necessary data at conversion time and to augment the services of SourceNameMapper to allow the necessary runtime lookup of that data. I think this approach solves all the problems: non-local definitions, ordering of internal entries, legacy names/parms and so forth in the signatures...

If my idea works, then the result will be very clean indeed.

Yes, this should work.

Yes. I think this is exactly how we already do it. It works well and handles the stupid (but possible) case of re-using a handle for different resource types.

OK

Not this way.

Please note that we may need a special "partially working" referent that is used inside a phandle that is in "deleted" state. I think we can't get rid of the referent inside the phandle, but I don't think we can allow full usage either. Maybe we have to wrapper it with something that gives the proper "partial" usage and which errors in the right ways.

The question is who will do all the error checking (i.e. when accessing the attributes or methods) ? The referent or the phandle instance? If the error checking is done by phandle, then the referent is not concerned about the handle's state (valid or invalid); if phandle tells it to do something, it will assume it is a valid call and do it. But, if the error checking is done by the referent, then it must now the handle's state (valid or invalid), to be able to do its job.

You are right that DYNAMIC-FUNCTION and FUNCTION IN should be implemented with the same Java backing. Let's call this ControlFlowOps.invokeFunction(). Otherwise, people in the future looking at the API may miss the fact that it does both jobs. That name is also very consistent with the current API in that class.

OK

We have been placing related functionality in various places in our runtime, rather than duplicate the 4GL SESSION handle itself. For example, some of the SESSION attributes already exist in EnvironmentOps.

In regard to the procedure handle list and chaining, I prefer a new class called ProcedureManager. That would probably also be the place where we might manage the super procedure chain too, as it seems quite closely related.

OK, I understand now the approach is not to centralize the SESSION handle in a "global" class, but to split its functionalities in explicit classes (like EnvironmentOps and the new ProcedureManager).

I think this is best in the ProcedureManager. I agree that the BlockManager will be very integral to the maintenance of the chains. But I prefer to move that management out into this new class. I also expect that things like DELETE PROCEDURE would be implemented there.

OK. But, as you mentioned DELETE PROCEDURE, how will we handle DELETE OBJECT ? As DELETE OBJECT works with "generic" handle references (and for procedure handles does the same work as DELETE PROCEDURE), I think that there should be a handle.delete API whith the phandle.delete implementation doing the actual work for procedure handles.

#11 Updated by Guy M over 11 years ago

Hi Greg,

Thanks for adding me in- that is interesting, but what a nightmare!

Why did Progress have to come up with such a bizarre implementation for SUPER functionality? If they had made the stack static for each procedure - which I guess most people would have been happy to use - then you wouldn't have to worry about this.

As it is, because we can mess about with the stack whenever we want, you have to write a complete generic implementation.

I know it's unlikely, but would it possible to build some constraints into the conversion - whereby we could somehow declare that we are using a static stack, and what it is (e.g. via comments), and then the conversion could use the simpler, more intuitive java inheritance?

Regards,
Guy

#12 Updated by Constantin Asofiei over 11 years ago

For #1920-11 - I've added the comment to #1921, please update there all info about super procedures.

#13 Updated by Greg Shah over 11 years ago

The question is who will do all the error checking (i.e. when accessing the attributes or methods) ? The referent or the phandle instance? If the error checking is done by phandle, then the referent is not concerned about the handle's state (valid or invalid); if phandle tells it to do something, it will assume it is a valid call and do it. But, if the error checking is done by the referent, then it must now the handle's state (valid or invalid), to be able to do its job.

This is a good question.

IF this behavior was associated with widgets/frames, we would want to replace the referent with a dynamic proxy (in the "deleted" case). Then it could pass through anything needing pass through and raise the error condition where needed. Because for the widget case, I wouldn't want the behavior encoded in the backing widget classes as there is just too much error handling to add.

But the deleted procedure case may make sense to implement wherever the attributes/methods are implemented. There are a much smaller number of attributes/methods and there is no natural interface to define the proxy.

This also brings into question the matter of the phandle class itself. It was something I put together as a "quick and dirty" approach to getting some sample code to compile. As far as I know, we don't have any production code dependent upon it. I don't know if we want to break the essential contract of a handle as being something that contains the referent that actually implements the attributes/methods. But our initial approach with phandle puts the procedure attributes/methods in the container instead of the referent. The advantage of the current phandle approach is that there is no unwrapping the referent and then casting before accessing the attributes/methods. What do you think? And if we extend this approach, then should be also consider making a whandle for widgets?

OK. But, as you mentioned DELETE PROCEDURE, how will we handle DELETE OBJECT ? As DELETE OBJECT works with "generic" handle references (and for procedure handles does the same work as DELETE PROCEDURE), I think that there should be a handle.delete API whith the phandle.delete implementation doing the actual work for procedure handles.

We must consider the other potential usage of DELETE OBJECT. It can be used to delete sockets, dynamic widgets and so forth. So the conversion of that statement must provide a generic implementation. I think it may be best for there to be a generic helper class that has a static method delete() which is overloaded (delete(phandle), delete(socket)...) to take each different kind of resource that can be deleted. Then each method can simply call the proper backing delete for the resource.

#14 Updated by Constantin Asofiei over 11 years ago

IF this behavior was associated with widgets/frames, we would want to replace the referent with a dynamic proxy (in the "deleted" case). Then it could pass through anything needing pass through and raise the error condition where needed. Because for the widget case, I wouldn't want the behavior encoded in the backing widget classes as there is just too much error handling to add.

But the deleted procedure case may make sense to implement wherever the attributes/methods are implemented. There are a much smaller number of attributes/methods and there is no natural interface to define the proxy.

This also brings into question the matter of the phandle class itself. It was something I put together as a "quick and dirty" approach to getting some sample code to compile. As far as I know, we don't have any production code dependent upon it. I don't know if we want to break the essential contract of a handle as being something that contains the referent that actually implements the attributes/methods. But our initial approach with phandle puts the procedure attributes/methods in the container instead of the referent. The advantage of the current phandle approach is that there is no unwrapping the referent and then casting before accessing the attributes/methods. What do you think? And if we extend this approach, then should be also consider making a whandle for widgets?

My approach would be to add all the attributes/methods specific for a procedure handle (to the maximum extent possible) to our phandle class. I don't see any benefits from adding these to the referent...

About whandle - I don't really know yet how widget handles work... but, if there are attributes/methods which can only be accessed via a GenericWidget instance, then we can use an anonymous proxy approach to delegate these calls to the GenericWidget instance.

We must consider the other potential usage of DELETE OBJECT. It can be used to delete sockets, dynamic widgets and so forth. So the conversion of that statement must provide a generic implementation. I think it may be best for there to be a generic helper class that has a static method delete() which is overloaded (delete(phandle), delete(socket)...) to take each different kind of resource that can be deleted. Then each method can simply call the proper backing delete for the resource.

We can use a HandleOps class, to which we can add all "operators" (like DELETE OBJECT) which can work with various handle sub-types.

#15 Updated by Greg Shah over 11 years ago

OK, that all makes sense.

#16 Updated by Constantin Asofiei over 11 years ago

The next steps for this task (and super-procedures) are:
1a. provide conversion support for all related 4GL features.
1b. Add stub APIs to ProcedureManager and phandle, so the converted code will compile.
2. enhance conversion rules to provide additional info in name_map.xml (i.e. method signature, mark IN SUPER proc/functions, etc).
3. enhance SourceNameMapper to provide additional APIs which access the new name_map.xml data
4. add the implementation for the stub APIs added at step 1

#17 Updated by Greg Shah over 11 years ago

Agreed.

#18 Updated by Constantin Asofiei over 11 years ago

New important findings: the "handle" for the "RUN ... IN handle", "DYNAMIC-FUNCTION" and "FUNCTION ... IN handle" can be actually any expression which returns a handle, i.e. 4GL allows:

def var i as int.
function fh returns handle:
   run value("test" + i + ".p") persist.
   return session:last-procedure.
end.

run proc0 in fh(). /* 1 */
run proc0 in fh(). /* 2 */

On each call, fh() is executed and it returns a different handle... this means that the handle expression for all such cases must be converted as an inner class:
   private class HandleExpr4
   extends GenericExpression
   {
      public BaseDataType resolve()
      {
         return fh();
      }
   }

with an instance field set to it:
HandleExpr4 handleExpr4 = new HandleExpr4();

which will be used in i.e. RUN ... IN handle "DYNAMIC-FUNCTION" calls:
ControlFlowOps.invokeFunction((phandle) handleExpr4.resolve()), "<function>")

When evaluating fh() (part of a i.e. RUN ... in fh() call), fh must not block for user input.

#19 Updated by Greg Shah over 11 years ago

Is the sample missing the empty ()?

run proc0 in fh.

instead of:

run proc0 in fh().

Does the 4GL really allow a user-defined function to be called without the parenthesis?

#20 Updated by Constantin Asofiei over 11 years ago

Sorry, I've changed it now, it was a typo mistake.

#21 Updated by Constantin Asofiei over 11 years ago

Greg, something about the handle attributes in general. In 4GL, a code like this:

def var h as handle.
h = session.
message h:internal-entries.
message "end".

will show an warning message, but the procedure will not be terminated. The result is:
** INTERNAL-ENTRIES is not a queryable attribute for PSEUDO-WIDGET. (4052)
?
end

The 4GL code compiles safely even if the handle is not initialized and a valid attribute is accessed, as in:
def var h as handle.
message h:internal-entries.
message "end".

which shows this output:
Invalid handle.  Not initialized or points to a deleted object. (3135)
Cannot access the INTERNAL-ENTRIES attribute because the widget does not exist. (3140)
end

The idea is if we don't want to pass the attribute as a static text to some handle.java method, we need handle.java to define methods for any valid attribute which can be accessed via a handle (unless the attribute is converted to a static API call in ErrorManager, KeyReader, etc). This is needed to properly implement the error messages shown by 4GL when the handle is not i.e. a phandle.

#22 Updated by Constantin Asofiei over 11 years ago

Another thing about RUN filename PERSIST [SET handle] statement. We can either emit this something like (for a RUN test0.p PERSIST [SET handle] call):

Test0 test0 = new Test0();
/* if the handle is a shared variable, then if it is accessed in test0.p, 
   it will be set to the test0 procedure, even if test0 hasn't finished yet executing. 
   this is why the procedure needs to be registered in the persistent procedure list 
   before execute is called. */
ProcedureManager.addPersistentProcedure(test0, [handle]);
test0.execute();

Another approach would be to make all Java classes emitted for external procedures to extend an "ExternalProcedure" class which provides these c'tors:
  public ExternalPrreal ocedure() {} // default c'tor

  public ExternalProcedure(boolean persist)
  {
    this(persist, null);
  }

  public ExternalProcedure(boolean persist, handle h)
  {
    if (persist) ProcedureManager.addExternalProcedure(this, h);
  }

and the above RUN call will be emitted as:
Test0 test0 = new Test0(true, [handle]);
test0.execute();

I like the second approach better and although it registers the external procedure in the procedure manager list before the actual instance finished initializing, I don't think there will be real problems about this, because if error appear during class initialization, then there is something wrong in the converted code or runtime.

#23 Updated by Greg Shah over 11 years ago

In regard to the handle issue:

The idea is if we don't want to pass the attribute as a static text to some handle.java method, we need handle.java to define methods for any valid attribute which can be accessed via a handle

1. I prefer to keep using Java methods directly, so accessing the attributes via a "attr-name" string is something to avoid.

2. Does this same behavior occur with methods too (e.g. my-uninitialized-handle:method() is not an error but a warning and actually returns something like unknown value)?

3. I prefer to use a "safe null" dynamic proxy object here. We already cast the unwrapped reference to the right type for the attribute or method access. Perhaps the solution is to always use a factory method to unwrap/return the referent. We know the type, so we would call the right factory method for that type. The factory method would return any valid referent or if invalid, then the factory would return a safe null widget that uses dynamic proxy to implement the interface (e.g. GenericWidget for widgets or P2JQuery for queries) to provide the same "safe" behavior as in the 4GL.

A reference to h:sensitive might become:

WidgetHandle.unwrap(h).isSensitive()

instead of:

((GenericWidget) h.get()).isSensitive()

It even reads better this way, in my opinion. We can create a single safe dynamic proxy instance per context (AND per factory type) and always return that when the referent is invalid/uninitialized. I suspect that the invocation handler can probably be shared code between the different widget types.

4. This highlights the inconsistency with the procedure handle being a subclass of handle. I think this is confusing and I prefer if the handle contains a referent that is the phandle. The actual Java reference for the external procedure is not needed to be unwrapped in the converted application code, so why not make it consistent.

In regard to the RUN filename PERSIST [SET handle] statement:

I agree that your second approach is better than the first, but I suspect that we really need to take a 3rd approach. Our previous approach of implementing the simple "RUN ext-proc.p" case is to:

ExtProc extproc0 = new ExtProc();
extproc0.execute();

Once you implement super procs, this is no longer correct. Even the simple RUN cases need to have the target code found AT RUNTIME!

This means that even the simplest cases will be handled by something like ControlFlowOps.invoke*("proc-name").

It seems to me that ControlFlowOps already does handle the instantiation of the found class properly. Any implementation of invokePersistent("proc-name"[, handle]) can easily allow instantiation to finish, add the persistent procedure and then assign the handle (if provided).

What do you think?

#24 Updated by Constantin Asofiei over 11 years ago

1 - This would be my choice too.

2 - Yes

3 - If there is only one handle.unwrap method to access the attributes and methods, then the type of the object returned by unwrap must provide access to all known attributes and methods (for widget, query, phandle, etc), as we can't know at conversion time the handle's type. So, if unwrap returns a dynamic proxy, the interface must define them all.

4 - I agree, with the dynamic proxy approach phandle should be the referent.

About RUN ext-proc.p PERSIST calls - I don't understand why we should make the call dynamic, when the conversion rules can decide for sure that the RUN statement is used to invoke an external procedure. And even if we make the calls dynamic, we can pass ExtProc.class as the parameter for ControlFlowOps.invoke, instead of the actual program name.

#25 Updated by Greg Shah over 11 years ago

About RUN ext-proc.p PERSIST calls - I don't understand why we should make the call dynamic, when the conversion rules can decide for sure that the RUN statement is used to invoke an external procedure. And even if we make the calls dynamic, we can pass ExtProc.class as the parameter for ControlFlowOps.invoke, instead of the actual program name.

Which RUN statement cases can be determined statically/at conversion time?

Yes, I agree that in these cases it is best to pass the .class to ControlFlowOps and let the complexity of everything else be hidden there.

#26 Updated by Constantin Asofiei over 11 years ago

Actually, the only case when conversion knows for sure what to emit is the RUN filename statement, when the filename exists in the list of the converted files. We can't emit internalProc() calls for RUN internal-proc statements because the search is done in the super-procedure list even if the header definition for the procedure is not in the current program.

BTW, what about the BogusClass emitted for the cases when filename does not exist ? I need to test, but I think these cases must be handled by ControlFlowOps APIs, as IIRC 4GL allows such calls.

#27 Updated by Greg Shah over 11 years ago

About the static calculation of called code in a RUN statement, I am concerned with this information you posted in #1921:

The target of a "RUN proc" stmt or DYNAMIC-FUNCTION call ends on first find in:
- THIS-PROCEDURE
- THIS-PROCEDURE:super-procedures
- SESSION:super-procedures
- if an external program with the given name is found (for the "RUN proc" statement without "IN handle" clause).
- search distinguishes between procedures and functions with the same name. On first name match, will attempt to invoke it.
- is not necessarily for the target procedure to have its header defined in the current procedure. Applies only to "RUN proc" statement.
- "RUN proc0." and "proc0()" will end up calling different code (one is a super-proc, the other is a super-function). The body for these two may be in different super-procedures.

I believe it is possible to have an internal procedure named proc.p. Based on the above "search" precedence list, it seems that a "RUN proc.p" will not necessarily resolve to an external file named proc.p. And we cannot necessarily detect that case at conversion time because the internal proc with that name could be in a super procedure. That is why it seems like we have to implement a runtime resolution of RUN statement targets.

Please run some testcases to see if my theory is correct.

#28 Updated by Constantin Asofiei over 11 years ago

Greg, you are correct, 4GL allows '.', '\' and '/' characters as part of an internal procedure's name; so, we can't distinguish between ext procs and internal procs during conversion (as an internal procedure might have the same name as an external procedure), leaving us the only solution of using ControlFlow's APIs to invoke them (with the name sent as a string parameter).

#29 Updated by Greg Shah over 11 years ago

Sorry, I wish I was wrong on this point.

In regards to the RUN <proc> PERSISTENT, is that a case where we know that <proc> is an external procedure (or should be if it is missing)? It seems to me that RUN internal_proc PERSISTENT would not be allowed.

We still may not be able to statically find it since the propath can be assigned at runtime, but it may be a case where the search logic is different.

#30 Updated by Constantin Asofiei over 11 years ago

In regards to the RUN <proc> PERSISTENT, is that a case where we know that <proc> is an external procedure (or should be if it is missing)? It seems to me that RUN internal_proc PERSISTENT would not be allowed.

You are correct, RUN ... PERSISTENT works with only external procedures.

We still may not be able to statically find it since the propath can be assigned at runtime, but it may be a case where the search logic is different.

Even if the external procedure for a RUN <proc> PERSISTENT call can be found during conversion, the ext proc might have been removed from the propath at runtime, at the time of this call... so we can't hard code this case either.
More, when the external procedure can not be found, 4GL generates a STOP condition, which can be ignored if the RUN is enclosed in a block which overrides the default behavior, like in:

do on stop undo, leave:
   run missing-proc.p persist.
end.
message "end".

Here, the "end" message will be displayed.

As the only solution is to use Java's reflection, we should consider caching, for each ConvertedExternalClass.class instance, its method definitions (as returned by the Class.getMethod and other APIs).

#31 Updated by Constantin Asofiei over 11 years ago

One thing about handle.unwrap. 4GL allows these kind of calls:

h:prev-sibling:prev-sibling:prev-sibling:name

which, if we use handle.unwrap, will convert to:
handle.unwrap(handle.unwrap(handle.unwrap(handle.unwrap(h).getPrevSibling()).getPrevSibling()).getPrevSibling()).getName()

i.e. for each COLON, there must be a handle.unwrap call.

#32 Updated by Greg Shah over 11 years ago

Understood and agreed.

#33 Updated by Greg Shah over 11 years ago

I wonder if treating COLON as a binary operator will help with the conversion.

I also wonder if we are parsing chained methods/attributes properly. I think we are not parsing them with COLON treated as an operator.

On the other hand, this bit of chaining is a very tricky part of the parser. So do not make changes there lightly.

#34 Updated by Constantin Asofiei over 11 years ago

A "message h:next-sibling:prev-sibling:next-sibling:name." statement gets parsed this way (from .parser file):

   statement [STATEMENT] @0:0
      message [KW_MSG] @6:1
          [CONTENT_ARRAY] @0:0
            expression [EXPRESSION] @0:0
               : [COLON] @6:10
                  h [VAR_HANDLE] @6:9
                  : [COLON] @6:23
                     next-sibling [ATTR_HANDLE] @6:11
                     : [COLON] @6:36
                        prev-sibling [ATTR_HANDLE] @6:24
                        : [COLON] @6:49
                           next-sibling [ATTR_HANDLE] @6:37
                           name [ATTR_CHAR] @6:50

So, the COLON is treated as an operator.
The resulted code for this statement is (without the handle.unwrap calls):
message(h.getNextSibling().getPrevSibling().getNextSibling().getName());

My problem was that the generated Java AST needs to be "upside-down":

                      <ast col="0" id="730144440366" line="0" text="message" type="STATIC_METHOD_CALL">
                        <ast col="0" id="730144440367" line="0" text="" type="EXPRESSION">
                          <ast col="0" id="730144440372" line="0" text="getName" type="METHOD_CALL">
                            <ast col="0" id="730144440368" line="0" text="getNextSibling" type="METHOD_CALL">
                              <ast col="0" id="730144440370" line="0" text="getPrevSibling" type="METHOD_CALL">
                                <ast col="0" id="730144440371" line="0" text="getNextSibling" type="METHOD_CALL">
                                  <ast col="0" id="730144440369" line="0" text="h" type="REFERENCE" />
                                </ast>
                              </ast>
                            </ast>
                          </ast>
                        </ast>
                      </ast>

The main change was to move the convert/methods_attributes.rules code from walk-rules to ascend-rules, plus some other changes to set the apropriate peerid for each node. I still need to test some of the other handle cases, like FILE-INFO and SESSION attributes/methods (plus cases which start with static calls like THIS-PROCEDURE or SESSION:FIRST-PROCEDURE), but I think I'm on a good track to solve this.

#35 Updated by Greg Shah over 11 years ago

So, the COLON is treated as an operator. 

Not really. What I mean is this: normally, a chained operator will parse with each nested operator as the first child. This is different in the chained methods and attributes case. For example:

def var num as int initial 30.
def var i   as int initial 14.

i = num + 5 + i.

Here is a fragment of AST:

    <ast col="0" id="21474836513" line="0" text="assignment" type="ASSIGNMENT">
      <ast col="3" id="21474836514" line="4" text="=" type="ASSIGN">
        <annotation datatype="java.lang.Long" key="peerid" value="128849018916"/>
        <ast col="1" id="21474836517" line="4" text="i" type="VAR_INT">
          <annotation datatype="java.lang.Long" key="oldtype" value="2333"/>
          <annotation datatype="java.lang.Long" key="refid" value="21474836498"/>
          <annotation datatype="java.lang.Long" key="peerid" value="128849018917"/>
        </ast>
        <ast col="0" id="21474836520" line="0" text="expression" type="EXPRESSION">
          <ast col="13" id="21474836521" line="4" text="+" type="PLUS">
            <annotation datatype="java.lang.Long" key="peerid" value="128849018918"/>
            <ast col="9" id="21474836524" line="4" text="+" type="PLUS">
              <annotation datatype="java.lang.Long" key="peerid" value="128849018919"/>
              <ast col="5" id="21474836527" line="4" text="num" type="VAR_INT">
                <annotation datatype="java.lang.Long" key="oldtype" value="2333"/>
                <annotation datatype="java.lang.Long" key="refid" value="21474836483"/>
                <annotation datatype="java.lang.Long" key="peerid" value="128849018920"/>
              </ast>
              <ast col="11" id="21474836530" line="4" text="5" type="NUM_LITERAL">
                <annotation datatype="java.lang.Long" key="peerid" value="128849018921"/>
              </ast>
            </ast>
            <ast col="15" id="21474836532" line="4" text="i" type="VAR_INT">
              <annotation datatype="java.lang.Long" key="oldtype" value="2333"/>
              <annotation datatype="java.lang.Long" key="refid" value="21474836498"/>
              <annotation datatype="java.lang.Long" key="peerid" value="128849018922"/>
            </ast>
          </ast>
        </ast>
      </ast>
    </ast>

Notice how the PLUS operators are parsed under the EXPRESSION node. We don't do that with handles and COLON. I think that is what is causing your problems. We aren't really treating COLON as an operator here, though we do partially parent things there.

But changing that will be non-trivial. If you can properly handle things with less intrusive changes, then we will go with that for now.

#36 Updated by Constantin Asofiei over 11 years ago

All RUN cases are handled by these rules:
- ControlFlowOps.invoke(String name, Object ... param) will be emitted for all "RUN [proc | ext-proc | native-proc]." cases.
- ControlFlowOps.invoke(String name, handle h, Object ... param) will be emitted for all "RUN [proc | ext-proc | native-proc] IN handle." cases.
- ControlFlowOps.invokePersistent(String name, Object ... param) will be emitted for all the "RUN ... PERSIST." cases.
- ControlFlowOps.invokePersistent(String name, handle h, Object ... param) will be emitted for all the "RUN ... PERSIST SET handle." cases.

For native procedures, as at the conversion time we can't determine if this is an internal native procedure call or not, we let the runtime do the work (all the external-proc info stuff will be saved in name_map.xml).

The "Object caller" parameter was not added because we will need to establish the current THIS-PROCEDURE at runtime.

I have not removed the code which generates the prog.filename and prog.int_proc nodes for the RUN <ext-proc.p> and RUN <int-proc> cases (i.e. the parser for explicit filenames and explicit internal procedure names was not touched, the conversion rules just treats them the same as the RUN VALUE cases).

For the function cases all is good, except I've stumbled on a problem. A call like this:

def var char as ch.
ch = "func0".
dynamic-function(ch in h)

will have the dynamic-function's AST generated like this:
    <ast col="0" id="871878361612" line="0" text="assignment" type="ASSIGNMENT">
      <ast col="0" id="871878361613" line="0" text="expression" type="EXPRESSION">
        <annotation datatype="java.lang.Long" key="peerid" value="876173328686"/>
        <ast col="1" id="871878361614" line="64" text="dynamic-function" type="FUNC_POLY">
          <annotation datatype="java.lang.Long" key="oldtype" value="538"/>
          <annotation datatype="java.lang.Boolean" key="builtin" value="true"/>
          <annotation datatype="java.lang.Boolean" key="returnsunknown" value="true"/>
          <annotation datatype="java.lang.Long" key="peerid" value="876173328687"/>
          <ast col="18" id="871878361616" line="64" text="ch" type="VAR_CHAR">
            <annotation datatype="java.lang.Long" key="oldtype" value="2339"/>
            <annotation datatype="java.lang.Long" key="refid" value="871878361293"/>
            <annotation datatype="java.lang.Long" key="peerid" value="876173328688"/>
            <ast col="21" id="871878361618" line="64" text="in" type="KW_IN">
              <ast col="24" id="871878361621" line="64" text="h" type="VAR_HANDLE">
                <annotation datatype="java.lang.Long" key="oldtype" value="2339"/>
                <annotation datatype="java.lang.Long" key="refid" value="871878361348"/>
                <annotation datatype="java.lang.Long" key="peerid" value="876173328689"/>
              </ast>
            </ast>
          </ast>
        </ast>
      </ast>
    </ast>

Note how the KW_IN/VAR_HANDLE sub-ast is emitted under the VAR_CHAR, instead of being emitted as its sibling. From the progress.g, I think this was added to handle cases when IN handle was something widget specific... my solution would be to leave the generated AST as is, and in post_parse_fixups.xml to move the KW_IN node on the correct place, if it was emitted under the first child in the DYNAMIC-FUNCTION's AST call.

#37 Updated by Constantin Asofiei over 11 years ago

Greg, should I add support for the "h:<read-only-attribute> = value" cases too ? Because 4GL does allow them (shows a runtime error), but I think would be nicer to leave our APIs clean so that the user knows which attributes can be set (i.e. they have setters) and which do not.

#38 Updated by Greg Shah over 11 years ago

Some questions/comments:

1. What do you mean by "native-proc"? Do you mean the invocation of something defined as PROCEDURE EXTERNAL?

2. Please provide a RUN native-proc IN handle example.

3. What do you mean by "internal native procedure call" in this:

For native procedures, as at the conversion time we can't determine if this is an internal native procedure call or not, we let the runtime do the work

?

4. In regards to this:

my solution would be to leave the generated AST as is, and in post_parse_fixups.xml to move the KW_IN node on the correct place, if it was emitted under the first child in the DYNAMIC-FUNCTION's AST call.

That is a very messy part of the parser. Not easy to fix properly. Your solution is fine.

5. For this:

Greg, should I add support for the "h:<read-only-attribute> = value" cases too ? Because 4GL does allow them (shows a runtime error), but I think would be nicer to leave our APIs clean so that the user knows which attributes can be set (i.e. they have setters) and which do not.

I agree that we don't want to make things more messy. For now, go with this approach. BUT do document the limitation in the conversion reference.

We should be prepared that eventually, there will be an app that depends upon this runtime error processing. In that case we will be implementing this anyway.

#39 Updated by Constantin Asofiei over 11 years ago

1 - yes, I refer to PROCEDURE ... EXTERNAL case

2 - I don't have a fully working example (i.e. I don't have a dll too, to be sure proc0 is executed from the dll), but 4GL allows this:

procedure proc0 external "somelib.dll".
/* define parameters, but P2J conversion doesn't support the kind of 
   types 4GL needs for external procedures (4GL needs the type of the 
   parameters to be DOUBLE, CHARACTER, LONG and others - i.e. 4GL types
   can't be used to define parameters for an external procedures)*/
end.

def var h as handle.
h = this-procedure.
run proc0 in h.

3 - sorry, that was confusing. what I mean is that when RUN <something> is encountered by the conversion rules, it can't decide that <something> is an external procedure (for same reasons we can't decide that RUN <filename> is meant to run an external procedure).

4 - OK

5 - I think is pretty simple to add support for the assignment of a read-only attribute. As we have a dynamic proxy approach, all is left is to:
  • emit setters for these attributes
  • add the setters to our dynamic proxy interface
  • the dynamic proxy will intercept these calls and will show the error/warning message, as needed

As I'm already touching the procedure-related attributes, I suggest to add support only for these ones at this time, so it will be a lot easier to add support for other attributes later on.

#40 Updated by Greg Shah over 11 years ago

2 - I don't have a fully working example (i.e. I don't have a dll too, to be sure proc0 is executed from the dll), but 4GL allows this:

procedure proc0 external "somelib.dll".
/* define parameters, but P2J conversion doesn't support the kind of 
   types 4GL needs for external procedures (4GL needs the type of the 
   parameters to be DOUBLE, CHARACTER, LONG and others - i.e. 4GL types
   can't be used to define parameters for an external procedures)*/
end.

def var h as handle.
h = this-procedure.
run proc0 in h.

The IN handle part is what I find strange, since this is just an imported API call (function pointer in C). It seems to me that PROCEDURE apiFunctionCallName EXTERNAL "somelib.dll" should call the same API no matter in which procedure it is defined. I would expect that such a definition would cause the runtime to load the DLL and import the entry point (obtain the entry point as a function pointer). I would expect that multiple instances of the same definition in multiple procedures, would just be backed by the same call to the same pointer.

But I guess that since a Progress internal procedure (or external procedure) can also be apiFunctionCallName, I guess this is consistent with the runtime dispatching of Progress' RUN mechanism. I assume it will just call a common pointer in the end, once resolved.

As far as being able to test, you can just use a common Windows API like this:

DEF VAR pid AS INT.

PROCEDURE GetCurrentProcessId EXTERNAL "kernel32":
   DEFINE RETURN PARAMETER result AS LONG.
END.

RUN GetCurrentProcessId (OUTPUT pid).

MESSAGE pid.

#41 Updated by Constantin Asofiei over 11 years ago

My conclusions about native-procedure (PROCEDURE ... EXTERNAL cases) are:
  • when a 4GL external procedure has a header definition for a native procedure, the 4GL external procedure treats it as an internal procedure (i.e. it registers it in its "internal-entries" list).
  • when 4GL needs to invoke an internal procedure, its header definition informs it that this is (or is not) a native call and invokes it properly.
  • all the persistent procedure and super procedure behavior found for non-native internal procedures applies to native calls too (the procedure search rules are the same).
  • the h:internal-entries attribute places the native procedure definitions in the same location as normal internal procedures.
  • the h:get-signature(<native-method-name>) looks like:
    DLL-ENTRY,,<parameter list>
    

    where <parameter list> is emitted in the same way as for normal internal procedures.
  • the datatype for any parameter set for a native procedure must be one of: CHARACTER, BYTE, SHORT, UNSIGNED-SHORT, LONG, FLOAT or DOUBLE. But P2J doesn't support VAR_DOUBLE, VAR_SHORT, VAR_USHORT and VAR_FLOAT (conversion fails).

I don't think we should dig deeper into how native calls work for this task, I will only add support so that the native proc is emitted in name_map.xml (so that internal-entries and get-signature will work OK). The 4GL implementation of these native calls is a little more complex, as native libraries can be unloaded using RELEASE EXTERNAL and the PROCEDURE's EXTERNAL clause supports another flag, PERSISTENT (i.e. PROCEDURE ... EXTERNAL "somelib.dll" PERSISTENT) which tells 4GL not to unload the dll from memory until RELEASE EXTERNAL is called or the application ends. So, more testing is needed to understand full complexity.

#42 Updated by Greg Shah over 11 years ago

We have task #1634 which is where we will work on the rest of the native procedure support. I agree with your plan for handling things partially here.

Very interesting findings. We will have to implement this slightly different than I originally expected (we will need methods for the 'wrapper" internal procedures that then call the API.

#43 Updated by Constantin Asofiei over 11 years ago

Attached there is a test update which contains:
1. rules/ - all rules files which affect the conversion of persistent and super procedure related stuff
2. progress.g - a needed change to properly handle FUNCTION ... IN handle and IN SUPER cases
3. very-very early versions of the Java files (I've only added them so that you will not get a lot of errors when converting the testcases)
4. testcases/uast/t2012* - these are not-yet-final testcases (I will need to clean them up, but I will do this in the process of testing the runtime). If you want to convert them all, just use the "cas*.p" pattern and you will get them all.

About the Java files:
- HandleOps contains the API for DELETE PROCEDURE (HandleOps.delete accepts only handles at this time)
- handle.java was added the "unwrap" definition
- HandleDef is the interface which should be used by the dynamic proxy (handle.unwrap returns this type). I think there should be better name for it, but I lack the inspiration at this time <g>
- ProcedureManager - API definitions for THIS-PROCEDURE, TARGET-PROCEDURE, SOURCE-PROCEDURE and SESSION's ADD-PERSISTENT-PROCEDURE, REMOVE-PERSISTENT-PROCEDURE, FIRST-PROCEDURE, LAST-PROCEDURE, ADD-SUPER-PROCEDURE, REMOVE-SUPER-PROCEDURE, SUPER-PROCEDURES plus the RUN SUPER and SUPER statements (both RUN SUPER and SUPER statements convert to ProcedureManager.runSuper, I don't think there is a need to differentiate between them - if I find there is a need during runtime testing, I will change the conversion). I've just realized I miss tests for RUN SUPER and SUPER cases when the procedure/function has parameters, I will test this too.
- ControlFlowOps contains only stubs for invoke() and invokeFunction(), the others APIs (although not deleted) were renamed, I didn't want to remove the code from them yet, as I will need it.
- progress.g and some rule files were added some bogus procedure attributes, named PREV-SIBLING[1|2|3] and NEXT-SIBLING[1|2|3]. These return a handle and were added only for testing purposes, to make sure the call chain is converted properly (I had troubles determining that the conversion rules were not messing up the call order, so I used these to test).

Please review this early update and let me know if you see any issues/problems.

#44 Updated by Constantin Asofiei over 11 years ago

The MAJIC sources, beside the ControlFlowOps and handle.unwrap changes, has this change: frame.getScreeValue(widget) and frame.setScreenValue(widget, value) calls are emitted as frame.getWidget().getScreenValue() and frame.getWidget().setScreenValue(widget) - is there a problem if I leave it as this ? The widget's screen-value setter and getter dispatches the call to the frame, via "frame.getScreeValue(this)", so I don't think this should be a problem.

#45 Updated by Greg Shah over 11 years ago

That is fine with me.

#46 Updated by Constantin Asofiei over 11 years ago

Greg,
When collecting all the handle-related APIs in a "global" interface, I found some differences between the return type of some APIs:
  • getColumn, getHeight, getRow, getWidth : widget returns decimal, for frame returns integer. Any idea why these return different values ? To me, it looks like the frame versions are wrong, and they should return decimal.
  • moveAfterTab(GenericWidget) : widget returns logical, for frame returns boolean. Here, I think is safe to make it return logical for frame.
  • getNextSibling, getPrevSibling : for widget returns GenericWidget, for plain handles it must return a handle instance. Not sure yet if it is safe to make the GenericWidget versions return a handle instance.

#47 Updated by Greg Shah over 11 years ago

I thought we were going to use resource-specific interfaces instead of a global interface. I prefer it to be resource-specific. We already have interfaces for most if not all of the resources and there is no need for a massive global interface. The resource-specific factory method would then return the resource-specific null proxy for the specific case being used.

And doing so avoids this kind of issue. With that said, it would be very good to rationalize the frame and widget interfaces IF doing so does not cause a problem.

getColumn, getHeight, getRow, getWidth : widget returns decimal, for frame returns integer. Any idea why these return different values ? To me, it looks like the frame versions are wrong, and they should return decimal.

I don't recall. I do know that Nick (the original author of these interfaces), had determined that they needed to be different. However, he may have made some mistakes. You had better check this with a tectcase.

moveAfterTab(GenericWidget) : widget returns logical, for frame returns boolean. Here, I think is safe to make it return logical for frame.

This was probably just a mistake. I am fine with the change.

getNextSibling, getPrevSibling : for widget returns GenericWidget, for plain handles it must return a handle instance. Not sure yet if it is safe to make the GenericWidget versions return a handle instance.

It seems to me that these should return a properly instantiated handle to the resource.

#48 Updated by Constantin Asofiei over 11 years ago

The idea is, to be able to use handle.unwrap to access an attribute for any kind of handle (procedure handle, widget handle, etc), handle.unwrap's return type must be an interface which defines ALL accessible attributes/methods via a generic handle, to be able to execute a handle.unwrap(h).getHeight() call. This call will be done via a dynamic proxy, which uses the handle's resource as the instance for the actual method call (if the resource defines such an API). I don't see another way of implementing this, as at runtime we can't know the type of resource a handle might contain... plus that the handle's resource might change from i.e. persistent procedure to widget and back.

Attached you have the current stage of work, with these notes:
  • HandleDef is a massing global interface which collects all interfaces which exposes attributes/methods used by a handle
  • for persistent procedures, the handle's value is an instance of ExternalProgramWrapper. This class extends TransparentWrapper, while also implementing the PersistentProcedure interface (which defines all the persistent-procedure related APIs).
  • handle's dynamic proxy is not fully implemented yet.
  • the code is missing all the various errors/warnings which are encountered when i.e. setting read-only atributes, accessing invalid attributes, etc.
  • for RUN SUPER/SUPER calls, still need to properly implement the iterator over the determined super-procedure list
  • CommonWidget and CommonWindow are interfaces which define the widget and window related APIs.
  • the ProcedureManager keeps everything per external procedure instance. The PersistentProc inner class is a container which holds all the needed info for an external procedure.

#49 Updated by Greg Shah over 11 years ago

In addition to the general messiness that comes from having a global or "umbrella" interface, there is another practical problem with this. In the past, we had serious problems using the J2SE dynamic proxy: we consumed the permgen because our frame classes all "contained" a full CommonFrame interface even though it was actually implemented in the parent class. We had to create our own dynamic proxy implementation to avoid this, but I don't think the same trick will work again.

You are right that handles can be reset to reference different resource types at runtime.

BUT we have (in the past) been saved by the fact that attributes and methods are usually specific to one resource type.

There are 2 limited cases that don't fit this model.

1. Attributes like PRIVATE-DATA, NAME, HANDLE, PREV-SIBLING, NEXT-SIBLING are things that exist across many (most?) resource types. These are candidates for implementation in the handle class. It seems we can cleanly solve this common case without a global interface.

2. Attributes/methods like CONNECT which have a small number of different implementations (socket, web service and appserver in this case). These are VERY rare, but they do exist.

Assuming I am right about the solution to case 1, then at conversion time we will always know the resource type that is expected except in case 2.

So the real problem is when case 2 is in use AND the code was written to such that we cannot determine what resource is really being used. This is so rare that I prefer to implement the cleaner way (resource-specific interfaces and factory methods for unwrapping) and to require additional review of those locations before conversion to assess any danger. Those would be places that we would target for 4GL code changes to use separate handles and/or to refactor if needed so that the same code was not used to process different resources. We might even be able to handle some automated refactoring using hints.

Please evaluate my assessment. If I haven't properly considered something, let me know. But I am willing to take a limitation on our implementation in order to get a much better result.

#50 Updated by Constantin Asofiei over 11 years ago

If I understood right, then there will be "handle.unwrap<Resource>" APIs, like:
  • handle.unwrapWidget for widget handles
  • handle.unwrapProcedure for procedure handles
  • handle.unwrapWindow for window handles
  • etc

The conversion rules will emit a specific unwrap method, depending on the accessed attribute/method. Their return type would be a specific interface, which defines the available APIs. i.e. for widgets will return CommonWidget, for procedures will return PersistentProcedure, etc.

One more note: each handle will have a single proxy instance, for each accessed resource type (which get build when the resource is set/changed to a not-yet-used type). We don't need to re-build the proxies (or discard the previous used one) every time the resource is changed.

But, if dynamic proxies are so expensive (in terms of used resources), then why not replace them with direct calls ? We will still use handle.unwrap<Resource> to get a Common* interface instance, but this will be done by casting the referent to the proper type (based on the resource it belongs). Also, the unwrap implementation will fail if the current referent is not of the expected resource type. The only limitation will be when the referent is for a resource and the accessed attribute is for another resource. In this case, casting the referent to the required Common* interface will not be possible and this means we will not be able to implement the proper error messages for calls like "h:persistent" when h is not yet initilized or is for i.e. a widget, not a persistent procedure (because we will not know which attribute/method will be accessed after the unwrap call, unless we pass its name to the unwrap methods, during conversion).

About the "PRIVATE-DATA, NAME, HANDLE, PREV-SIBLING, NEXT-SIBLING": I think the best solution is to leave their implementation at the resource, and not in the handle class, because these attributes live as long as the resource lives... This would mean there will be a root CommonHandle interface which defines APIs for these attributes, and the other Common* interfaces will extend it. Also, access to these attributes can be done in two ways:
  1. via the handle.unwrap(h).[set/get][Attribute] - which means the access is done via the dynamic proxy
  2. directly, via the h.[set/get][Attribute], which bypasses the dynamic proxy; i.e. access is done directy by casting the referent to the CommonHandle interface and calls the proper API.

#51 Updated by Greg Shah over 11 years ago

If I understood right, then there will be "handle.unwrap<Resource>" APIs, like:

Yes.

The conversion rules will emit a specific unwrap method, depending on the accessed attribute/method. Their return type would be a specific interface, which defines the available APIs. i.e. for widgets will return CommonWidget, for procedures will return PersistentProcedure, etc.

Yes.

One more note: each handle will have a single proxy instance, for each accessed resource type (which get build when the resource is set/changed to a not-yet-used type). We don't need to re-build the proxies (or discard the previous used one) every time the resource is changed.

Actually, I was thinking that the user's session would only have to have 1 proxy.

But, if dynamic proxies are so expensive (in terms of used resources), then why not replace them with direct calls ? We will still use handle.unwrap<Resource> to get a Common* interface instance, but this will be done by casting the referent to the proper type (based on the resource it belongs).

I envisioned the proxy only being used for the invalid case. I agree that if the contained resource is valid, then you just return the contained resource (which must implement the proper interface).

 Also, the unwrap implementation will fail if the current referent is not of the expected resource type.

Exactly. If the resource is not of the correct type, then a proxy to an invalid resource should be returned. And the subsequent attribute/method access should then fail with the proper error behavior. I'm not sure if the same proxy can be used for the "right resource type but no longer valid" as for the "wrong resource type" case. But multiple proxies can be used to duplicate the different failure modes.

The only limitation will be when the referent is for a resource and the accessed attribute is for another resource. In this case, casting the referent to the required Common* interface will not be possible and this means we will not be able to implement the proper error messages for calls like "h:persistent" when h is not yet initilized or is for i.e. a widget, not a persistent procedure (because we will not know which attribute/method will be accessed after the unwrap call, unless we pass its name to the unwrap methods, during conversion).

Hopefully my note above clarified this.

About the "PRIVATE-DATA, NAME, HANDLE, PREV-SIBLING, NEXT-SIBLING": I think the best solution is to leave their implementation at the resource, and not in the handle class, because these attributes live as long as the resource lives... This would mean there will be a root CommonHandle interface which defines APIs for these attributes, and the other Common* interfaces will extend it.

Yes, this makes sense. I like the consistency.

via the handle.unwrap(h).[set/get][Attribute] - which means the access is done via the dynamic proxy

My idea is to not use a dynamic proxy for the valid resource case.

#52 Updated by Constantin Asofiei over 11 years ago

Ok, now I undestand what you mean: when the resource is of a known type (widget, persistent procedure,etc), unwrap will return the actual instance, and not a proxy; when the resource is not set yet, it will return a dynamic proxy.
But I think there are some problems which complicates things when using this approach. If a handle is used by this code:

def var h as handle.
def var h2 as handle.
h = session.
h:add-super-procedure(h2).
h = this-procedure.
h:add-super-procedure(h2).

this should convert to something like:
h.assign(LogicalTerminal.sessionHandle()); /* returns a proxy to map the session attributes/methods */
handle.unwrapSession(h).addSuperProcedure(h2);
h.assign(ProcedureManager.thisProcedure());
handle.unwrapProcedure(h).addSuperProcedure(h2);

But, how can I decide at conversion time that the add-super-procedure call is for the session, and not for a persistent procedure ? This information is needed, to be able to emit the correct handle.unwrap method. This could be solved by extracting the common super/persistent-procedure related methods/attributes (for session and persistent procedures) into a common interface, with the interface hierarchy for this case being:
                   ---------->CommonProcedure <------------\
                  /            - addSuperProcedure         |
                 /             - removeSuperProcedure      |
                /                                          |
PersistentProcedure                                    CommonSession
 - all other proc-related APIs                         - all other session-related APIs

and emit something like:
h.assign(LogicalTerminal.sessionHandle()); /* returns a proxy to map the session attributes/methods */
handle.unwrapCommonProcedure(h).addSuperProcedure(h2);
h.assign(ProcedureManager.thisProcedure());
handle.unwrapCommonProcedure(h).addSuperProcedure(h2);

This way, the conversion will emit the same unwrap API for the add-super-procedure and remove-super-procedure calls.

Note that this is only a first example. If there are other attributes/methods cross-used by other handle types, then this will complicate things for them too, in terms of interface hierarchy design and emitted unwrap APIs (still need to finish my review, this is only a first example I found).

Also, about the "PRIVATE-DATA, NAME, HANDLE, PREV-SIBLING, NEXT-SIBLING": the "PRIVATE-DATA, NAME, PREV-SIBLING, NEXT-SIBLING" attributes are not available for the FILE-INFO, SESSION, ERROR-STATUS and LAST-EVENT handles. Also, PREV-SIBLING (for some reason) is not available for QUERY handles.

Finally, if we want to provide support for a handle to be assigned to FILE-INFO, SESSION, ERROR-STATUS and LAST-EVENT handles, then this code:

h = SESSION.
h = ERROR-STATUS.
h = LAST-EVENT.
h = FILE-INFO.

I think should convert to:
h.assign(LogicalTerminal.sessionHandle());
h.assign(ErrorManager.handle());
h.assign(KeyReader.handle());
h.assign(FileSystemOps.handle());

#53 Updated by Greg Shah over 11 years ago

Yes, you've got the idea.

I think your implementation approach (e.g. CommonProcedure) makes sense.

One note: using LogicalTerminal for the SESSION handle was a temporary thing done back when the only features used in that handle were UI features (using it for trigger registration). That is no longer appropriate. Please move the session handle elsewhere (ProcedureManager, TransactionManager or a new SessionManager).

#54 Updated by Constantin Asofiei over 11 years ago

About the invalid API calls (when the invoked API is not valid for the current referent). 4GL shows these kind of messages:
**<API> is not a queryable attribute for <referent-type>. (4052)

where:
  • <referent-type> is i.e. "PSEUDO-WIDGET" for FILE-INFO, SESSION, ERROR-STATUS, LAST-EVENT handles or i.e. "PROCEDURE" for PROCEDURE handles
  • <API> is the actual 4GL API name being invoked. Is not that easy to fully support each 4GL API name, as the name is lost during conversion. For now, is best to let <API> be the Java name of the actual invoked method, and in the future produce a map which has as key the Java API name and as value its 4GL equivalent (I will make it so that this will be pretty easy to implement, or better I will add support for a few APIs and will complete them later, is just tedious work).

About the FILE-INFO, SESSION, ERROR-STATUS, LAST-EVENT: their associated unwrap methods return a special "static" dynamic proxy which is constructed using a specified interface and a list of classes (it maps each method from the interface to a public static method in one of the classes). There is one proxy for each handle, global per application (I've added a StaticProxy class to the goldencode.proxy package). This proxy is obtained using the "asHandle" API added to the FileSystemOps, SessionUtils (see bellow info about it), ErrorManager and KeyReader classes.

For the SESSION handle, I've added the util.SessionUtils class (I named it SessionUtils to disambiguate from the net.SessionManager class). Currently it contains only the asHandle API, but other methods can be moved/added here from LogicalTerminal and other classes. The "static" proxy for this handle is constructed using these classes: SessionUtils, EnvironmentOps, date, LogicalTerminal, ProcedureManager. This way we can still keep the session-related APIs in dedicated classes.

#55 Updated by Greg Shah over 11 years ago

All of your decisions are good ones. I agree with the deferral of the old API name mapping.

#56 Updated by Constantin Asofiei over 11 years ago

Which ErrorManager API maps the ERROR-STATUS:ERROR flag ? isError or isErrorFlag ?

#57 Updated by Greg Shah over 11 years ago

ErrorManager.isError() is what should be used from converted application code, to access ERROR-STATUS:ERROR. If I recall, isErrorFlag() could be used (but today it is not) from the client to get the same value.

#58 Updated by Constantin Asofiei over 11 years ago

Summary for what is tested and known to be working:
  • FILE-INFO, SESSION, ERROR-STATUS, LAST-KEY, CURRENT-WINDOW handles (conversion and API access)
  • access to THIS-PROCEDURE/TARGET-PROCEDURE/SOURCE-PROCEDURE handles
  • VALID-HANDLE
  • DELETE PROCEDURE/DELETE OBJECT
  • procedure related attributes and methods: persistent, type, private-data, internal-entries, get-signature
Need to complete testing of (although some features were used for the above cases and they did work, I need to complete running/building their dedicated tests):
  • persistent procedure chains (SESSION handle's first-procedure, last-procedure + proc:next-sibling and proc:prev-sibling)
  • super-procedure related attributes and methods (for SESSION and PROCEDURE handles): super-procedures, add-super-procedure(SEARCH-TARGET | SEARCH-SELF), remove-super-procedure()
  • RUN statement (IN handle, PERSISTENT [SET h]), with combination of external program, procedure, super-procedure.
  • SUPER and RUN SUPER statements (with SEARCH-TARGET and SEARCH-SELF)
  • DYNAMIC-FUNCTION [IN handle]
  • proper error messages for read-only and API calls when the referent is not of the expected type.
  • handle equality tests
  • handle used in UI statements (like message and display)

#59 Updated by Constantin Asofiei over 11 years ago

Summary for more tested issues:
  • handle equality tests
  • persistent procedure chains (SESSION handle's first-procedure, last-procedure + proc:next-sibling and proc:prev-sibling)
  • super-procedure related attributes and methods (for SESSION and PROCEDURE handles): super-procedures, add-super-procedure(SEARCH-TARGET | SEARCH-SELF), remove-super-procedure()
  • RUN statement (IN handle, PERSISTENT [SET h]), with combination of external program, procedure, super-procedure.
  • FUNCTION ... IN handle and DYNAMIC-FUNCTION [IN handle]. There are still some issues related to the error messages displayed when the handle is invalid/not of proper type, but 4GL seem to know that the function (when was defined FUNCTION .. IN handle) invocation comes from a simple function call, a DYNAMIC-FUNCTION call or a DYNAMIC-FUNCTION call, because it displays different error messages. Currently, all these convert to ControlFlowOps.invokeFunction and ControlFlowOps.invokeFunctionIn, but I'm not sure if we should add different APIs, like ControlFlowOps.invokeFunction for "simple" calls, ControlFlowOps.invokeFunctionDyn for DYNAMIC-FUNCTION and ControlFlowOps.invokeFunctionDynIn for DYNAMIC-FUNCTION.
Need to complete testing of:
  • SUPER and RUN SUPER statements (with SEARCH-TARGET and SEARCH-SELF)
  • proper error messages for read-only and API calls when the referent is not of the expected type.
  • handle used in UI statements (like message and display)

Also, next on the queue is to regression-test the conversion code (not the runtime).

#60 Updated by Greg Shah over 11 years ago

Good work! I'm really excited to see this come together.

Currently, all these convert to ControlFlowOps.invokeFunction and ControlFlowOps.invokeFunctionIn, but I'm not sure if we should add different APIs, like ControlFlowOps.invokeFunction for "simple" calls, ControlFlowOps.invokeFunctionDyn for DYNAMIC-FUNCTION and ControlFlowOps.invokeFunctionDynIn for DYNAMIC-FUNCTION.

Yes, please do implement 3 different entry points to separate these cases. They can all use the same worker method, but this way the error messages can be done properly.

#61 Updated by Constantin Asofiei over 11 years ago

Summary for more tested issues:
  • fixed the DYNAMIC-FUNCTION's error messages
  • testing of RUN SUPER and SUPER calls are in progress. SEARCH-SELF works nicely, need to finish understanding how SEARCH-TARGET keeps its pointers.
Need to complete testing of:
  • proper error messages for read-only and API calls when the referent is not of the expected type.
  • handle used in UI statements (like message and display)
  • need to fix a conversion problem related to the "apply <event> to h." statement, where h is a handle variable.

#62 Updated by Constantin Asofiei over 11 years ago

In 4GL, code with read-only attributes on the right-side of an expression gets compiled, as in:

def var h as handle.
h:name = "bar".
message "after". /* this statement is reached. */

This only displays a runtime warning message (no error is raised). I don't think is any of use to provide full support for such cases (i.e. adding setters for read-only attributes)... instead, I propose to convert this using something like:
h.readOnlyError("name");

instead of
handle.unwrap(h).setName("bla");

This allows us to fully support the read-only attribute errors in legacy code, while discouraging use of such cases in code written by hand.

#63 Updated by Greg Shah over 11 years ago

Yes, your approach makes perfect sense. The 4GL "feature" is absurd.

#64 Updated by Constantin Asofiei over 11 years ago

Summary for more tested issues:
  • merged files with latest bzr versions
  • proper error messages for read-only assignments
  • proper error messages for API calls when the referent is not of the expected type
  • fixed a conversion problem related to the "apply <event> to h." statement, where h is a handle variable (it converts to handle.unwrapWidget(h).apply)
  • first phase of conversion regression testing passed (code did compile and all changes were expected).
Need to complete testing of:
  • testing of RUN SUPER and SUPER calls with SEARCH-SELF/SEARCH-TARGET, both functions and procedures (plus SOURCE-PROCEDURE and TARGET-PROCEDURE when there are nested RUN SUPER/SUPER calls).
  • handle used in UI statements (like message and display).

#65 Updated by Constantin Asofiei over 11 years ago

Another issues which we need to decide if we support it or not is the attribute's mode set when the function/procedure is invoked. In 4GL, if the mode is different than the expected one (i.e. output passed for input-only defined parameter), then a specific error is shown. If we need to provide support for this, then the ControlFlowOps.invoke APIs will need to know the mode for each parameter (as set on the function/procedure call statement). Currently, name_map.xml collects the parameter's mode (as set on the function/procedure definition), as it is needed by GET-SIGNATURE.

#66 Updated by Greg Shah over 11 years ago

If I understand properly, the problem is that the calling side does not supply the missing signature information (essentially the list of modes for the parameters). When combined with the fact that most cases of the dispatching of a RUN or function call cannot determine the target until runtime, this means that we can't statically determine this error, but must raise it at runtime. But we don't have enough data to raise the error.

I prefer to avoid adding much information to each call. For example, to add a list of integers to every call would be messy.

One idea is to define the set of all mode lists in the name_map.xml. Then allow a reference to the mode list via a symbolic name as part of the call and the runtime would obtain the real mode list from the name_map as needed.

#67 Updated by Constantin Asofiei over 11 years ago

Yes, you understood correctly, we need the parameter modes set to the RUN or function call statement.

I prefer to avoid adding much information to each call. For example, to add a list of integers to every call would be messy.

I agree an int array would be messy. What about sending the argument list as in: (mode, value), but still use varargs for the ControlFlowOps.invoke APIs, as in:

ControlFlowOps.invoke("procedure", Mode.INPUT, v1, Mode.OUTPUT, v2, ...).

This way we would keep the mode together with the value.

One idea is to define the set of all mode lists in the name_map.xml. Then allow a reference to the mode list via a symbolic name as part of the call and the runtime would obtain the real mode list from the name_map as needed.

Hmm... this would hide the modes in name_map.xml, but... this means name_map.xml needs to contain n*3 modes, for n arguments.

PS: I've attached the update with the javadocs added. This is almost final, the remaining known issues are:
  • testing of RUN SUPER and SUPER calls with SEARCH-SELF/SEARCH-TARGET, both functions and procedures (plus SOURCE-PROCEDURE and TARGET-PROCEDURE when there are nested RUN SUPER/SUPER calls).
  • parameter modes set to caller.
  • handle used in UI statements (like message and display).

#68 Updated by Greg Shah over 11 years ago

Hmm... this would hide the modes in name_map.xml, but... this means name_map.xml needs to contain n*3 modes, for n arguments.

This is not exactly what I mean. The idea is that the name map would have a set of all possible mode lists. Each call site would be analyzed for its mode list. The mode list for one call site might be this: MODE_I_I_IO = { INPUT, INPUT, INPUT_OUTPUT } and for another call site it might be MODE_I_O = { INPUT, OUTPUT }. Each unique list would be saved in the name map using some symbolic name (MODE_...). Then all call sites that use the same mode list would simply pass the proper mode list symbolic name as the 1st or 2nd parm. I guess it is best to just pass the mode list name as a string.

Thinking about it, I guess we could skip the name map part and just encode the modes as a string like I suggest above with the mode list names. My whole objective here is to condense the mode list information into the minimum space that is also manageable/readable.

#69 Updated by Constantin Asofiei over 11 years ago

Ok, I think I understand now. What you are suggesting is to condense the parameter modes into a string and emit it as a parameter. Some 4GL behavior: for explicit function calls, as in:

function f0 returns int(output j as int).
end.
def var j as int.
f0(j).

the parameter mode, if not set, it is assumed to be the one at the function definition (it does not default to INPUT). If DYNAMIC-FUNCTION or RUN statement is used, then no parameter mode means it defaults to INPUT mode.
Also, beside normal parameters we need to consider TABLE and BUFFER parameters. For TABLE, INPUT/OUTPUT/INPUT-OUTPUT modes do apply, so nothing to do here; for BUFFER, there is no "mode". In conclusion, we need can use:
- I for INPUT
- O FOR OUTPUT
- U for INPUT-OUTPUT
- B for BUFFER
This way, we can avoid the "_" separator and the string will be like "IIOUIB" for some parameter list sent as in "input, input, output, input-output, input, buffer".

Thus, all ControlFlowOps.invoke APIs will be added a new version, which receives the parameter list. If no parameter list is sent, no validation is done (it assumes all parameter modes default to their correct mode).

#70 Updated by Greg Shah over 11 years ago

I like it.

#71 Updated by Constantin Asofiei over 11 years ago

The parameter mode for function calls (DYNAMIC-FUNCTION or simple function call) is removed by some code I guess in progress.g (the .parser node does not contain the kw_input, kw_output, etc nodes for such cases). Any idea which part of code removes them ?

#72 Updated by Constantin Asofiei over 11 years ago

Constantin Asofiei wrote:

The parameter mode for function calls (DYNAMIC-FUNCTION or simple function call) is removed by some code I guess in progress.g (the .parser node file does not contain the kw_input, kw_output, etc nodes for such cases). Any idea which part of code removes them ?

#73 Updated by Greg Shah over 11 years ago

I think the issue is caused by this code:

func_call_parm [boolean builtin]
   :
      (
         options { generateAmbigWarnings = false; }
         :
           { !builtin }?
           KW_INPUT!
         | KW_OUTPUT!
         | KW_IN_OUT!
      )?
      expr
   ;

Notice the ! after each keyword. This drops the node from the resulting tree.

We previously didn't need that information (and I considered it a quirk, not a feature). So I dropped it.

#74 Updated by Constantin Asofiei over 11 years ago

This update adds:
- a first phase of SEARCH-TARGET mode for super calls
- support for parameter mode validation for dynamic, IN handle or IN SUPER functions and RUN statements. Normal function calls do not need parameter validation, as this generates compile error in 4GL. The same for SUPER and RUN SUPER calls - compiler error if the passed mode is not the same.
- previous update had a missing file, now it contains it
- note that p2j/util/phandle.java file is no longer used, can be removed

What is left:
- finish testing SEARCH-TARGET
- go through all testcases again for one last time

You can go ahead and start reviewing the files, I don't expect to have major changes from now on. Note that progress.g and methods_attributes.rules still have support for NEXT-SIBLIG123 and PREV-SIBLING123 attributes, I'll remove them when I'll send the final update. The same for methods_attributes.rules, at the end of the file there are some comments which need to be moved somewhere else.

#75 Updated by Constantin Asofiei over 11 years ago

Some other notes related to functions/procedures which have buffer parameters. For these cases, the actual parameter class is the anonymous proxy class. The Java's runtime provides no way to determine which interface should map that proxy. So, as the only kind of proxy the parameter can get is a buffer proxy, I've added the possibility to determine the proxy from the invocation handler, via the RecordBuffer.getDmoInterfaceForProxy API. When ControlFlowOps validatate the arguments, if the argument is a proxy, then it tries to get the DMO interface for that proxy; if none found, it means is not a known proxy, and it fails validation.

#76 Updated by Greg Shah over 11 years ago

Feedback:

0. First: WOW! Really good work.

1. The read_only_attribute function in common-progress.rules seems like it will be painful to maintain over time, since every time we add support for a new writable attribute, we have to add something there. I'd prefer something more complete. A better approach might be to add a boolean flag into the progress.g sym.addAttributeOrMethod() method call. Then we can use that database to add an annotation at parse time. Whatever we decide, I will have someone else fill out the changes to make things complete (it involves going through every attribute in the 4GL reference and I need you to work on other things). But I want to come up with the best approach as part of your work on this task.

2. Some code at the bottom of user_functions.rules seems to be always bypassed. That stuff used to be needed for the non-dynamic forms. Should it be dead? If so, then remove it and I would like to understand why no wrapping is needed anymore (not that I mind).

3. The paramater_modes function should be named parameter_modes.

4. This comment in control_flow.rules may be confusing to future readers:

                 for native procedures, as at the conversion time we can't
                 determine if this is an internal native procedure call,
                 we let the runtime do the work

I think the term "internal native procedure call" is the part that is confusing. I think it may be better to say it this way:

                 for native procedures, as at the conversion time we can't
                 differentiate between a native call and a call to an internal
                 procedure, we let the runtime do the work

5. In control_flow.rules, the RUN VALUE ... case just emits "naturally" now (as an expression that evaluates to a character result)?

That is the feedback so far on the rules. I will review the Java and ANTLR stuff tomorrow.

#77 Updated by Constantin Asofiei over 11 years ago

1. The read_only_attribute function in common-progress.rules seems like it will be painful to maintain over time, since every time we add support for a new writable attribute, we have to add something there. I'd prefer something more complete. A better approach might be to add a boolean flag into the progress.g sym.addAttributeOrMethod() method call. Then we can use that database to add an annotation at parse time. Whatever we decide, I will have someone else fill out the changes to make things complete (it involves going through every attribute in the 4GL reference and I need you to work on other things). But I want to come up with the best approach as part of your work on this task.

If there is a way to emit this information at parse time, it's fine with me.

2. Some code at the bottom of user_functions.rules seems to be always bypassed. That stuff used to be needed for the non-dynamic forms. Should it be dead? If so, then remove it and I would like to understand why no wrapping is needed anymore (not that I mind).

Yes, it should be dead. The conversion rules no longer emit Java method calls for simple 4GL function calls because I couldn't find a good way to manage setting the proper TARGET- and SOURCE-PROCEDURE, for simple function calls cases. The idea is, the runtime needed to know if the function call came from ControlFlowOps API or from a Java method call (to set the TARGET and SOURCE-PROCEDURE handles, if it originated from a Java method call). But, considering the same function can at least be invoked using DYNAMIC-FUNCTION too, the only way was to look in the stack trace and see if ControlFlowOps was involved. But this is very expensive, and I don't think is a viable solution.

3. The paramater_modes function should be named parameter_modes.

OK, changed.

4. This comment in control_flow.rules may be confusing to future readers:

OK, changed.

5. In control_flow.rules, the RUN VALUE ... case just emits "naturally" now (as an expression that evaluates to a character result)?

Yes, that is correct.

I've merged my files with the latest bazaar files and I will put it here in the morning your time. Please wait for this update before continuing review.

#78 Updated by Constantin Asofiei over 11 years ago

Attached update has the following new fixes:
  • javadoc fixes
  • merged with latest bazaar files
  • fixed conversion of functions with empty body, as in:
    function f0 returns int:
    end.
    
  • fixed conversion of buffer parameters for functions with no body (defined IN SUPER or IN handle):
    function f0 returns int(buffer buf for tt1) in super.
    function f0 returns int(buffer buf for tt1) in h.
    
Something I haven't considered yet: DYNAMIC-FUNCTION calls can be used in logical expressions. Here, we have several cases:
1. when the expected value is a logical value, as in:
  • if <dynamic-function-call> or WHILE <function-call> cases - this must "cast" the returned value to logical (4GL has some rules, explained bellow)
  • <dynamic-function-call> = <logical expr>, logical test - same as above, the returned value is "cast" to logical. This is done only if the type of one of the operands is known for sure to be logical (i.e. the expression's type can be statically determined, as when a logical variable is used or a simple function call which returns logical).
    Cast to logical is done following these rules:
    1. if unknown, returned value is unknown
    2. when character, it must be one of "yes" or "no" values
    3. when a number, for non-zero is true, for everything else is false
    4. when date: is there a "zero" equivalent for dates ? Because all my testing showed that any non-unknown date value is converted to true
2. if none of the operands is determined to yeld a logical value, then it compares them following these rules:
  • if a number is compared to a text, and the text representation of the given number matches the text, then it returns true.
  • if a date is compared to a text, and the text representation of the date number matches the text, then it returns true.
  • still need to test the other combinations, like date and numeric types, etc.
The only problem with this is that we use ControlFlowOps.invokeFunction calls for simple calls, which now are incompatible with how the converted code looks for cases like:
function f0 returns logical.
return true.
end.
if f0() then message "true".
if dynamic-function("f0") then message "true".

do while f0():
  message "true".
end.

repeat while f0():
  message "true".
end.

do while dynamic-function("f0"):
  message "true".
end.

repeat while dynamic-function("f0"):
  message "true".
end.


is converted as:

            if ((ControlFlowOps.invokeFunction("f0")).booleanValue())
            {
               message("true");
            }
            if ((ControlFlowOps.invokeDynamicFunction("f0")).booleanValue())
            {
               message("true");
            }

            LogicalExpression whileClause0 = new LogicalExpression()
            {
               public logical execute()
               {
                  return ControlFlowOps.invokeFunction("f0");
               }
            };

            doWhile("loopLabel0", whileClause0, new Block()
            {
               public void body()
               {
                  message("true");
               }
            });

            LogicalExpression whileClause1 = new LogicalExpression()
            {
               public logical execute()
               {
                  return ControlFlowOps.invokeFunction("f0");
               }
            };

            repeatWhile("loopLabel1", whileClause1, new Block()
            {
               public void body()
               {
                  message("true");
               }
            });

            loopLabel2:
            while ((ControlFlowOps.invokeDynamicFunction("f0")).booleanValue())
            {
               message("true");
            }

            LogicalExpression whileClause2 = new LogicalExpression()
            {
               public logical execute()
               {
                  return ControlFlowOps.invokeDynamicFunction("f0");
               }
            };

            repeatWhile("loopLabel3", whileClause2, new Block()
            {
               public void body()
               {
                  message("true");
               }
            });

For such cases (and only such cases), I think we should emit special APIs:
  • ControlFlowOps.invokeLogicalDynamicFunction tries to cast the returned value to logical (following the rules above).
  • ControlFlowOps.invokeLogicalFunction will try to cast the returned value directly to logical, and will throw a RuntimeException if it fails (because simple function calls in such IF and WHILE cases always must return logical, else would be 4GL compile error).
  • an addition is the SUPER function used in the if super() and while super() cases, for which we will also need a special API, ControlFlowOps.runSuperLogical

#79 Updated by Constantin Asofiei over 11 years ago

Another idea for the IF <func-name>() case would be: why not emit directly something like if (((logical)ControlFlowOps.invokeFunction(...).booleanValue()) ? This way the conversion code will cast the result to the logical type. For the IF DYNAMIC-FUNCTION() case, at runtime we have no idea we need to evaluate the result as logical (which can't be done in a simple cast), so the emitted code will be if (ControlFlowOps.invokeLogicalDynamicFunction(...).booleaValue() ... or, I think better would be if (ControlFlowOps.invokeDynamicFunction(...).castLogical().booleanValue(). And we can use this for the simple function calls too, as in if (ControlFlowOps.invokeFunction(...).castLogical().booleanValue()

#80 Updated by Greg Shah over 11 years ago

If there is a way to emit this information at parse time, it's fine with me.

I will add a task to add a readOnly flag to the parser for all attributes, as a 3rd parm to sym.addAttributeOrMethod(). Then the parser will also need a change to annotate this flag. Finally, post-parse fixups will have to change to use the flag instead of the function.

In regard to the function/dynamic-function returning logical thing I don't understand from your example:

do while f0():
  message "true".
end.

...

do while dynamic-function("f0"):
  message "true".
end.

Why does the first convert to a BlockManager.doWhile() but the second converts to a Java while() loop? If the dynamic-function can raise a 4GL condition, then this seems wrong.

As far as a solution, I agree that using the if (ControlFlowOps.invokeDynamicFunction(...).castLogical().booleanValue() and if (ControlFlowOps.invokeFunction(...).castLogical().booleanValue() is the best approach. It allows the logic for transformation of the various return data to be "morphed" into a logical result. I would also say that there should be 2 versions of it:

castLogical() for when we need to pass the result through to an operator
castBoolean() which would be a shorter form of castLogical().booleanValue(), which would only be used in the cases where a direct boolean result is needed

#81 Updated by Constantin Asofiei over 11 years ago

Why does the first convert to a BlockManager.doWhile() but the second converts to a Java while() loop? If the dynamic-function can raise a 4GL condition, then this seems wrong.

Thanks seeing this, a DYNAMIC-FUNCTION call can raise ERROR condition. I will check why the WHILE loop doesn't convert to the proper API call.

As far as a solution, I agree that using the if (ControlFlowOps.invokeDynamicFunction(...).castLogical().booleanValue() and if (ControlFlowOps.invokeFunction(...).castLogical().booleanValue() is the best approach. It allows the logic for transformation of the various return data to be "morphed" into a logical result. I would also say that there should be 2 versions of it:

castLogical() for when we need to pass the result through to an operator
castBoolean() which would be a shorter form of castLogical().booleanValue(), which would only be used in the cases where a direct boolean result is needed

Understood.

#82 Updated by Greg Shah over 11 years ago

Feedback on the latest rules changes:

1. Did you check on the effect to Majic of removing of the move of the define parameter of a buffer in record_scoping_post.rules?

2. If the record_scoping_post change is safe to do, then it seems like parmtarget and parmidx are no longer needed.

3. Was there a reason that the post_parse_fixups additions were not put into a separate rules file (e.g. fixups/command_tokens). That is the design we are trying to stay with if it works.

#83 Updated by Constantin Asofiei over 11 years ago

1. Did you check on the effect to Majic of removing of the move of the define parameter of a buffer in record_scoping_post.rules?

Conversion is in progress, I'll let you know the results.

2. If the record_scoping_post change is safe to do, then it seems like parmtarget and parmidx are no longer needed.

I'll look into it.

3. Was there a reason that the post_parse_fixups additions were not put into a separate rules file (e.g. fixups/command_tokens). That is the design we are trying to stay with if it works.

No, there wasn't one. I'll check if it's safe to merge all the new rule-sets into a separate fixups/functions_procedures file and call this from fixups/post_parse_fixups.xml

#84 Updated by Constantin Asofiei over 11 years ago

Unfortunately, another issue appears: DYNAMIC-FUNCTION calls can be used in complex expressions (arithmetic, logical, etc), as in this example for int type:

def var i as int.
function f0 returns int.
  return 1.
end.

function f1 returns char.
  return "1".
end.

i = dynamic-function("f0") + dynamic-function("f0"). /* this evaluates to 2 */
i = dynamic-function("f0") + dynamic-function("f1"). /* this evaluates to 11 */
i = dynamic-function("f0") + dynamic-function("f0") + dynamic-function("f1"). /* this evaluates to 21 */

This means that when dynamic-function is involed, until a non-numeric value is encountered, arithmetic operations are done. When a non-numeric value is found, it concats the string representations of both values; when the assignment is done, the resulted string is converted to its numeric value, if possible. For these cases, we need MathOps APIs which take a BaseDataType as a parameter. As we no longer allow simple function calls to convert to Java method calls, we need these new MathOps APIs to be able to pass runtime regression.

I still need to test how 4GL behaves when numeric types get used with the other data types (like date, rowid, etc).

#85 Updated by Constantin Asofiei over 11 years ago

1. Did you check on the effect to Majic of removing of the move of the define parameter of a buffer in record_scoping_post.rules?

Conversion has passed, there are no differences between the generated files.

#86 Updated by Greg Shah over 11 years ago

The conversion rules no longer emit Java method calls for simple 4GL function calls because I couldn't find a good way to manage setting the proper TARGET- and SOURCE-PROCEDURE, for simple function calls cases. The idea is, the runtime needed to know if the function call came from ControlFlowOps API or from a Java method call (to set the TARGET and SOURCE-PROCEDURE handles, if it originated from a Java method call). But, considering the same function can at least be invoked using DYNAMIC-FUNCTION too, the only way was to look in the stack trace and see if ControlFlowOps was involved. But this is very expensive, and I don't think is a viable solution.

I am still unclear on why we can't make the function calls directly. When we make a function call to the corresponding converted method, the converted method is instrumented with BlockManager.*Function() calls. Can't we hide the SOURCE-PROCEDURE stuff in there? When we call DYNAMIC-FUNCTION, that certainly would have to got through ControlFlowOps, but then we can determine things by lookup like now except perhaps it would have to set some flag or pass something to bypass the "normal" BlockManager SOURCE-PROCEDURE setup. It seems like we should be able to do something like that here and the result will be MUCH better in the converted code. It also will reduce the potential problem with the castLogical() stuff.

In regard to https://projsrv01/redmine/issues/1920#note-84 (about the dynamic nature of the expression processing when dynamic-function() is called):

Yuck. The issue is going to be much more than just the MathOps. Rather, this is likely just an extension of the castLogical() issue. I suspect that there will be similar problems with using this as a parameter to other function calls (including DYNAMIC-FUNCTION calls) and so forth. The code due for milestone 4 has 12692 across 1921 files.

Progress is dynamically evaluating the expression at runtime, which is what leads us to this mess.

My quick look at the reports with these calls show the vast majority of calls have a string literal as the 1st child (the function name). In those cases, it seems like we can lookup the function return value and base our code generation on that, like are are doing with castLogical(). If that handles the majority of cases, then it may cut down the problem immensely.

The thing is that ultimately, I wonder if there are many (if any) scenarios where the data type of the overall expression result can be dynamic. Progress doesn't allow function parameters or return values to be flexible in type. Progress does not allow the left side of the assignment operator to be flexible in type and so forth. Normally, when we do type analysis (using ExpressionConversionWorker.expressionType()) we do it from the bottom up. In other words, the terminal nodes (leaves) of the tree are ultimately a typed value. Then we look at the level above them (operators, function calls...) and calculate the type there based on all the operands/parameters and so on, rolling upwards until we have the type of the sub-tree.

There is the possibility that we can infer the types in some cases by the surrounding nodes (siblings or parent) instead of only be the children. The big issue is that there are definitely cases where that won't work. BUT, all 3 examples above could be determined this way. All of them are assigned to an integer i with a + operator as the top of the right side.

I am going to look into how many of these calls are really dynamic and then we can get the scope of the problem.

#87 Updated by Constantin Asofiei over 11 years ago

First, about the DYNAMIC-FUNCTION's case: idea is, 4GL does compile such code (where the type of the operand is evaluated at runtime). From testing, only the + operator can be used with both character and numeric values (maybe because it confuses it with the concatenation operator for characters). What troubles me is that 4GL does attempt a conversion from character to numeric value, if the left-side is a numeric value... and if the text is a valid number, it will convert it OK. For the "-" operator, from what I saw it fails on the first reach of a character operand.
About your idea to look at the passed function name, determine the function definition and get its return type - from my understanding, all function/procedure search is done using the name, and not the full signature. I can't tell now, but I wonder if it is possible for a function to be defined IN handle or IN SUPER, and at the location where its body is defined, its signature has a different return type than at the original IN handle (or IN SUPER) definition. I know this is something far reach, but I want to understand how it behaves. If the return type can't be different, than we can go ahead and cast the function call to its return type (if we can't determine its definition from the passed name).

Now, about the simple function calls being converted to ControlFlowOps.invokeFunction calls - I'm still not conviced that the TARGET- and SOURCE-PROCEDURE are not needed when trying to resolve the correct external procedure to which the function belongs. I'll spend some time tomorrow to find a case where this is possible (or prove that it isn't possible). If I can prove that it isn't, then we can go back a step and allow again simple function calls to emit as Java method calls. But, if these are needed before function invocation, beside using ControlFlowOps APIs, I don't know of any other way to differentiate between the simple and dynamic function calls ... except looking into the stacktrace. You need to consider that, when the function is invoked by the ControlFlowOps APIs, they can't pass any new parameter to the BlockManager APIs, because they are not invoked directly. And setting a flag into BlockManager is not feasible either - consider nested simple and dynamic function calls, we need to differentiat between them, to allow the proper value to be set.

Another thought: if a function is defined IN SUPER or IN handle and the code uses a simple function call to invoke it, this still needs to be converted to ControlFlowOps.invokeFunction APIs. But, for such cases, we can safely cast the function call to its proper return type... and, if the conclusion will be that simple calls for functions which are not defined as IN SUPER or IN handle still need to get converted to ControlFlowOps.invokeFunction APIs, then we can cast the function call to the proper return type too.

And one final thgouth: if the code we need to convert and compile for this milestone does use DYNAMIC-FUNCTION with a non-static function name, then we have no choice and add MathOps APIs which work with BaseDataType values, to allow the code to compile. A first implementation would be to just cast the operand to its numeric value, and leave the weird casting for a future task (its a little unrealistic for someone to do this on purpose - let a function return a character and use that returned value later as a numeric value in another expression, but you never know how the code can end up after years of maintenance).

#88 Updated by Greg Shah over 11 years ago

Here are some results from my searches.

In my latest scan, there are 12690 instances of DYNAMIC-FUNCTION. Of those, 12614 are called with a "static" target (a string literal). The other 76 are "complex expressions" that really cannot be evaluated at compile time. Some of these really dynamic cases are just references to a VAR_CHAR and others are building up a function name like "set" + some-text-var.

A thought: for the time being, it may be perfectly fine to manually code a hint for each of the 76 dynamic cases. The hint would simply describe the expected return type of the function. I believe that the return type should always be the same type because of how the other features of the language are statically typed. We may find otherwise, but if this works out then it could solve the problem for us. We would still need to build the proper cast helpers and/or provide alternate forms of the operators. I like the cast methods better because there are a huge number of possible operators/functions/... that might have to be modified, but if we focus on doing the tranformation before it is passed into the operator/function, then there should be a much more limited set of "casts" needed.

What do you think?

#89 Updated by Greg Shah over 11 years ago

Now, about the simple function calls being converted to ControlFlowOps.invokeFunction calls - I'm still not conviced that the TARGET- and SOURCE-PROCEDURE are not needed when trying to resolve the correct external procedure to which the function belongs. I'll spend some time tomorrow to find a case where this is possible (or prove that it isn't possible). If I can prove that it isn't, then we can go back a step and allow again simple function calls to emit as Java method calls. But, if these are needed before function invocation, beside using ControlFlowOps APIs, I don't know of any other way to differentiate between the simple and dynamic function calls ... except looking into the stacktrace. You need to consider that, when the function is invoked by the ControlFlowOps APIs, they can't pass any new parameter to the BlockManager APIs, because they are not invoked directly. And setting a flag into BlockManager is not feasible either - consider nested simple and dynamic function calls, we need to differentiat between them, to allow the proper value to be set.

I'm not sure I fully understand what you are saying here, but I haven't gotten as deep into the search processing and super/in handle cases as you have.

The idea of a flag for BlockManager could be implemented in different ways. For example, what if we emitted a function for the simple call and a worker function that could be called by both the simple form and by ControlFlowOps? The worker would have an extra boolean flag which would be passed through to the BlockManager.*Function() call. The simple method with the original signature would call the worker with the flag set one way and ControlFlowOps would call the worker with the flag the other way. The BlockManager would know when to handle the SOURCE/TARGET-PROCEDURE stuff and when not to do so.

I going through the Java part of the update now.

#90 Updated by Constantin Asofiei over 11 years ago

I don't think DYNAMIC-FUNCTION's return type is "static"... I think it's something more like the Object type in Java. Depending on where is called, its returned value may be "cast" to something different (see the IF dynamic-function case, when the returned value is not logical - the reason why castLogical is needed).

I agree with the hints, we can use them as long as a certain dynamic-function call will always return a value of that type (i.e. all the functions reached by that call have the same return type). I guess the hints will need to specify something like: "the dynamic-function call on line x col y will return <type>". I don't have much experience using hints, it would help a lot if you can show me a quick example on how the .p.hints entry should look like (and how to access it).

#91 Updated by Constantin Asofiei over 11 years ago

About entry 89 - I need to keep in mind that at least TARGET-PROCEDURE handle might be used in cojunction with the SEARCH-TARGET mode for THIS-PROCEDURE and IN SUPER and IN handle clauses for the function definition... anyway, I will have a final conclusion about this tomorrow. But I'm not sure I understand how you want to pass that flag... consider that the BlockManager.*function API is executed only after the "method.invoke" call (which executes the method via reflection). So, if we have these two calls:

public integar bar() { ... }
public integer foo()
{
    return integerFunction(..., new Block .... public void body {  bar(); } );
}
...
ControlFlowOps.invokeFunction("foo");
foo();

How will ControlFlowOps.invokeFunction know to inform the integerFunction call (because until there we can't touch anything) that this call was "dynamic" and not "simple" ?

#92 Updated by Greg Shah over 11 years ago

When I said the DYNAMIC-FUNCTION return type is static, I mean that if the target is hard coded, then the DYNAMIC-FUNCTION call will always have the same return type and we can treat it as static.

In regard to the hints, look at com/goldencode/p2j/uast/UastHints.readUastGeneric(). This allows the creation of hints in a .hints file with a type, name and value. These hints (when present) are parsed and put into a map. They can be read by Java code directly from the UastHints instance or by TRPL code using the UastHintsWorker.

So, one way to do this (coded inside an XML file named my-program.p.hints):

<uast name="dyn-func1" datatype="string" value="integer" />
<uast name="dyn-func2" datatype="string" value="logical" />

The dyn-func1 would be the type of the 1st dynamic-function in the file my-program.p and dyn-func2 would be the 2nd entry.

It even supports String[] allowing an array to be coded in a uast element with nested array-val elements.

For a very similar example of how we use this (during call graph processing) to code the possible filenames used for RUN VALUE, see com/goldencode/p2j/uast/CallGraphWorker.CallGraph.addCallGraphNodeValueHint().

#93 Updated by Greg Shah over 11 years ago

I was thinking more along the lines of:

public integer bar() { ... }
public integer foo()
{
    return fooWorker(false);
}
public integer fooWorker(boolean dynFlag)
{
    return integerFunction(..., dynFlag, new Block .... public void body {  bar(); } );
}

When a dynamic case is required, then inside ControlFlowOps.invokeFunction("fooWorker") the runtime code would call it with an extra 1st parameter "true".

But the common case could be called just via foo().

#94 Updated by Greg Shah over 11 years ago

And of course, the BlockManager.*Function() methods would need variants that take the extra parameter.

#95 Updated by Constantin Asofiei over 11 years ago

OK, I understand now what you mean, emit workers in the generated code. About the hints - if a file contains multiple DYNAMIC-FUNCTION calls, then the hints file will contain the datatype for only those with dynamic function names and the order of the hints which define the type should be the same as the dynamic-function statements, right?

#96 Updated by Greg Shah over 11 years ago

 About the hints - if a file contains multiple DYNAMIC-FUNCTION calls, then the hints file will contain the datatype for only those with dynamic function names and the order of the hints which define the type should be the same as the dynamic-function statements, right?

Yes, the generic uast hints with a name prefix of "dyn_func" would only be used for those DYNAMIC-FUNCTION invocations that don't have a string literal function name.

Yes, the order of the "truly dynamic" DYNAMIC-FUNCTION calls would match one to one with the order and number of generic uast hints with a name prefix of "dyn_func".

#97 Updated by Greg Shah over 11 years ago

Feedback on the Java implementation:

1. StaticProxy.invoke() is missing the method level description text in javadoc.

2. ExternalProgramWrapper.isValid() is missing javadoc.

3. progress.g needs a copyright date update.

4. handle.isProcedure() needs javadoc.

5. There are some TODOs in handle, ExternalProgramWrapper... What is the plan for these?

I didn't read every line of ControlFlowOps and ProcedureManager. There was too much to get through in a first pass. However, I still skimmed those files to get the idea. I am OK with the implementation. There are a few things here or there that I might do differently, but I really don't see a major reason to change.

One general concern: I do worry that the performance of the result will suffer due to significantly more overhead in all the setup/dispatching/teardown that now goes on in ControlFlowOps. I realize that this implementation very closely matches the real 4GL behavior so that added work is necessary and important. But I also am expecting this to be an area where we look for some performance improvement in the future.

This is really, really impressive work!

#98 Updated by Constantin Asofiei over 11 years ago

Considering your idea to emit worker methods, to be able to differentiate between simple and dynamic function calls, my conclusion is to follow this plan:
1. go back a step and use Java method calls for simple function calls
2. if at a certain point we find that workers are needed (to differentiate between a dynamic and a simple call for the same mathod), then the worker methods will be put in place. Idea is, I will not spend time on SEARCH-TARGET and TARGET-PROCEDURE issue today, I will focus on making the code compile.
3. cast all ControlFlowOps.invokeFunction calls to the function's return type (as this will be emitted for simple function calls, when the function is defined IN SUPER or IN handle).
4. for DYNAMIC-FUNCTION calls with a static function name (and WITHOUT an IN handle clause), determine the function's definiton and cast the function call to its return type. Please check how many DYNAMIC-FUNCTION calls have the IN handle clause - we will need to emit hints for them too, as the function's body may not be in the current file. If the number is very high, then we need to go back and at least create new MathOps APIs. For assignments, as all BaseDataType's are Undoable, the assign(Undoable) will do the job. I'll let you know if I find other APIs which need to be added.
5. implement the hint support for DYNAMIC-FUNCTION's return type.

#99 Updated by Constantin Asofiei over 11 years ago

The major cost in ControlFlowOps is the reflection done to determine the correct method. If we cache this, we will have a major improvement in dynamic procedure/function invocation.

The TODO's which are left in these files are:
  • collect_name.rules
    1. add support for the FUNCTION ... MAP TO actual-name IN handle clause - are there such uses in the server files we need to convert ?
    2. when collect the name of the buffer from a BUFFER FOR <buf-name> parameter, the name must be taken from the location where the buffer was defined (idea is, it must use the same case sensitivity, as this is how is displayed in the signature)
  • ControlFlowOps.java
    1. all invoke* APIs must return a BaseDataType value. What if a function is called and it returned a non-BDT value ? Should we log an error and raise a STOP condition (or maybe RuntimeException) ?
    2. cache the methods by class/name/signature - this is for future improvements
  • ExternalProgramWrapper
    1. getType - this one is obsolete, was added to rembember to check if TYPE can be anything else other than "PROCEDURE", for procedure handles.
    2. getSignature - that should return blank, I've fixed it
    3. the currentWindow and setProcCurrentWindow - that is left for when CURRENT-WINDOW will be implemented
  • handle.java
    1. populateApis - maybe we should move this map somwhere else, as this map will need to contain all the attributes/method names. I think we should glue this with the progress.g read-only attribute work
    2. WorkArea.map - the TODO was added to keep in mind that this map poses a memory leak problem. I did not dig any more into how and why is used.
  • ProcedureManager.java - the TODO is related to SEARCH-TARGET mode, if the super-procedures list gets changed while in super calls for this mode, no more super calls are possible. I will adress this for this task.
  • SourceNameMapper.java - should we log the errors related to loading the name_map.xml file ?

#100 Updated by Greg Shah over 11 years ago

I'm generally fine with this plan. Some thoughts:

2. if at a certain point we find that workers are needed (to differentiate between a dynamic and a simple call for the same mathod), then the worker methods will be put in place. Idea is, I will not spend time on SEARCH-TARGET and TARGET-PROCEDURE issue today, I will focus on making the code compile.

Please create a testcase that shows the problem. Then create a Redmine task for the resolution and attach that testcase (you should also check it into bzr).

3. cast all ControlFlowOps.invokeFunction calls to the function's return type (as this will be emitted for simple function calls, when the function is defined IN SUPER or IN handle).

It may be worth have different variants of invokeFunction that are type specific (invokeLogicalFunction(), invokeIntegerFunction()). I think that reads a bit cleaner than the cast.

Please check how many DYNAMIC-FUNCTION calls have the IN handle clause - we will need to emit hints for them too, as the function's body may not be in the current file.

I did a search and there are 11829 DYNAMIC-FUNCTION name IN handle instances. Ouch. All 76 of the "non-static" DYNAMIC-FUNCTION calls are IN handle, but we already know we need to create hints for those.

If some of these cases also have a function declaration (for a function with the same "static" name), can we use that and only hint out the ones that are completely runtime based?

Then I randomly started checking on files. The 1st on I checked, did have a local function definition for one function and an IN SUPER declaration for the other. But the 2nd file I checked had no local/forward/IN declarations. However, the strange thing was that both files were using the same static function names! I checked the reports and there are a handful of functions used thousands of times.

But that got me wondering: how often does a developer create multiple functions of the same name that return different data types? It is possible, but it is also probably not the common case.

Key question: how many cases break this rule?

So I wrote a TRPL program to map all function names to the set of all return types specified in definitions for that name. Then I made a list of all dynamic-function calls with a string literal function name. I cross-referenced the dynamic-function names to the set of possible return types. Here is what I found:

There are 286 unique string literals used in the 12614 DYNAMIC-FUNCTION calls that are "static". Of these 249 match with a name where all function definitions for that name in the entire project always return the same type. Interestingly, the other 37 instances:

- 1 instance is a function that has 2 possible return types (integer and character)
- 36 instances are of function names that have NO definition anywhere in the project!

I guess, the idea here is that we should be able to calculate the vast majority of the string literal function name used with IN handle. There is really only 1 case where we need to provide a hint. And the 36 cases need to be reported back to the customer as these seem like bugs.

#101 Updated by Constantin Asofiei over 11 years ago

It may be worth have different variants of invokeFunction that are type specific (invokeLogicalFunction(), invokeIntegerFunction()). I think that reads a bit cleaner than the cast.

OK, I can do that. I guess the same should apply for invokeDynamicFunction - there should be invokeDynamicLogicalFunction(), etc

There are 286 unique string literals used in the 12614 DYNAMIC-FUNCTION calls that are "static". Of these 249 match with a name where all function definitions for that name in the entire project always return the same type.

This means that we need to collect all the function definitions from the entire project sometime before annotations phase, to be able to determine the return type when the name is used by DYNAMIC-FUNCTION during annotations phase. Can a dictionary be populated and survive for the annotations.xml phase to use it?

Interestingly, the other 37 instances:

- 1 instance is a function that has 2 possible return types (integer and character)

Hints will be needed for each location where this function is invoked (unless we can use a per-project hint file).

#102 Updated by Greg Shah over 11 years ago

 FUNCTION ... MAP TO actual-name IN handle clause - are there such uses in the server files we need to convert ?

I checked. This construct is NOT used at all.

What if a function is called and it returned a non-BDT value ? Should we log an error and raise a STOP condition (or maybe RuntimeException) ?

Yes, log it and raise a stop condition.

populateApis - maybe we should move this map somwhere else, as this map will need to contain all the attributes/method names. I think we should glue this with the progress.g read-only attribute work

OK.

SourceNameMapper.java - should we log the errors related to loading the name_map.xml file ?

Not right now, leave TODOs there.

This means that we need to collect all the function definitions from the entire project sometime before annotations phase, to be able to determine the return type when the name is used by DYNAMIC-FUNCTION during annotations phase. Can a dictionary be populated and survive for the annotations.xml phase to use it?

Annotations is designed such that every rule-set is run against the entire set of source files before the next rule-set starts. This means that as long as we do the gathering early enough in annotations to be before you need it, it will work.

I think a simple map of function names to their set of possible return types will do in this case (you probably don't need a dictionary). Please see the attached file for the code I used to create the list. So long as this variable is defined in annotations.xml, you will be able to use it from multiple "downstream" rule-sets.

Hints will be needed for each location where this function is invoked (unless we can use a per-project hint file).

Yes and yes. The hints are designed to be specified in a per-file or directory basis. And there can be a hints file in every directory from the file up to the project root. Please see the class javadoc for UastHintsWorker for how this works. Plus, we can always add a custom parameter to the global configuration (p2j.cfg.xml) and then access it using the Configuration class. But worst case, per location hints will be used.

Please create additional tasks for each of the open TODOs (which we have decided you won't do right now). Link them back to this task.

#104 Updated by Constantin Asofiei over 11 years ago

Annotations is designed such that every rule-set is run against the entire set of source files before the next rule-set starts. This means that as long as we do the gathering early enough in annotations to be before you need it, it will work.

I don't think I understand how to make annotations.xml execute (before executing the other rule-sets in annotations.xml)a rule-set for each and every file to collect the data and then pass this to the other rule-sets. I've added a rule-set just after the init-rules node in annotations.xml, to collect the data, but this doesn't work. This new rule-set is executed for each file, at the same time with all other rule-sets in annotations.xml. See in attached file how my changes look.

I think what we need to do is to collect and save this map in a xml (or other serializable) form before annotations.xml is executed (maybe part of the post-parse-fixups), and in annotations.xml's init-rules load this xml in memory and save it in a map variable, accessible for all rule sub-sets executed by annotations.xml.

#105 Updated by Greg Shah over 11 years ago

Yes, I think you are right. We process 1 AST sequentially through each rule-set.

I think what we need to do is to collect and save this map in a xml (or other serializable) form before annotations.xml is executed (maybe part of the post-parse-fixups), and in annotations.xml's init-rules load this xml in memory and save it in a map variable, accessible for all rule sub-sets executed by annotations.xml.

Yes, this makes sense.

Don't forget that function names are matched case-insensitively.

#106 Updated by Constantin Asofiei over 11 years ago

This version does the following:
- a few fixes for issues discovered when using MAJIC
- collect the return types for all functions across the project and save them to a file
- emit different invokeFunction and invokeDynamicFunction APIs based on the function's return type (from the function definition or from the file's hints)
- a fix attempt in record_scoping_prep.rules record_scoping_post.rules, but this doesn't showed some issues, so it is de-activated.

#107 Updated by Eric Faulhaber over 11 years ago

Constantin, as discussed, I have added isProxyClass(Class) to the ProxyFactory class and attached the update here. I have compiled, but not tested it. Please let me know if it works for your purposes.

#108 Updated by Constantin Asofiei over 11 years ago

Eric, thanks, I've tested it with the class of P2J frame instance and it returns true, so it will work with the class of a DMO instance to.

#109 Updated by Constantin Asofiei over 11 years ago

Greg: can you ran a report and let me know which attributes/methods are used in the project ? I want to make sure we provide implementations for all. If you can also check which attributes are used in assignments (on the left side) and which methods are used in complex expressions, would be great (at least some of the query-handle related methods return a value, while our implementation returns void).

#110 Updated by Constantin Asofiei over 11 years ago

Don't forget that function names are matched case-insensitively.

About the case-insensitivity: I see that this applies for internal procedures too (they are matched case-insensitively). I need to double check the SourceNameMapper changes, to be sure the search is done right. From what I see, external program names can be case sensitive or insensitive, based on a the "file-system/case-sensitive" directory setting, right ?

#111 Updated by Greg Shah over 11 years ago

From what I see, external program names can be case sensitive or insensitive, based on a the "file-system/case-sensitive" directory setting, right ?

Yes, exactly.

 can you ran a report and let me know which attributes/methods are used in the project ? I want to make sure we provide implementations for all.

I like the way you are thinking.

However, we already have these split up across many other tasks. Since we need to also provide stub implementations so that the compilations will work, we thought it would be best to implement them as sets of related functionality (e.g. all the sockets related features get conversion support at the same time). Do you still need to list, or can we finalize your current work without it?

#112 Updated by Constantin Asofiei over 11 years ago

I ask this because I noticed that some attributes are read-only for some handle types and writable for others (thus we need to make sure we don't assume a read-only attribute, when it is used as writable). For some other attributes, we emit APIs for them, but they are not implemented (like height-chars, virtual-height/width-chars, virtual-height/width-pixels). Also, there are attributes like FGCOLOR and BGCOLOR which are stubbed only in WindowWidget class, but they are used by widgets too.

My work depends on this only to the extent where a attribute/method might be accessed via a handle and this might need to rework/add new handle.unwrap APIs or to re-work/add APIs to the interfaces.

I will submit another update later today, which is stable in terms that it almost passed regression testing in MAJIC (some tests have failed with what looks like a comon cause). If you want me to work on something else, I'm OK with putting this on hold for a while.

#113 Updated by Greg Shah over 11 years ago

Let's go ahead and finalize your update and get it regression tested and checked in. If we have rework for some of the handle unwrapping or APIs, we will make changes later. There are many updates pending and I really want to get yours in next.

#114 Updated by Constantin Asofiei over 11 years ago

Issues fixed/changed by this update:
  1. function/internal procedures are searched case-insensitive (both in the conversion phase where function's return type is saved/determined and in the SourceNameMapper APIs)
  2. fixed the record_scoping_post.rules problem (the buffer_scope node just needed to be moved near the define_parameter node)
  3. when ? literal is used as the handle for DYNAMIC-FUNCTION, FUNCTION ... IN handle or RUN ... IN handle, emit a new handle() instance instead of new unknown()
  4. fixed RecordBuffer.getDMOInterfaceForObject
  5. removed all the APIs which accepted a BaseDataType instance instead of handle instance (there were APIs for these statements, which accept a handle: ADD/REMOVE-SUPER-PROCEDURE, RUN IN handle, FUNCTION ... IN handle, DYNAMIC-FUNCTION). These are no longer needed, the reason they were added in the first place is no longer valid.
  6. in hand written-code, we can end up with executing "external procedure" blocks in Java static methods, called outside a real external procedure instance. In such cases, the Block instance has no this$ field; thus, we assume that the previous THIS-PROCEDURE is the correct one, so we return the top of the thisProcedures stack. (see the cfg.Config.tempUserDef method in MAJIC - it executes an Block.externalProcedure).
  7. fixed handle validation when both the target and the handle are invalid, in a RUN or DYNAMIC-FUNCTION case - handle validation is done first.
Remaining steps:
  • focus on fixing regression errors
  • finish creating the SEARCH-TARGET tests and add a new sub-issue if something is not working

#115 Updated by Greg Shah over 11 years ago

More research and thoughts on dynamic-function() return types:

1. I have been looking at all the call sites for the 36 NODEF cases (DYNAMIC-FUNCTION with a string literal function name that does not exist in the project). I have found these cases:

/* the return type can be determined by the lvalue type */
my-var = DYNAMIC-FUNCTION("something" IN some-handle).

/* the return type doesn't matter, because this is just a standalone expression executed to get the side-effects only */
DYNAMIC-FUNCTION("something" IN some-handle).

/* the DYNAMIC-FUNCTION return type can be determined by the function return type */
RETURN DYNAMIC-FUNCTION("something" IN some-handle).

/* the return type must be comparable to a character value */
IF DYNAMIC-FUNCTION("something" IN some-handle) = "" THEN

/* the return type must be comparable to the my-field type */
FOR EACH table WHERE my-field = DYNAMIC-FUNCTION("something" IN some-handle).

/* the following all show an explicit coercion of the return value to a specific type */
/* option1: we can eliminate the STRING()/INTEGER()/DATE() and do the conversion in the runtime */
/* option2: implement BaseDataType versions of these constructors (where not already available) and let the runtime to the work */
STRING(DYNAMIC-FUNCTION("something" IN some-handle)).
INTEGER(DYNAMIC-FUNCTION("something" IN some-handle)).
DATE(DYNAMIC-FUNCTION("something" IN some-handle)).

I believe we can calculate the return value of all of these at conversion time. We can either enhance the ExpressionConversionWorker.expressionType() processing or add a specific rule-set to calculate this and leave behind some annotations.

2. Looking at all the call sites for the 1 multiple return type case (DYNAMIC-FUNCTION with a string literal function name that has 2 possible return types in the project). This case appears in 2 files and both files have code that looks like this:

FUNCTION some-func RETURNS INTEGER ( my-parm AS CHARACTER ) FORWARD.

my-var = SUBSTRING(text-var, 1, DYNAMIC-FUNCTION("some-func" IN TARGET-PROCEDURE, text-var)).

header-var = header-var + FILL("-", DYNAMIC-FUNCTION("some-func" IN TARGET-PROCEDURE, text-var)) + " ".

FUNCTION some-func RETURNS INTEGER ( my-parm AS CHARACTER ):
   ... real code here ...
END.

The other instance of a function with this some-func name is in a completely unrelated file and it is not called via DYNAMIC-FUNCTION. The question here: can we use the IN TARGET-PROCEDURE "clue" to have us give priority to the "local" some-func? The other thing we can see is that it is used in places that require an integer value. AND we can look at the parameters passed to DYNAMIC-FUNCTION to see if we can disambiguate the signatures. I guess we can encode hints for this if we have to, but I prefer to calculate the right answer if possible

3. Which leaves us with the 76 non-static calls. I have reviewed each of the call sites. Virtually all the cases are actually a subset of those in item 1 above:

my-var = DYNAMIC-FUNCTION("something" IN some-handle).
DYNAMIC-FUNCTION("something" IN some-handle).
RETURN DYNAMIC-FUNCTION("something" IN some-handle).
STRING(DYNAMIC-FUNCTION("something" IN some-handle)).

But in addition there are a couple of instances of this:

IF DYNAMIC-FUNCTION("something" IN some-handle) = ? THEN

We can't tell the return type of the function BUT we can easily return BaseDataType and test for unknown.

With this in mind, I think we can build rules to handle all possible cases of DYNAMIC-FUNCTION without hints. What do you think?

#116 Updated by Greg Shah over 11 years ago

Code review feedback:

1. I'm not sure how important it is, but it may be better to use ExpressionConversionWorker.progressToJavaString() intsead of removeQuotes() in functions.rules. And I think you should also use it before you store off the name in the XML file. The call sites don't have to use the same quote chars for their string literals.

2. Is there still a buffer scoping issue caused by the changed code in record_scoping_post.rules? The TODO suggests so, but your history in this task said otherwise.

3. Why removeNote("casttype") in builtin_functions.rules? It is useful information to leave behind in the AST.

#117 Updated by Constantin Asofiei over 11 years ago

About the review feedback:
1. in functions.rules I need the 4GL function name from the static string passed to the DYNAMIC-FUNCTION call. When collecting the function's return type in the post_parse_fixups phase, I take the function's name directly from the kw_funct's "name" annotation (so for this case I don't need to remove any quotes). Going back to the DYNAMIC-FUNCTION's case, ecw.progressToJavaString is not used when the "name" annotation is saved at the function definition, in the parser phase... so, unless I use ecw.progressToJavaString for both cases (functions.rules and the function return type save in fixups/functions_procedures.rules), I don't think this will be OK. In 4GL, can a function name contain escape characters ?

2. I forgot to delete the TODO, I removed it now.

3. If I don't remove the "casttype" annotation, it will add an explicit cast node (see lines 930), even if it uses special ControlFlowOps.invoke APIs, based on the function's return type (in DYNAMIC-FUNCTION case). But I can move it in a "ignoredcasttype" annotation.

#118 Updated by Constantin Asofiei over 11 years ago

1. please check the return types for all functions - currently, only these return types are supported: integer, decimal, character, logical, date, raw, memptr, handle

2. DYNAMIC-FUNCTION("something" IN some-handle) - for this, I already emit the generic ControlFlowOps.invokeDynamicFunctionIn API, as there is no need to disambiguate based on the returned type

3. RETURN DYNAMIC-FUNCTION("something" IN some-handle) - no special API is needed, as the BlockManager.returnNormal API can get any BaseDataType instance.

4. STRING(DYNAMIC-FUNCTION("something" IN some-handle)) - character.valueOf(BaseDataType) is emitted for such cases, thus no special API needs to be emitted

5. INTEGER(DYNAMIC-FUNCTION("something" IN some-handle)) and DATE(DYNAMIC-FUNCTION("something" IN some-handle)) - I prefer to emit BaseDataType c'tors for integer and date classes, and let the runtime do the work. If the conversion can determine the return type for the function, it will still emit the dedicated API based on the function's return type (which may be different than integer or date).

5. my-var = DYNAMIC-FUNCTION("something" IN some-handle) - if the function's return type can not be determined, then the ControlFlowOps.invokeDynamicFunction APIs (which return a BaseDataType) will be emitted; as all BaseDataType values are Undoable, the BaseDataType.assign(Undoable) does the job - so no special work needed

6. IF DYNAMIC-FUNCTION("something" IN some-handle) = "" THEN - this is converted using CompareOps._isEqual(BaseDataType, String) API, so no special changes are needed, regardless if the function's return type can be determined or not.

7. FOR EACH table WHERE my-field = DYNAMIC-FUNCTION("something" IN some-handle) - at a first glance, no special changes should be needed regardless if the return type can be determined or not, because all where clause arguments are passed in a Object[] array. The only problem is we need to upper-case the result, when the function returns a character - and for this I think we need to rely on the field's type.

But there is a conversion problem, because for a code like:

for each tt1 where tt1.f1 = dynamic-function("fi"): end.

the converted code looks like:
            forEach("loopLabel17", new Block()
            {
               AdaptiveQuery query17 = null;

               public void init()
               {
                  query17 = new AdaptiveQuery(tt1, "tt1.f1 =", null, "tt1.id asc");
               }

               public void body()
               {
                  query17.next();
               }
            });

The converted code does compile safely, but I don't know why the dynamic-function call is not emitted properly as a substitute parameter in the converted where clause and as an entry in the new Object[] {} array. Eric: can you give me a hint where to look ?

8. IF DYNAMIC-FUNCTION("something" IN some-handle) = ? THEN - no special attention is needed, as this will convert using CompareOps._isUnknown(BaseDataType), so we are covered regardless if the return type can be determined or not.

9. about your case 2 - I will come up with an answer later today.

#119 Updated by Eric Faulhaber over 11 years ago

Constantin Asofiei wrote:

7. FOR EACH table WHERE my-field = DYNAMIC-FUNCTION("something" IN some-handle)
[...]
The converted code does compile safely, but I don't know why the dynamic-function call is not emitted properly as a substitute parameter in the converted where clause and as an entry in the new Object[] {} array. Eric: can you give me a hint where to look ?

Where clause analysis and query substitution parameter extraction happens in a series of rule sets in annotations:

  • where_clause_pre_prep.rules
  • where_clause_prep.rules
  • where_clause_prep2.rules
  • where_clause.rules
  • where_clause_post.rules
  • where_clause_post2.rules

As you can see, the analysis happens in layers. Even though not all of them are directly relevant to what you are looking for, you probably should look at them in order to understand what the overall process is.

#120 Updated by Greg Shah over 11 years ago

In 4GL, can a function name contain escape characters ?

Not as far as I know.

This should not affect your work, but it should be mentioned that the 4GL does support what I call "malformed" symbols as names of functions and procedures (as well as frames, labels and queries). A malformed symbol doesn't use the documented valid symbol characters. Instead it starts with one or more NUM_LITERAL (e.g. 878796), DEC_LITERAL (3434.555), DATE_LITERAL (12-12-2053), MINUS (-) or UNKNOWN_TOKEN ('`' | '!' | '#' | '$' | '%' | '&' | '{' | '}' | '|' | ';') and then it ends with a symbol or even a reserved keyword! There cannot be any intervening whitespace between any of the tokens.

But I can move it in a "ignoredcasttype" annotation.

OK.

only these return types are supported: integer, decimal, character, logical, date, raw, memptr, handle

Please add recid, rowid and int64.

I will have to merge my datetime, datetimetz changes with yours and I will handle those changes at that time. Later (this week I hope), we will add longchar support too.

#121 Updated by Constantin Asofiei over 11 years ago

Please add recid, rowid and int64.

Note that I will need to add BlockManager.*function APIs for these types too.

For case 2 on note 115: you are correct, in such cases we need to determine the parameter's type and wrap the ControlFlowOps.invokeDynamicFunction call in i.e a new integer(ControlFlowOps.invokeDynamicFunction(...)) call. This always needs to be done, even if we determine the function's return type, as 4GL seems to "auto-cast" the returnd value to the parameter's type, even if the function returned something of different type.

What I solved/changed with this update:
  • fixed DYNAMIC-FUNCTION conversion used in a WHERE clause. Note that ExpressionConverWorker supports only these builtin functions: ABS, ACCUM, MIN, MAX, IF, INPUT, DYNAMIC-FUNCTION. If you didn't already, we need to check what other builtin funcs are used in a WHERE clause.
  • always emit the parameter modes as a string value (so they will never emit as a character instance). Cleaned up some (of my) dead code with this too.
  • do not emit parameter modes for simple function calls, even if ControlFlowOps APIs are used (if modes are incorrect, there will be compile error in 4GL);
  • wrap dynamic-function calls using the proper c'tor in cases when is used as parameter for the SUBSTRING or FILL builtin functions
  • first pass at fixing nested function calls, when simple and dynamic-function is involved
  • I've modified the function's return type lookup process to search using these steps:
    1. Build a map of function names to their return map for the current procedure. This local map is used to determine the return type for all simple function calls and as a first search for dynamic-function calls.
    2. if not found in the local map and this is a dynamic-function call, the global, per-project, map is searched.
    3. if not found in the global map and is a dynamic-function call, the file's hints is used. Thus, the hint lookup is left as a last resort.
The hints will not be needed in the following cases (and this are the cases when warnings will not be logged either), regardless of function name being static or from an expression:
  1. function calls outside an expression (as in f0()., dynamic-function(). or super().)
  2. return dynamic-function(). case
  3. var = dynamic-function(). case (i.e. a simple assignment)
This and the new function-name resultion steps should cover all our cases, so hints will not be needed for this project. With these changes, hints will be needed only in DYNAMIC-FUNCTION case when return type can not be determined, as the following applies:
  • the function name is dynamic OR the function name is static and the same name is used to define two or more functions, with different return types, and there is no function with that name defined in the current procedure, and
  • the DYNAMIC-FUNCTION call is not used in one of the cases mentioned above.

#122 Updated by Constantin Asofiei over 11 years ago

... it almost passed regression testing in MAJIC (some tests have failed with what looks like a comon cause).

My conclusion is that the tests have failed because of lightning load (and maybe because of some of the additional work inherent with the reflection invocation), as a subsequent regression testing with yesterday's fix showed the failed tests as passed.

#123 Updated by Greg Shah over 11 years ago

Code review feedback:

1. In ExpressionConversionWorker, the history entry needs to be 036. It also should not be specific to WHERE clause processing (see 2 below).

2. In ExpressionConversionWorker, the error message that starts with "WARNING: WHERE clause needs return type" is misleading. The expressionType() method is not just used for WHERE clause processing. It is a general purpose sub-expression type processor. Just change this to be "WARNING: 'casttype' return type annotation expected".

Otherwise it is good.

#124 Updated by Constantin Asofiei over 11 years ago

This version adds support for functions returning int64, rowid and recid. Note that if such functions are used with DYNAMIC-FUNCTION and is passed as parameter to other functions or assigned to variables, the converted code will not compile, as not all P2J data types have BaseDataType c'tors. The only fully supported types are integer and date (because of the DATE and INTEGER functions, which convert to new date and new integer). As STRING function converts to character.valueOf, I didn't add a BaseDataType c'tor to character class.
To be safe, I suggest to add no-op BaseDataType c'tors to all P2J data type classes, and implement them as needed on task #1972.

#125 Updated by Greg Shah over 11 years ago

Code review feedback: the latest code looks good.

To be safe, I suggest to add no-op BaseDataType c'tors to all P2J data type classes, and implement them as needed on task #1972.

I agree. Please add a history entry providing the suitable details.

Questions (in general):

1. What is still left to be done on this task?

2. I don't want to open a "can of worms" here, but I wonder if we can reduce the number ControlFlowOps APIs using generics. It seems the primary problem we are solving with the invokeTYPENAME*() methods is to ensure that the return type is properly set for javac. For example, this is how your code does it now:

invokeDateDynamicFunction("func-name", arg1, arg2, arg3)

But what if we generated the code like this instead:

invokeDynamicFunction(date.class, "func-name", arg1, arg2, arg3)

This would allow us to implement a single common worker method (and eliminate a whole range of other methods):

public static <T> T invokeDynamicFunction(Class<T> clazz, String name, Object... args)
{
   ... the return type processing could use reflection on the .class instance as needed ...
}

Most importantly, it would MASSIVELY cut ControlFlowOps down in size, with very little negative impact on the generated code. This becomes important for maintenance and for enhancements (we are about to add datetime, datetimetz and longchar).

What do you think? This should work for all of the type-specific APIs in ControlFlowOps, not just the invokeDynamicFunction().

#126 Updated by Constantin Asofiei over 11 years ago

[...]

I agree. Please add a history entry providing the suitable details.

Understood.

1. What is still left to be done on this task?

  1. I there is time, I would like to get confortable with current SEARCH-TARGET implementation
  2. don't know if you did this already or not, but, as a note, we should identify all builtin functions used in WHERE clauses; if they are not explicitly referenced in ExpressionConversionWorker (as I referenced kw_dyn_func), the converted where clause will be messed up (for the runtime, not for the compile time).
  3. the other subtasks I added here

2. I don't want to open a "can of worms" here, but I wonder if we can reduce the number ControlFlowOps APIs using generics. It seems the primary problem we are solving with the invokeTYPENAME*() methods is to ensure that the return type is properly set for javac. For example, this is how your code does it now:
[...]
What do you think? This should work for all of the type-specific APIs in ControlFlowOps, not just the invokeDynamicFunction().

Hmm... yes, this will work. And I think we should change BlockManager.*function APIs too (and the way BlockManager handles with per-type implementation was the reason I didn't think twice on how to implement this). Btw, as a side-note, I didn't write each and every function version by hand, I "templated" the generic versions in a file, ran a script and generated all their per-type versions automatically...

#127 Updated by Constantin Asofiei over 11 years ago

And another thought about using generics: I would like to leave the non-generic versions of the functions in place, to allow function calls which return BaseDataType values, without forcing it to specify a return type.

#128 Updated by Greg Shah over 11 years ago

I there is time, I would like to get confortable with current SEARCH-TARGET implementation

I think we need to move on to other things for the next 2-3 weeks, until we get everything compiling. Let's leave this one piece of the runtime until then.

don't know if you did this already or not, but, as a note, we should identify all builtin functions used in WHERE clauses; if they are not explicitly referenced in ExpressionConversionWorker (as I referenced kw_dyn_func), the converted where clause will be messed up (for the runtime, not for the compile time).

Understood. I will add a suitable task for this.

 I would like to leave the non-generic versions of the functions in place, to allow function calls which return BaseDataType values, without forcing it to specify a return type.

Actually, I meant to handle this case but the code I wrote was not correct. I think this form of my example is better:

public static <T extends BaseDataType> T invokeDynamicFunction(Class<T> clazz, String name, Object... args)
{
   ... the return type processing could use reflection on the .class instance as needed ...
}

But I think you still might need a different version to eliminate the clazz parm (if I understand you correctly). Do what you think is best.

I want to get this into testing as soon as Eric's is through. That may also require merging of code. Do you think we can get this into testing tomorrow?

#129 Updated by Constantin Asofiei over 11 years ago

  • File ca_upd20130123e.zip added

Actually, I meant to handle this case but the code I wrote was not correct. I think this form of my example is better:

Yes, this is how I wanted to implement it too.

But I think you still might need a different version to eliminate the clazz parm (if I understand you correctly). Do what you think is best.

Correct, I need the version without the clazz parm.

I want to get this into testing as soon as Eric's is through. That may also require merging of code. Do you think we can get this into testing tomorrow?

Yes, I'm almost done emitting the .class reference in the conversion rules, and I hope to leave a conversion on lightning over night.

#130 Updated by Constantin Asofiei over 11 years ago

  • File deleted (ca_upd20130123e.zip)

#131 Updated by Constantin Asofiei over 11 years ago

I've deleted 0123e.zip, as it wasn't supposed to get uploaded.

#132 Updated by Constantin Asofiei over 11 years ago

OK, now I have the correct version of 0123e.zip, which does the following:
  1. ControlFlowOps APIs are re-written to use generics and the return class is emitted as first parameter, as needed. I have some tests which I can't run now, but will test them tomorrow morning.
  2. added TYPENAME c'tors for all classes

#133 Updated by Constantin Asofiei over 11 years ago

Added testcases used for #1920 and #1921. bzr repository was updated too.

#134 Updated by Constantin Asofiei over 11 years ago

This update does the following:
  1. removed the bogus next-sibling/prev-sibling attributes from progress.g and methods_attributes.rules
  2. fixed the .class reference emitted for DYNAMIC-FUNCTION cases (now it emits this reference in the same cases as when the per-TYPENAME APIs were emitted)

This is OK to go in testing, if there is nothing for me to merge first.

#135 Updated by Greg Shah over 11 years ago

I am OK with the update.

But Eric's buffer handle changes are currently in staging and the testing yesterday may have found 3 broken tests. So it may be that he has changes to make.

Do you have a clean staging_ca build where we can test this update? If so, please go ahead. If this passes testing, then you can check it in and distribute it. And in that case we will reset staging to your build and merge Eric's changes on top of it. If your build has trouble, we will see who gets done first and the other person will be merging.

#136 Updated by Greg Shah over 11 years ago

  • % Done changed from 30 to 90

#137 Updated by Constantin Asofiei over 11 years ago

staging_ca/1920/staging is a staging copy, without Eric's update (is almost in sync with what is in bazaar, only some javadoc related changes are missing, which I guess were added after I made the copy). The 1920/staging.merged4/ folder contains 1920/staging + ca_upd20130124b.zip - I can test there once current regression testing is finished (in ~2 hours). 1920/staging can be safely used if you need to check the generated sources in /gc/convert/staging/ for differences.

#138 Updated by Greg Shah over 11 years ago

Apply the missing changes from bzr to 1920/staging.merged4/ and start testing there as soon as the TIMCO run finishes.

#139 Updated by Constantin Asofiei over 11 years ago

GSO_419 and tc_po_receiving_009 have failed because I think an ERROR condition (caused in a external program invocation which tried to init a shared var which wasn't created) didn't propagate properly up the stack. Will search the root cause tomorrow morning my time.

#140 Updated by Constantin Asofiei over 11 years ago

The regressions were caused by the fact that I didn't push the correct this-procedure when a trigger was invoked and that the ConditionException's raised during external program intantiation were ignored, instead of propagating up the stack. The attached update fixes this. I will re-test as soon as Eugenie finishes.

#141 Updated by Constantin Asofiei over 11 years ago

This should not affect your work, but it should be mentioned that the 4GL does support what I call "malformed" symbols as names of functions and procedures (as well as frames, labels and queries). A malformed symbol doesn't use the documented valid symbol characters. Instead it starts with one or more NUM_LITERAL (e.g. 878796), DEC_LITERAL (3434.555), DATE_LITERAL (12-12-2053), MINUS (-) or UNKNOWN_TOKEN ('`' | '!' | '#' | '$' | '%' | '&' | '{' | '}' | '|' | ';') and then it ends with a symbol or even a reserved keyword! There cannot be any intervening whitespace between any of the tokens.

Just a note: P2J doesn't support malformed function/procedure names yet, right ? I've tested with a NUM_LITERAL prefix, and the resulted method name is invalid for Java. Do we have these kind of "malformed" function names for the current project ?

#142 Updated by Greg Shah over 11 years ago

Do we have these kind of "malformed" function names for the current project ?

A really good question. I will write some new reports to find the answer.

#143 Updated by Constantin Asofiei about 11 years ago

The attached update is merged with bzr revision 10154.

#144 Updated by Constantin Asofiei about 11 years ago

Passed regression testing, applied to lightning and committed to bzr revision 10155.

#145 Updated by Ovidiu Maxiniuc about 11 years ago

While working on some methods and attributes I found the following testcase that fails to convert correctly:

DEFINE VARIABLE eh AS HANDLE NO-UNDO EXTENT 10.
DEFINE VARIABLE i AS INTEGER NO-UNDO INIT 1.
/* eh initialization is not important */
DISPLAY eh[i]:NAME.

The eh[i]:NAME part will be converted into something like:

subscript(handle.unwrap(eh))i.getName()

instead of
handle.unwrap(subscript(eh, i)).getName()

#146 Updated by Constantin Asofiei about 11 years ago

Fixed the case mentioned by Ovidiu. Attached is the fix and testcase.

#147 Updated by Greg Shah about 11 years ago

I am OK with the change so long as amref is never null at line 1372. I suspect it is not ever null, but I haven't investigated it.

We have some other changes ahead of this one. Let's see if we can get it tested tomorrow.

#148 Updated by Constantin Asofiei about 11 years ago

I've added non-null protection for the amref case you mentioned. Also, I've fixed another case found by Ovidiu: BUFFER-VALUE - although in the latest OE documentation is marked as a method - in real life is an attribute, which can accept a subscript (I think someone who updated the OE docs got confused by the paranthesis). I've added a function in methods_attributes.rules where all attributes which can accept a subscript must be registered.

Another problem. In 4GL, a code which has a method name as a l-value compiles correctly:

h:get-signature("f0") = "bogus".

and on rutime gets the message:
**GET-SIGNATURE is not a setable attribute for PROCEDURE widget. (4052)

Looks like 4GL treats methods as read-only attributes. It would worth a check if there are methods as l-values in an assignment.

#149 Updated by Greg Shah about 11 years ago

I am fine with the proposed changes. There is a large update in "front" of this for staging (my datetime, datetimetz and int64 update). When that is through, you can merge and go into official conversion testing. We don't need any runtime testing.

Looks like 4GL treats methods as read-only attributes. It would worth a check if there are methods as l-values in an assignment.

I checked on this and there are no such cases. Please create a new redmine task for this "feature".

However, I do see almost 500 cases like this:

my-handle:BUFFER-FIELD("field-name"):BUFFER-VALUE = <some_expression>.

Do we properly handle chaining to an attribute as an l-value?

#150 Updated by Constantin Asofiei about 11 years ago

my-handle:BUFFER-FIELD:BUFFER-VALUE = <some_expression>.

This case is not covered yet; the method's parameters (as BUFFER-FIELD is a method in this case) are emitted in the wrong place. I'll work on a fix for it.

#151 Updated by Constantin Asofiei about 11 years ago

This update solves call chains like:

h = h:buffer-field(0):buffer-field1(1):buffer-field2(2):buffer-field3(3).
h:buffer-field(0):buffer-field1(1):buffer-field2(2):buffer-field3(3):buffer-value = "bogus".

#152 Updated by Greg Shah about 11 years ago

Looks good.

#153 Updated by Constantin Asofiei about 11 years ago

Added tests for the subscript and the method chain cases.

#154 Updated by Constantin Asofiei about 11 years ago

Conversion regression testing has passed (no changes in the generated sources).

#155 Updated by Greg Shah about 11 years ago

I know that recently we have had some issues related to handle chaining as well as with handle usage as an l-value. No Vadim may have found some other cases.

I am wondering how much of this is due to our static approach to unwrapping.

In addition, Eric pointed out that the generated code would look even better if we did this:

myHandle.unwrapMyType().whatever()

instead of this:

handle.unwrapMyType(myHandle).whatever()

In particular, it can also make chaining read better too. my-handle:BUFFER-FIELD:BUFFER-VALUE is a common case. I think we would convert it like this today (I'm not sure about the method names):

handle.unwrapBufferField(handle.unwrapBuffer(myHandle).getBufferField("name"))).getValue()

But without our static approach, it might look like this:

myHandle.unwrapBuffer().getBufferField("name").unwrapBufferField().getValue()

This is shorter and reads better since its chaining is more intuitive.

So the key questions here: how much of the current issues are related to the static vs instance approach and should be consider making a change sooner rather than later.

#156 Updated by Constantin Asofiei about 11 years ago

I agree chained unwrap and method calls instead of nested unwrap calls will look better, and the change is pretty simple (just emit java.method_call instead of java.static_method_call for the unwrap nodes), but the root cause is related to the fact that the generated Java ASTs for these chains must be "upside down" when compared to the 4GL ASTs. For example:

h:next-sibling:prev-sibling:name.

has this hierarchy:
         expression [EXPRESSION] @0:0
            : [COLON] @5:7
               h [VAR_HANDLE] @5:6
               : [COLON] @5:20
                  next-sibling [ATTR_HANDLE] @5:8
                  : [COLON] @5:33
                     prev-sibling [ATTR_HANDLE] @5:21
                     name [ATTR_CHAR] @5:34

while the generated Java AST hierarchy is "upside-down":
                        <ast col="0" id="2130303778854" line="0" text="getName" type="METHOD_CALL">
                          <annotation datatype="java.lang.Long" key="peerid" value="2121713844260"/>
                          <ast col="0" id="2130303778855" line="0" text="unwrap" type="METHOD_CALL">
                            <annotation datatype="java.lang.Long" key="peerid" value="2121713844258"/>
                            <ast col="0" id="2130303778856" line="0" text="getPrevSibling" type="METHOD_CALL">
                              <annotation datatype="java.lang.Long" key="peerid" value="2121713844258"/>
                              <ast col="0" id="2130303778858" line="0" text="unwrap" type="METHOD_CALL">
                                <annotation datatype="java.lang.Long" key="peerid" value="2121713844255"/>
                                <ast col="0" id="2130303778859" line="0" text="getNextSibling" type="METHOD_CALL">
                                  <annotation datatype="java.lang.Long" key="peerid" value="2121713844255"/>
                                  <ast col="0" id="2130303778857" line="0" text="unwrap" type="METHOD_CALL">
                                    <ast col="0" id="2130303778853" line="0" text="h" type="REFERENCE">
                                      <annotation datatype="java.lang.Long" key="peerid" value="2121713844256"/>
                                    </ast>
                                  </ast>
                                </ast>
                              </ast>
                            </ast>
                          </ast>
                        </ast>

This Java AST will convert to:
h.unwrap().getNextSibling().unwrap().getPrevSibling().unwrap().getName()

The fact that the nodes are processed and converted in reverse order, it means that we have to manually move any parameters for the attribute/method to the correct place (as they are emitted before the methods_attributes.rules has a chance to run).

#157 Updated by Constantin Asofiei about 11 years ago

Two cases Vadim has found:

hField:COLUMN-LABEL= ch.

This one should convert properly, if you add both getter and setter in methods_attributes.rules and register kw_col_lab in common-progress.rules - read_only_attribute function.

ch = hField:STRING-VALUE(1)

will be fixed too, if you register the KW_STR_VAL with the is_subscript_attr function in methods_attributes.rules.

#158 Updated by Constantin Asofiei about 11 years ago

Added an email dispatched to the team:

If you encounter problems with attributes/methods converting using an incorrect Java syntax (or you have other questions related how the attributes/methods accessed via a handle work - in a generic way), please inform me. I know some of you are working on adding support for other attributes/methods at this time, so the main info you need to know is that:

1. any writable attribute must be registered with the read_only_attribute function in common-progress.rules (this function returns true if the attribute is not a writable attribute).

2. any attribute which can accept subscript must be registered with the is_subscript_attr function in methods_attributes.rules

3. the hwrap variable in methods_attributes.rules contains only a suffix, which will be added to the "handle.unwrap" prefix. So, the final emitted method name will be "handle.unwrap".concat(hwrap). Note that if you do not explicitly set this variable for a certain keyword, it will default to "Widget".

4. finally, the main resource about how the attributes/methods get converted (in a generic way) is task #1920.

#159 Updated by Constantin Asofiei about 11 years ago

Regression testing of the ca_upd20130131d.zip changes has passed, applied to bzr revision 10162.

#160 Updated by Constantin Asofiei about 11 years ago

This update fixes a bug in dynamic-function's func name resolution, for cases like:

dynamic-function("func0":U)

The :U suffix needs to be ignored when searching for the func's name.

#161 Updated by Constantin Asofiei about 11 years ago

Handle the case of single quotes used with function names and others (removeQuotes was improved to remove enclosing single quotes too).
PS: I'm running conversion regression testing.

#162 Updated by Constantin Asofiei about 11 years ago

No changes in generated MAJIC sources. Greg, if the update looks good to you, I will release it.

#163 Updated by Greg Shah about 11 years ago

Code Feedback:

1. Strings can be qualified with more than just the :U. Your code assumes there can only be 1 qualifier. But the possibilities are actually quite complex. From our lexer:

/**
 * Matches the options that can be appended to a Progress string literal.
 * These options consist of 4 parts, some of which are optional.  String 
 * options always start with a ':' which must follow the string's ending 
 * single or double quote with no intervening whitespace.  After that
 * colon, there must be one or more of a justification, size allocation in
 * characters and a flag noting if this string is translatable or not.
 * <pre>
 *    justification          'r' | 'l' | 'c' | 't'
 *    translatable           'u'
 *    size                   integer literal (one or more contiguous digits)
 * </pre>
 * <p>
 * These options can come in any order and at least one of them must appear.
 * This causes some problems since there is no simple syntax to specify
 * such conditions in the grammar.  Whitespace cannot be placed between any
 * of the options.
 * <p>

...

STR_OPTIONS
   :

...

      (
         ':'
         (
            options { generateAmbigWarnings = false; }:
               ('r' | 'l' | 'c' | 't') ('u') (DIGIT)*
            |  ('r' | 'l' | 'c' | 't') (DIGIT)+ ('u')?
            |  ('r' | 'l' | 'c' | 't')
            |  ('u') ('r' | 'l' | 'c' | 't') (DIGIT)*
            |  ('u') (DIGIT)+ ('r' | 'l' | 'c' | 't')?
            |  ('u')
            | (DIGIT)+
              (
                   ( ('r' | 'l' | 'c' | 't') ('u')? )
                 | ( ('u') ('r' | 'l' | 'c' | 't')? )
              )?
            | {
                 // We match the EMPTY alternative if one of the valid
                 // string options is not found above!!!!!
                 // Abort! Reset state to remove the already matched ':'.
                 commitFlag = false;
                 setText( save );
                 rewind( back );
              }
         )
      )

character.progressToJavaString() handles all these cases and may be a better solution. It also handles escape sequences, doubled internal quotes and so forth...

2. The CommonAstSupport copyright date needs to change.

#164 Updated by Constantin Asofiei about 11 years ago

About issue 1: I don't think is OK to convert the string to Java, as currently I need to match with a Progress legacy function name... this would work only if the function name at the definition is processed using progressToJavaString too.

#165 Updated by Greg Shah about 11 years ago

OK. You can use the same approach as progressToJavaString() for dropping the quotes (and any following options):

      // are we starting with ' or " 
      char lead = progress.charAt(0);

      String contents = null;

      try
      {   
         // this will always work as long as the input string is valid
         contents = progress.substring(1, progress.lastIndexOf(lead));
      }

      catch (IndexOutOfBoundsException iob)
      {
         throw new IllegalArgumentException("Missing closing quote (" +
                                            lead + ").");
      }

#166 Updated by Constantin Asofiei about 11 years ago

OK, so I'm dropping the functions.rules change and I'll change CommonAstSupport.removeQuotes. Note that removeQuotes is called for:
  1. "dllname" in a EXTERNAL "dllname" clause - which is OK, as "dllname" supports options
  2. "func-name" in a DYNAMIC-FUNCTIN("func-name") clause - which is OK, as "dllname" supports options
  3. schema/p2o.xml, for date initializers (although there will never be a :option clause, is OK)
  4. ui_statements.rules match_widget_type function - which I think is for a VALUE case, which again will be OK

#167 Updated by Constantin Asofiei about 11 years ago

The removeQuotes change, I'm running regression conversion testing again.

#168 Updated by Greg Shah about 11 years ago

It looks good.

#169 Updated by Constantin Asofiei about 11 years ago

Conversion regression testing has passed, committed to bzr revision 10210.

#170 Updated by Constantin Asofiei about 11 years ago

Fixed attribute access when there are extent widgets, like:

def var i as int.
def var ch as char.
form i[1] with frame f1.
ch = i[1]:LABEL IN FRAME f1.

I'll put this in conversion regression testing ASAP.

#171 Updated by Constantin Asofiei about 11 years ago

Passed conversion regression testing, no changes in generated sources.

I'll merge the file with latest version and commit it.

#172 Updated by Constantin Asofiei about 11 years ago

Added merged update, which was committed to bzr revision 10258.

#173 Updated by Constantin Asofiei about 11 years ago

Fixed conversion support for HANDLE attribute, when used with system handles:

def var h as handle.
h = session:handle.
h = last-event:handle.
h = file-info:handle.
h = error-status:handle.

#174 Updated by Greg Shah about 11 years ago

The code is OK. If it passes conversion testing you can check it in and distribute it.

#175 Updated by Constantin Asofiei about 11 years ago

Fixes the case when a method param has subscript:

def var chx as int extent 5.
def var h as handle.

assign h = h:buffer-field(chx[3]).

Built on top of 22a.zip. Will release them in order.

#176 Updated by Greg Shah about 11 years ago

The code is OK. It can be committed and distributed when it passes testing.

#177 Updated by Constantin Asofiei about 11 years ago

Merged with latest bzr version, contains both CA 0322a.zip and CA 0322f.zip changes. committed to bzr revision 10313.

#178 Updated by Constantin Asofiei about 11 years ago

Note 80 from #2080 shows a missing error for function invocation:

class <Not found in code>

Example:

DEFINE VARIABLE myhand AS HANDLE NO-UNDO. 
FUNCTION qwerty RETURNS INTEGER (INPUT parm1 AS INTEGER) IN myhand. 
RUN /home/vadimn/doubler PERSISTENT SET myhand. 
qwerty(4) NO-ERROR. 
MESSAGE "had error: " ERROR-STATUS:ERROR. 
MESSAGE "message: " ERROR-STATUS:GET-MESSAGE(1). 

4GL had error: no
4GL Messages:
External user defined function 'qwerty' is not defined in the procedure context given. (2779) 

#179 Updated by Ovidiu Maxiniuc about 11 years ago

This update fixes:
  • unknown parameter in function/procedure calls. The unknown values for some data types (date, datetime, datetimetz, raw) are not build using the default constructor, instead the dedicated method from BDT is used.
  • if some errors occurs when converting compatible parameters (ex: integer 32bit overflow from decimal/int64) the function/procedure call is not performed any more (without this fix some push/pop operations were desynchronized and few exceptions were thrown later).

#180 Updated by Constantin Asofiei about 11 years ago

Review of om_upd20130404a.zip - do not use /* for adding comments, always use // single-line comments. Please make these changes and reupload.

#181 Updated by Ovidiu Maxiniuc about 11 years ago

Comment type changed to single-line.

#182 Updated by Constantin Asofiei about 11 years ago

You can go ahead and commit/release om_upd20130405a.zip

LE: just runtime changes, no conversion regression testing needed.

#183 Updated by Ovidiu Maxiniuc about 11 years ago

Some more remarks about parameter passing in subroutines (procedures and functions).
I investigated how arguments can be passed to Progress subroutine using all three modes (INPUT / OUTPUT / INPUT-OUTPUT) using numeral type of data but most likely the other groups of datatype will behave the same (CHARACTER / LONGCHAR and DATE / DATETIME / DATETIME-TZ).

For OUTPUT mode, the current incomplete-coded solution is to pass a special "conversion" constructor, with an extra boolean parameter, that will store the reference variable/field for initialization when subroutine ends.
It seems like this is not the best solution because these classes will also be used in other places where the extra coding is not necessary or even unneeded.
A new class that will implement Scopable will be implemented. It will take the reference to output variable in constructor and will be registered with TransactionManager.registerScopeable(). In scopeFinished() the reference will be updated.

#184 Updated by Constantin Asofiei about 11 years ago

For OUTPUT mode, the current incomplete-coded solution is to pass a special "conversion" constructor, with an extra boolean parameter, that will store the reference variable/field for initialization when subroutine ends.
It seems like this is not the best solution because these classes will also be used in other places where the extra coding is not necessary or even unneeded.
A new class that will implement Scopable will be implemented. It will take the reference to output variable in constructor and will be registered with TransactionManager.registerScopeable(). In scopeFinished() the reference will be updated.

We should not break the signature of the function/procedure during conversion, for thiscase. How I see this working is like this:
  • if this is an output parameter, emit an i.e. new integer(var, true) call during conversion, instead of var.
  • this c'tor which takes a boolean parameter will create and register (with TransactionManager.registerScopeable) an OutputParameter instance, which will save references for the passed var and for the new instance (i.e. this)
  • as you noted, this new class will implement Scopeable and in scopeFinished the saved reference will be updated (be careful on boundary checks for integer/int64).

For now, the conversion rules should emit this special c'tor only for types which are known to cause problems (the list you mentioned). Do not emit the c'tors for all types at once, we need to do it progressively, as they are implemented (as you work on integer/int64, you need touch only these two).

#185 Updated by Constantin Asofiei about 11 years ago

See note 25 from #1884 for other thoughts:
GES

Normally, we always define parameters as the exact match to the type originally specified. I prefer to stay with that approach as it is more clear to the reader of the code, what the intent was.

I think using a base class for the hierarchy makes sense. This parameter case is messy, but it also should be rare, since it has little possible useful purpose. So if there is a bit of extra messiness in this case, I can accept that.

How about using a technique similar to the decimal.PrecisionResetter inner class? This is a case where in downstream procedure, the precision of a shared decimal variable can be temporarily forced to a new value and then reset upon leaving the nested scope. In this longchar/char param case, it seems like for OUTPUT/INPUT-OUTPUT parms we could save a reference to the original var in an inner class instance and then reset the original parm values as needed based on the scope notifications. A special wrapping constructor or static factory method would be used to prepare the parm before the call. It can setup the new instance with the proper type and the inner class instance prepped. The registration would have to be done inside the function, although we can make a facility similar to TransactionManager.registerNextExternal() to hide this away.

#186 Updated by Ovidiu Maxiniuc about 11 years ago

The problem occurs only for OUTPUT parameters only. For INPUT-OUTPUT parameters the subroutines works directly on the argument variable/field.

FUNCTION f1 RETURNS LOGICAL (INPUT-OUTPUT p AS CHARACTER): RETURN TRUE. END.
FUNCTION f2 RETURNS LOGICAL (INPUT-OUTPUT p AS LONGCHAR): RETURN NO. END.

DEFINE VAR ch1 AS CHAR INITIAL "a".
DEFINE VAR ch2 AS LONGCHAR INITIAL "b".
MESSAGE f1(ch1) f2(ch2).
/* MESSAGE f1(ch2) l-c(ch1). -- will not compile */
/* MESSAGE DYNAMIC-FUNCTION("f1", ch1) DYNAMIC-FUNCTION("f2", ch2). --will compile but fail at runtime */

Also, from the last line looks like you cannot call functions with INPUT-OUTPUT parameters at all dynamically, even if the arguments match the parameter type.

This simplifies a little the cases we need to handle.

#187 Updated by Greg Shah about 11 years ago

  • Status changed from WIP to Closed
  • Estimated time set to 300.00
  • % Done changed from 0 to 100

I'm closing this task since:

1. It is essentially complete.
2. The small set of remaining work is defined in separate tasks which are assigned and scheduled or which can be safely deferred.

#188 Updated by Greg Shah over 7 years ago

  • Target version changed from Milestone 7 to Runtime Support for Server Features

Also available in: Atom PDF