Feature #1611: implement QUOTER utility - Base Language - Golden Code Redmine

Feature #1611

implement QUOTER utility

Added by Greg Shah over 11 years ago. Updated over 7 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Costin Savin

Target version:

FWD - Runtime Support for Server Features

Start date:

10/20/2012

Due date:

% Done:

Estimated time:

12.00 h

billable:

vendor_id:

GCD

version_reported:

version_resolved:

quoter_util_test_column.zip (1021 Bytes) Costin Savin, 10/29/2012 01:53 PM

cs_upd20121103a.zip (12.6 KB) Costin Savin, 11/03/2012 08:07 AM

cs_upd20121126a.zip (9.94 KB) Costin Savin, 11/26/2012 01:24 PM

cs_upd20121213a.zip (11.9 KB) Costin Savin, 12/13/2012 01:26 PM

cs_upd20121214a.zip (11.7 KB) Costin Savin, 12/14/2012 06:16 AM

cs_upd20121217a.zip (11.5 KB) Costin Savin, 12/17/2012 12:57 PM

cs_upd20121219a.zip (11.6 KB) Costin Savin, 12/20/2012 10:03 AM

Related issues

History

#1 Updated by Greg Shah over 11 years ago

The QUOTER utility in the 4GL is a separate executable that is part of the Progress runtime software BUT it is called as a child program (e.g. via OS-COMMAND). I don't know why they didn't implement it as 4GL syntax.

We are NOT going to implement it the same way. Instead, we will write a Java class that provides the same capabilities via a static method (or multiple overloaded static methods to handle the different options). The conversion should be modified to detect when the quoter utility is being called and to change the converted result to properly call the static methods in the new Quoter class. The class should be located in com.goldencode.p2j.util.

#2 Updated by Costin Savin over 11 years ago

Status changed from New to WIP

#3 Updated by Costin Savin over 11 years ago

The only solution I came up with yet to get the quoter utility call the method from Quoter class is by checking the first command parameter (first COMMAND_TEXT child of COMMAND_TOKENS to have quoter text)

        <!-- rule to check for QUOTER OS-COMMAND call-->
        <rule>this.type==prog.kw_os_cmd and this.getChildAt(1).getChildAt(0).text.toUpperCase()=="QUOTER",
             <action>createPeerAst(java.static_method_call,
                          "Quoter.launch",
                          parentid)
            </action>
            <!--If rule isn't true apply the default behaviour -->
            <action on="false">
            createPeerAst(java.static_method_call,
                          "ProcessOps.launch",
                          parentid)
            </action>
         </rule>
...

This will call the launch method from the Quoter class with the same parameters as the ProcessOps one (the last 2 parameters: silent and wait may be removed if they are not needed ). ()Is this solution ok?

#4 Updated by Greg Shah over 11 years ago

The general idea is OK. Yes, we need to detect when any child process invocation is using QUOTER and then change the call to our Java static method.

Some problems with your specific approach:

1. You have hard coded this to OS-COMMAND. But QUOTER can also be called from other statements like UNIX, DOS, OS2, BTOS... But this coding is not needed. Just match on the COMMAND_TOKENS node. This will be the 1st, 2nd or 3rd child of the KW_OS_CMD/KW_UNIX/KW_DOS... node.

2. Your current code always looks at the 2nd child of KW_OS_CMD. I believe that this is incorrect too. As noted above the COMMAND_TOKENS node can be either the 1st, 2nd of 3rd child. BUT, the same solution to #1 will solve this problem too. By matching directly on the COMMAND_TOKENS node you don't care about its child index.

3. Your text comparison assumes that the QUOTER will never have a prefix path, but it is possible for this to be specified (e.g. /$DLC/bin/quoter). That should match too. Also, on Windows, there might be a suffix like .exe on the end too, right?

4. I assume you are proposing these changes for rules/convert/process_launch.rules. I don't want the standard processing of child processes to be the false path of the quoter stuff. This means the detection of this case must be much earlier in the processing. I think this is a good case for adding a rule-set to the rules/annotations/annotations.xml processing. This code can detect if any of the evalLib("process_launch", this) cases are actually QUOTER cases. If so, then leave behind a boolean annotation named "quoter" that is true only if we found a match during annotations. Then in rules/convert/process_launch.rules the default processing should be bypassed if this annotation is true. And the quoter processing can be in its own section below the default stuff.

5. I need more information on the SILENT case. What does the QUOTER utility output to STDOUT/STDERR? Does it output anything at all? Implementing SILENT is easy for us. I'm hoping QUOTER doesn't ever output anything to the screen and it is always essentially SILENT.

6. On NO-WAIT, please document (in both the process_launch.rules and in the Java code that implements Quoter) that NO-WAIT is not supported at this time. It is probably sufficiently rare that we don't care right now. If we find otherwise, we will put it in at that time.

#5 Updated by Costin Savin over 11 years ago

Thanks for clearing things up.

#6 Updated by Costin Savin over 11 years ago

Done several test of Quoter utility use on Progress,
It seems any errors from quoter are only output to the destination file and not in other places.
Using quoter with "-c" options seems to give some weird results and it also excludes usage with "-d" parameter(the best guess is that column range represents the position (number of character range) in which a column is contained when you have a file where info is arranged to look like columns) also in the tests I tried a weird character appears "ï»" in the start of the transformed file, not sure why this happens.

#7 Updated by Greg Shah over 11 years ago

Make sure you keep all of your testcases. Add them to your update, so they get checked in.

a weird character appears "ï»" in the start of the transformed file

What is the value of that character in hexidecimal?

#8 Updated by Costin Savin over 11 years ago

EFBB

#9 Updated by Greg Shah over 11 years ago

Questions:

1. What is the encoding (e.g. codepage) of the input file?
2. What is the encoding of the output file?
3. Does this byte sequence always appear in the output?
4. Is this byte sequence always the first bytes in the file?
5. Does this happen in the case where you run QUOTER manually from the Windows command prompt?
6. Does this happen in the case where you run QUOTER via OS-COMMAND?

#10 Updated by Costin Savin over 11 years ago

File quoter_util_test_column.zip added

1-2.Attached the progress test , input file and result from os-command call and from windows console for quoter with -c option
3. For the tests I done the character are not always the same other weird character can appear too depending on input and column length ("»" ->hex BB seemed to appear in all tests I performed).
4.The wrong characters appeared in the tests as the first entity at the begining of the file followed sometimes by correct text or partially correct text. (not all the lines seem to be processed-in this example only 2 out of 3 lines are output).
5-6. The same result happens when calling via OS-COMMAND and manually from command prompt

#11 Updated by Costin Savin over 11 years ago

added

      <rule>parent.type == prog.command_tokens and type == prog.command_text and this.text.toUpperCase().contains("QUOTER")
          <action>def = copy.parent.parent</action>
          <action>def.putAnnotation("quoter", true)</action> 
      </rule>

inside rules/annotation/input_output.rules

and

         <rule>getNoteBoolean("quoter") 
            <action>createPeerAst(java.static_method_call,
                          "Quoter.launch",
                          parentid)
            </action>
            <!--If quoter isn't true apply the default behaviour -->
            <!-- the method call that creates the child process -->
            <action on="false">
            createPeerAst(java.static_method_call,
                          "ProcessOps.launch",
                          parentid)
            </action>
         </rule>
...

inside rules/convert/process_launch.rules

This should solve the conversion problem except this.text.toUpperCase().contains("QUOTER") may cause false quoter calls and will have to be improved

#12 Updated by Greg Shah over 11 years ago

Yes, this is much closer. But there are still 2 problems:

1. getNoteBoolean() can and would return null, you should instead code something like:

 isNote("quoter") and getNoteBoolean("quoter")

2. Only the 1st COMMAND_TEXT child (childIndex == 0) should be checked to see if it contains "QUOTER". That will reduce the number of possible invalid matches. The other thing that can be done is to ensure that QUOTER is not part of a larger base name (e.g. NOTQUOTER is bad but quoter.exe is good).

#13 Updated by Greg Shah over 11 years ago

Target version set to Milestone 7

#14 Updated by Costin Savin over 11 years ago

The weird characters that appeared in - c tests seem to be caused by test data being altered at ftp transfer(though it was not visible in editors).
After creating another data sample I got results without weird characters though there are still some issues:
- If there is no empty new line at the end of the input file the last row won't be processed(and outputed)
- In some case - - - character appear between lines (not sure yet why).
- The column data doesn't have any spaces removed though in the specification of Progres it does have a vague and interpretable description about - c option producing a data file without trailing blanks
- Column ranges: starting value is 1 ,starting column is inclusive and end column is also inclusive

#15 Updated by Greg Shah over 11 years ago

Good work. Finish off the testing to ensure that you have a complete specification. It is common for the Progress documentation to be wrong, incomplete and vague. Sometimes it is all 3 at once! <g>

Anyway, get the full set of specs and then match it exactly.

Make sure you have a set of testcases that can be used to check the resulting P2J. For example, you can make a 4GL file that calls OS-COMMAND quoter... multiple times with various input files. Capture the resulting output for each case. All of that should be added to your update (in the testcases/uast/ directory, you can create a quoter/ subdirectory and put everything there). The idea is that you will then run the same program in P2J and do a binary comparison of the P2J output files with the captured 4GL output files. As long as your testcases exercise all the different parts of the specification (including all the weird stuff), then you should have confidence that it is a compatible implementation.

Document all the results here.

#16 Updated by Costin Savin over 11 years ago

It seems that Progress replaces null resulting values with -, That and the fact that new line character(enter) is different from linux to Windows explains the previous - - - occurrence. A null value will result when a column starting point exceeds the size of the line, if starting point doesn't exceed the line length but the end point exceeds it the column value will be taken from the starting point to the end of the line.

#17 Updated by Costin Savin over 11 years ago

File cs_upd20121103a.zip added

Added proposed update for quoter utility.

#18 Updated by Greg Shah over 11 years ago

Is this ready for review? That would mean that you have no known issues and the update fully meets the requirements of the task,

If ready for code review, put the task into "Review" status.

#19 Updated by Costin Savin over 11 years ago

Status changed from WIP to Review

#20 Updated by Greg Shah over 11 years ago

Feedback:

1. The history entry in rules/annotations/input_output.rules needs to be updated from 009 to 010.

2. There is the file src/com/goldencode/p2j/util/Quoter.java~ which should not be in the update zip.

3. In rules/convert/process_launch.rules, please remove the T column in the history section.

4. When you run the quoter utility using OS-COMMAND SILENT, is there any difference in behavior?

5. Can you run the quoter utility using OS-COMMAND NO-WAIT?

Quoter.java issues:

6. Remove the T column.

7. Please don't import by class. Use * to import everything in a package.

8. Please put a blank line between the imports and the class javadoc.

9. The following method signature needs better formatting. Change this:

   private static void
         writeLine(String[] tokens, BufferedWriter writer) throws IOException
   {

into this:

   private static void writeLine(String[]       tokens,
                                 BufferedWriter writer)
   throws IOException
   {

10. Every member (data and methods) of all classes (including inner classes) MUST have proper javadoc. writeLine(), writeColumns() and all members of ColumnRange are missing javadoc.

11. Make ColumnRange a static inner class instead of a peer class that is in the same file. If it is useful enough to make it a peer class, we do this in a separate file. But if the class is only for internal use, then make it an inner class. By making it static, instances of ColumnRange won't store a reference to "this".

12. Add a private constructor for Quoter. This will ensure that no one can make instances of the class. Everything is static so they aren't necessary.

13. The ColumnRange class javadoc should be properly capitalized and punctuated.

14. The throws clause should be on the next line in writeColumns().

15. The implementation of writeColumns() is very duplicative. Instead of this:

         if (first)
         {
            if (columnText != null)
            {
               writer.append('"');
               writer.write(columnText);
               writer.append('"');
            } else
            {
               /* when column text is null Progres puts a "-" mark */
               writer.write("-");
            }
            first = false;
         } else
         {
            writer.write(" ");
            if (columnText != null)
            {
               writer.append('"');
               writer.write(columnText);
               writer.append('"');
            } else
            {
               /* when column text is null Progres puts a "-" mark*/
               writer.write("-");
            }
         }

it should be this:

         if (first)
         {
            first = false;
         }
         else
         {
            writer.write(" ");
         }

         if (columnText != null)
         {
            writer.append('"');
            writer.write(columnText);
            writer.append('"');
         }
         else
         {
            /* when column text is null Progres puts a "-" mark */
            writer.write("-");
         }

16. Please note that this:

         } else
         {

is incorrect. The coding standards require this instead:

         }
         else
         {

17. Please put empty lines in between the end of each method and the beginning of the javadoc for the next method. You don't need an extra empty line after the opening { for the class or before the closing } of the class.

18. writeLine() is duplicative. Change this:

         if (i == 0)
         {
            writer.append('"');
            writer.write(tokens[i]);
            writer.append('"');
         } else if (i < tokens.length)
         {
            writer.write(" ");
            writer.append('"');
            writer.write(tokens[i]);
            writer.append('"');
         }

to this:

         if (i > 0 && i < tokens.length)
         {
            writer.write(" ");
         }

         if (i >= 0 && i < tokens.length)
         {            
            writer.append('"');
            writer.write(tokens[i]);
            writer.append('"');
         }

19. Please add detailed specifications (not the copied 4GL documentation but rather, the specifications that you have researched by testing) to the class javadoc. The launch() method should also have full details about the syntax that it accepts. You don't have to duplicate that syntax in the class javadoc (just explain that the primary entry point is the launch() method and refer to it from there). Include example command lines, including common variations.

20. The catch blocks are improperly formatted. This:

   } catch (IOException e1)

should be this:

   }

   catch (IOException e1)

21. The main quoter() method is much too large. Please try to refactor it into separate pieces. For example, the column range parsing seems like it could be a separate method. This will make the code much easier to read/understand.

22. Please document how error handling works. And please analyze to detect if there are any differences between our error handling and how the 4GL does it.

I will let you make these changes and I will look at the quoter() logic more carefully then. The documentation will also help me evaluate your work better.

#21 Updated by Costin Savin over 11 years ago

File cs_upd20121126a.zip added

Added proposed update, commityed testcases to bazaar, run regression tests (to see if they don't fail because of the rule change - and everything seems ok).

#22 Updated by Greg Shah over 11 years ago

Status changed from Review to WIP

This is much better.

Feedback:

1. What are the results of your testing with these options? I see that you have excluded them from process_launch.rules, but there is no discussion here or in the JavaDoc about whether these things can be used in Progress.

4. When you run the quoter utility using OS-COMMAND SILENT, is there any difference in behavior?

5. Can you run the quoter utility using OS-COMMAND NO-WAIT?

In Quoter.java:

2. Please format this code:

 ** -#- -I- --Date-- --JPRM-- ----------------Description-----------------
 ** 001 CS  20121102          Created initial version.

 */
package com.goldencode.p2j.util;

like this:

 ** -#- -I- --Date-- --------------------Description---------------------
 ** 001 CS  20121102 Created initial version.
 */

package com.goldencode.p2j.util;

3. launch() should be named something else. We are no longer launching a child process so this is misleading. Let's go with process().

4. There is no reason to ever pass a 1st parameter of "quoter" to the launch()/process() method, right? Please drop it during conversion.

5. There is a flaw in the conversion and in the launch()/process() method: these only support OS-COMMAND/UNIX and other shell launching methods. For example, you always parse out the destination file using the ">" as a delimiter. But it seems to me that you could easily call quoter using INPUT THROUGH or maybe even INPUT-OUTPUT THROUGH and OUTPUT THROUGH. Please add testcases for all 3 of these possibilities and see if it is possible. Such cases will likely be reading from STDIN and/or writing to STDOUT instead of to a named file or files.

6. Why are you formatting things like this:

            quoterWorker(readFile, 
                         destFile,
                         columnRanges,
                         separator);

when you can easily format it like this:

            quoterWorker(readFile, destFile, columnRanges, separator);

Putting it all on 1 line does not go over 78 characters and it compresses the file, which is good.

7. As noted in the previous feedback #17:

17. Please put empty lines in between the end of each method and the beginning of the javadoc for the next method. You don't need an extra empty line after the opening { for the class or before the closing } of the class.

These are still a problem.

Between methods there needs to be a blank line.

      /**
       * This represents the starting point from where a column will  be
       * subtracted from a  String  line.
       /
      private int startColumn;
      /*
       * This represents the end point to where a column will be subtracted
       * from a  String  line.
       */
      private int endColumn;

The extra blank line at the end is not needed:

      public int getEndColumn()
      {
         return endColumn;
      }
   }

}

8. JavaDoc formatting still needs some work. Things like this:

      /**
       * Constructor with 2 parameters. Which will set the start and end index
       * defining a column. 
       * @param   startColumn
       *          The index point from where the column starts(position 
       *          starts from 1).

should be this:

      /**
       * Constructor with 2 parameters. Which will set the start and end index
       * defining a column. 
       *
       * @param   startColumn
       *          The index point from where the column starts(position 
       *          starts from 1).

9. The closeResourcesPreExit() has some issues:

- It does not need to be called from multiple places. You should close the resources in the same method that opens the resources.

- It should not be using exceptions for normal flow of control.

- As a general rule, if you are just going to return no matter what happens, then you can just ignore the exceptions anyway. Closing a stream is an example of a good place to ignore IOExceptions. They simply don't matter at that point since there is nothing you can do to "solve" a problem with them AND you are trying to close the resource anyway, so you have done the best you can do and ignoring the IOE is OK.

See if you can centralize the closing of the resources in launch()/process(). Let any real exceptions flow back up to that level. Don't use exceptions where you can just return. If you really need to inform the caller about success/failure, then return a boolean flag.

10. Does Progress do the same thing as quoteColumns() where it reads the entire input file (but outputs nothing) if columnRanges.size() == 0? Normally, we would just never read the input file at all to save time BUT if Progress actually reads it, it would be important for us to do the same. For example, if we don't it might cause us to process OUTPUT THROUGH and INPUT-OUTPUT THROUGH differently.

11. The refactored code makes it MUCH easier to understand what is happening. But looking more carefully at the refactored code, it seems like we could simply condense it down to this:

   private static void quoterWorker(BufferedReader     readFile,
                                    BufferedWriter     destFile,
                                    List<ColumnRange>  columnRanges,
                                    String             delimiter)
   throws IOException,
          Exception
   {
      boolean delMode = (delimiter != null);
      boolean colMode = (!delMode && columnRanges != null);
      int     colSize = colMode ? columnRanges.size() : 1;
      boolean simMode = (!delMode && !colMode);
      String  line    = null;
      boolean first   = true;

      while ((line = readFile.readLine()) != null)
      {
         if (delMode || (colMode && colSize > 0) || simMode)
         {
            if (first)
            {
               first = false;
            }
            else
            {
               destFile.newLine();
            }

            if (delMode)
            {
               String[] tokens = line.split(separator);
               writeDelimitedLine(tokens, destFile);         
            }
            else if (colMode)
            {
               writeColumnsLine(line, columnRanges, destFile);
            }
            else
            {
               destFile.write('"' + line + '"');
            }
         }
      }
   }

Note that the protection logic for readFile != null should be in the caller. There is no reason to call quoterWorker() at all if there is no input to read. Likewise, I assume that the input and output streams will be closed in the caller.

I am not 100% sure that the logic is the same, but it is very close and I think it can be made the same. This takes the common code from the 3 different versions and reuses it. I think the overall version is still easy to understand.

11. I think there still needs to be full details in JavaDoc of the syntax that is supported AND how the 3 different possible modes work:

19. Please add detailed specifications (not the copied 4GL documentation but rather, the specifications that you have researched by testing) to the class javadoc. The launch() method should also have full details about the syntax that it accepts. You don't have to duplicate that syntax in the class javadoc (just explain that the primary entry point is the launch() method and refer to it from there). Include example command lines, including common variations.

Put that javadoc in the launch()/process() method so that callers will immediately know what that method really does. Just saying it parses and implements the quoter utility is not sufficient.

#23 Updated by Costin Savin over 11 years ago

Made the changes according to feedback, there's some issue though:

4. There is no reason to ever pass a 1st parameter of "quoter" to the launch()/process() method, right? Please drop it during conversion.

The problem with this is that the first parameter "quoter" is part of the COMAND_TOKENS -> COMMAND_TEXT (in normal case it represents the command name of the utility) together with all the other parameters, I'm not sure yet how it can be removed without affecting other cases I'll have to think some more about how I can do this.

#24 Updated by Costin Savin over 11 years ago

5. There is a flaw in the conversion and in the launch()/process() method: these only support OS-COMMAND/UNIX and other shell launching methods. For example, you always parse out the destination file using the ">" as a delimiter. But it seems to me that you could easily call quoter using INPUT THROUGH or maybe even INPUT-OUTPUT THROUGH and OUTPUT THROUGH. Please add testcases for all 3 of these possibilities and see if it is possible. Such cases will likely be reading from STDIN and/or writing to STDOUT instead of to a named file or files.

Tested quoter with all 3 options (they work - extra rules changes are necessary to cover them).
Can you give an example of INPUT-OUTPUT TROUGH quoter usage where reading from STDIN and/or writing to STDOUT?

#25 Updated by Greg Shah over 11 years ago

Can you give an example of INPUT-OUTPUT TROUGH quoter usage where reading from STDIN and/or writing to STDOUT?

The idea of this approach is to create the INPUT-OUTPUT stream, then write some content to it that you want to be quoted and then read the quoted content back out. This is a kind of "co-processing", like shell pipelines except that the writing and the reading are both done by the 4GL program.

Something like this (I haven't tested this, it is just a skeleton):

def var i as int.
def var data-list as character extent 6.
def var num as character format "x(2)" label "Number" extent 6.
def var text as character format "x(17)" label "Text" extent 6.
def var my-code as character format "x(3)" label "Code" extent 6.

data-list[1] = "72 Wind Chill Hockey BBB".
data-list[2] = "91 Low Key Checkers  DKP".
data-list[3] = "92 Bing's Ping Pong  SLS".
data-list[4] = "93 Lana del Ray Soup KEK".
data-list[5] = "94 Rice and Beans    STK".
data-list[6] = "23 Barb's cruel      NOT".

define stream s.
input-output stream s through quoter -C 1-2,4-20,22-26.

/* write the raw output to quoter */
do i = 1 to 6:
   put stream s data-list[i].
end.

/* inform quoter that there is no more data */
input stream s close.

/* read the quoted data back */
do i = 1 to 6:
   set stream s num[i] text[i] my-code[i].
end.

/* close our output */
output stream s close.

/* display the results on the screen */
do i = 1 to 6:
   display num[i] text[i] my-code[i] with frame rpt down 6.
   down frame rpt.
end.

#26 Updated by Constantin Asofiei over 11 years ago

Greg, some thoughts about the QUOTER utility: as this is an external program invoked via the OS-COMMAND and other process-launching commands (like INPUT ... THROUGH), the conversion rules can't determine for sure that process-launching command invokes the QUOTER utility; if we have something like:

def var cmd as char.
cmd = "quoter". 
qparams = "". /* set the parameters */
os-command value(cmd +" " + qparams).

then 4GL will execute it properly. More, I think it is possible to pass the output from another executable to the quoter utility, as in:

cmd = "ls -al | quoter".

and viceversa:

cmd = "quoter ... | someotherprogram".

As it is an external program, and all external programs are executed on the client side, then I think the solution is to build a separate utility which we will be invoked on the client side. I say this because, as 4GL has allowed the QUOTER utility to be invoked as a separate process, then we can't assume it will be invoked always in its simplest form; to be able to handle all the I/O redirection properly (as allowed by the command passed to any process invocation statement), we need a separate utility, as 4GL has.

#27 Updated by Costin Savin over 11 years ago

Tested INPUT THROUGH and OUTPUT THROUGH stream case and they are viable.
The INPUT-OUTPUT THROUGH stream case doesn't work (it gets stuck writing the output to quoter).
This means also the implementation will have to be changed to work with streams and the parameter parse will have to be changed also.

#28 Updated by Greg Shah over 11 years ago

Constantin: you make very good points here. Yes, these other cases are problems. And the client-side invocation is also a problem.

This suggests the following approach:

1. Add testcases to show these additional scenarios.

2. I don't want to implement a native utility (e.g. in C). Create a main() for Quoter.java and implement the same interface as in the 4GL. That main() will never see anything like > or | because those will be handled by the shell and the stdin/stdout will already be set properly. I assume that the interface is that it will either read from the last parameter and write to stdout OR it will read from stdin and write to stdout. Obviously, it will still see its options if present.

3. The key approach here is to transform the "quoter" text to the proper replacement text like "java com.goldencode.p2j.util.Quoter". It might be possible to do this during conversion, but the os-command value(expression) case is tricky. We could hack it with hints, but this would be a never-ending process where we would constantly need to add new kinds of hints to identify the nodes where we need to do the transformation. But upon additional thought, it seems we can solve this problem at runtime. The ProcessDaemon.java will receive the final text of the expression and can do the transformation based on some matching rules. We will have to see if the classpath (or any other option) needs to be set in the standard replacement text, but we certainly know that the p2j.jar and java are both available on that client where the ProcessDaemon is running, so it should be feasible.

4. Quoter.java will have to handle any issues with context-locals and other state/session configuration since this will be a kind of "disconnected" JVM where the BaseDataTypes will be used without access to the server, to the directory or any other configuration.

Costin: this new approach can be implemented completely inside ProcessDaemon for both the OS-COMMAND shell cases and the INPUT-OUTPUT/OUTPUT THROUGH redirected stream cases. There will be no conversion changes in this at all.

#29 Updated by Constantin Asofiei over 11 years ago

I don't think it is that easy to replace "quoter" references from the process-launching command with the "java ..." command to launch our utility (consider that the process-launching command is OS-specific and we will need to handle at least linux and windows style commands, with all their complexities - we can't just do a replaceAll("quoter", "java ...")). I think it will be easier to build an OS-specific script named "quoter" which will be added to the OS path and will be responsible of invoking our utility. This way we can leave all the OS-specific stuff outside P2J and also we can leave the process-launching command intact.

#30 Updated by Costin Savin over 11 years ago

Related to this
Quoter syntax is:
quoter [options] [input_file] >[output_file]

In the quoter implementation I've treated the >[output_file] part as argument to the quoter, but in the case we will use it as separate utility it seems to be os command related pointing where the output stream( System.out equivalent ) of quoter should go, should the quoter entry point only expect [options] in case of stream and get data from System.in or [options] [input_file] and get data from file and output to System.out?

#31 Updated by Greg Shah over 11 years ago

Constantin, you raise some good points.

Here are the differences as I know them:

- On Windows, the executable name can include the .EXE or it can optionally leave it out.
- The basic filesystem differences between Linux/Windows need to be handled (case-sensitivity, path separator, file separator, absolute paths with/without drive letters).
- On all platforms, we have to handle having an absolute path, a relative path and no path.
- Handle having spaces in the path/filename. Actually, this one may not be an issue if our 4GL parser/conversion handles tokenizing things right. It seems like this might automatically be handled by the parameters being turned into an array of strings.
- Parsing the executable name later in the pipeline (not just as the first parameter). This isn't hard because we just have to process all of the params array for the launch, not just the first element.

I know it is more work on the Java side to parse things properly. But the result is so much cleaner for deployment, that the result is worth it. Please go forward with the search/replace processing.

Costin: please make sure that your testcases show all of the above cases so that we can get this right.

#32 Updated by Greg Shah over 11 years ago

should the quoter entry point only expect [options] in case of stream and get data from System.in or [options] [input_file] and get data from file and output to System.out

Yes, Quoter will ALWAYS output to STDOUT.

The only difference is IF [input_file] is specified, then you read from that file instead of STDIN.

#33 Updated by Costin Savin over 11 years ago

Compared p2j implementation results to progress,however there seems to be one problem :
When we use it in p2j the console it's cleared of previous displayed data, while in Progress the displayed data remains on screen.

#34 Updated by Costin Savin over 11 years ago

Would launching the Quoter as a separate process be a solution for this?

#35 Updated by Greg Shah over 11 years ago

I don't understand. My recent suggestion to modify ProcessDaemon to transform "quoter" into "java ... com.goldencode.p2j.util.Quoter" is all about launching a child process. And I am sure that this is how Progress 4GL does it too.

#36 Updated by Constantin Asofiei over 11 years ago

Greg, what Costin saw I think is a difference between how 4GL behaves on Windows and Linux environments. Assuming we have this script (test.cmd on win and test.sh on linux):

echo "something"

which is launched by this program on windows:

os-command test.cmd.

and this program on linux:

os-command test.sh.

the behavior is different.
On windows, unless no-wait or silent is set for OS-COMMAND, the process launching will start the test.cmd in a new cmd.exe shell and will wait for the user to explicitly type "exit" on the command line to close it. Testing on P2J under linux, the behavior is completely different - it waits for a key press even if NO-WAIT is set for OS-COMMAND (this does not happen under windows, when NO-WAIT or SILENT is set). More, under linux, when SILENT is set and NO-WAIT is not set, output is shown on the terminal, while on windows is not.

My concern is that when we fix this for windows we will alter the P2J's linux behavior, which is currently implemented. More, when we start regression-testing windows-based 4GL applications, I think we should do it in a windows environment.

Finally, we should test all 4GL's process-launching commands (unix, os-command, input/output/input-output through) on windev01 and check how they behave on windows.

#37 Updated by Greg Shah over 11 years ago

In regards to note 36 (https://projsrv01/redmine/issues/1611#note-36):

Eugenie: please read that note carefully. You explore/document the exact features of each of the above mentioned cases.

Constantin: really good findings. Our approach will be to implement the native support for Windows process launching in libp2j, which will compile differently for Linux and Windows. To the extent that the Java code must also be different, we will implement platform specific logic there too. Your point is well taken: we are not going to change our implementation for Linux. We are just going to make the Windows implementation work properly.

In regards to the regression testing, we have recommended that the customer attempt to remove as many external shell programs, API/library calls (and other native dependencies) as possible. We will see if they can achieve that objective.

#38 Updated by Costin Savin over 11 years ago

Are there more possible extensions quoter can have besides .exe, on unix maybe? Using Quoter with a path containing spaces doesn't work in either p2j or Progress (even with value of variable containing the path).

#39 Updated by Eugenie Lyzenko over 11 years ago

Greg: yes, the calls input through, output through and input-output through has the problem in Windows running after conversion. The samples converted without problems and even started and looks the similar or same(input-output through) but internally there are the problem in these converted code related to the missing launchers for native Windows external process execution. And I would have been note this additionally. And we need to have the native process executor to fix this issue. So I change the status of these calls to only conversion level.

#40 Updated by Greg Shah over 11 years ago

Are there more possible extensions quoter can have besides .exe, on unix maybe?

On UNIX/Linux, quoter would have no extension, not even the ".exe". The .exe (or .Exe or .eXe or .EXE...) are only used on Windows.

Using Quoter with a path containing spaces doesn't work in either p2j or Progress (even with value of variable containing the path).

I think it should work if the text is contained in a string literal, like this:

OS-COMMAND "c:\my stupid space \laden path\quoter.exe" -C 45-50 whatever > output.txt

#41 Updated by Costin Savin over 11 years ago

OS-COMMAND "c:\my stupid space \laden path\quoter.exe" -C 45-50 whatever > output.txt
this kind of call might be problematic since the conversion result would be:

ProcessOps.launch(new String[]
            {
               "c:my stupid space laden pathquoter.exe",
               "-C",
               "2-3,4-20,22-26",
               "../../uast/quoter/quoter_util_column.d",
               ">../../uast/quoter/quoter_util_test_column_incorrect_range.q" 
            }, true, true);

which by the logic would not match the quoter.

Also for input from stream there is the problem that it can try to read from System.io when the stream hasn't put anything there so just checking the readLine for null like in the case of existing source file doesn't work. Additionally we don't know exactly how long to wait for System.in to give something.
A solution for this would be to use a loop like

 while ( ((line = readFile.readLine()) != null) || 
                 (isSysIn && (System.in.available() > 0)))
 {
    if (line!=null)
    {
        ...
    }
 }

where isSysIn is a boolean which is true if we don't have a file input (System.io used as input) or false otherwise.
This seems to work but I have some doubts it is failsafe.

#42 Updated by Greg Shah over 11 years ago

"c:my stupid space laden pathquoter.exe"

What do you mean here? This example is for Windows. If we are converting for Windows, p2j should properly deal with the \ in strings. Of course, your p2j.cfg.xml settings must be correct. If your settings are correct and this still happens, then there is a bug I think.

My point is that I think this is possible in the 4GL. If so, then we MUST support it.

Also for input from stream there is the problem that it can try to read from System.io when the stream hasn't put anything there so just checking the readLine for null like in the case of existing source file doesn't work.

I don't understand. Our reading from the stream should block until there is something to read or we read EOF. Are you using non-blocking reads?

#43 Updated by Costin Savin over 11 years ago

1. You are right about the conversion settings I had them for Unix, will add the logic to handle this case.

2. Normally EoF of the stream is detected by read() returning -1 but thinking we can't have a read() and also have the line at the same time (without losing line integrity) I went using available() to avoid it but it can be wrong.
This seems the solution for the read logic:

while ((line = readFile.readLine()) != null || readFile.read() >= 0)
{
   ...
}

#44 Updated by Greg Shah over 11 years ago

How about using one of the com.goldencode.p2j.util.Stream sub-classes to read the input? We have already written this logic many times. I don't want to duplicate it in Quoter.

#45 Updated by Costin Savin over 11 years ago

Which Stream can I use?
Looking at the com.goldencode.p2j.util.Stream implementations:
FileStream constructor expects a filename and for System.in we have InputStream to provide.
ProcessStream expects the process id (this might may not be obtainable in the same way for linux and windows)
RemoteStream constructor expects the stream id on the remote node.
StreamWrapper just a wrapper against a stream.
TerminalStream terminal related , cannot actually read line
NullStream mock class

#46 Updated by Greg Shah over 11 years ago

Looking closer at the FileStream sub-class (which is the closest match) I don't think there is an easy way to turn a System.in (InputStream) into the channel we need in FileStream. The other stream sub-classes don't work for this case either.

Let's back up. What is the class of readFile?

Show the reading logic here please.

#47 Updated by Costin Savin over 11 years ago

The readFile is a BufferedReader obtained in this way:

// infile is the input parameter, if it's null we input from System.in      
if (inFile == null)
{
   inputReader = new BufferedReader(new InputStreamReader(System.in));
}
else
{
  try
  {
     try
         {
            inputReader = new BufferedReader(new FileReader(inFile));
         }
         catch (FileNotFoundException e)
         { 
         ...
         }
  }
...

// inputReader is called as readFile name in quoterWorker, name should be changed there to inputReader

quoterWorker(inputReader, outputWriter, columnRanges, separator)

...
private static boolean quoterWorker(BufferedReader     readFile,
                                    BufferedWriter     destFile,
                                    List<ColumnRange>  columnRanges,
                                    String             delimiter)
{
   ...
   while ((line = readFile.readLine()) != null || readFile.read() >= 0)
   {
      ...
   }
   ...

#48 Updated by Greg Shah over 11 years ago

while ((line = readFile.readLine()) != null || readFile.read() >= 0)

This doesn't look right. The readFile.read() >= 0 will read a character from the stream and discard it. The read position will move to the next character in the stream and that character will be lost forever.

All the low level read() implementations are blocking. I would expect readLine() to also be blocking. It is only supposed to return null at end of file (EOF). Is there a problem with just doing the following:

while ((line = readFile.readLine()) != null)

#49 Updated by Costin Savin over 11 years ago

Actually this is the problem
readFile.readLine()) can return null if the stream hasn't put anything on yet and in this case readFile.read() will not affect the line read it's just to check if the stream is closed (-1).

if readFile.readLine()!= null is true readFile.read() will not be evaluated so line is not affected.

inside the while line is also checked for null before doing anything

#50 Updated by Greg Shah over 11 years ago

I don't understand how the readFile.read() can help in this case. If readLine() returns null but you are not at EOF, then read() will in fact read the first character from the file and then it will be discarded. You will always lose a character in the exact case it is meant to help.

Something seems wrong here in general. Please look at the J2SE code for BufferedReader to see if readLine() is somehow non-blocking. If it is non-blocking (which I would expect), then examine the code to determine how it is possible to ever return null without EOF.

#51 Updated by Costin Savin over 11 years ago

The case I'm refferring to is when readFile is a BufferedReader pointing to System.in, in case readFile points to a file
readFile.readline() being null marks the end of the file and readFile.read() will also return -1.

However I do realize now () that there might be a problem where System.in is used for readFile, readFile.readLine() is null and before we check readFile.read() the stream puts something in that character might be lost.

I'm thinking "might" because for the cases I tested the output was ok but not sure if it was because readFile.read() reads the newLine character or there isn't something available from the stream.

For the previous approach:

 while ( ((line = readFile.readLine()) != null) || 
                 (isSysIn && (System.in.available() > 0)))
 {
    if (line!=null)
    {
        ...
    }
 }

the available() method will return an estimate of how many bites are left for read().
The word estimate makes me think of unpredictable result.

#52 Updated by Greg Shah over 11 years ago

available() just tells you if you are going to block or not on a read. If it returns > 0 then you will not block if you try to read that number or less of characters. If it returns 0 you may block, unless more characters appear before you try to read.

It can't be used for the case you have because even if it returns 0, that doesn't mean the EOF has been reached. With streams, you don't know when characters will be available because they might be generated on the fly. That is why it is so important to block on the readLine() until EOF is reached.

Please investigate the null behavior of readLine() and report back. The thing that doesn't make sense here is that if the steam is open, then you should always block on a read/readLine until you get the data OR until you get to EOF.

If the research it is going to take long, then just rework this code to loop by using read() to read the stream character by character. You can then match on newline directly (accept \n, \r and \r\n as valid newlines) and all other processing would be the same.

#53 Updated by Costin Savin over 11 years ago

Tested for
to see the values from readFile.read() when the logic gets to them

 while ((line = readFile.readLine()) != null || readFile.read() >= 0)
   {
      ...
   }

and the value is 0 which means null character
in some atempts the logic never gets to evaluate it.

Done re-testing with using just readLine:

 while ((line = readFile.readLine()) != null)
{ 
   ...
}

I didn't manage reproduce it anymore with the same testcase , though it did happen before to fail when checking just readLine().
In conclusion it seems to be safer to use
readFile.read() then add the read characters unless they are not null to the constructed line.
A problem would be the new line recognition which in Unix is LF and windows is CR + LF.

#54 Updated by Costin Savin over 11 years ago

Is it safe to consider that if the code is converted with a certain OS specific it will also only be run in that OS?
I'm asking because of the quoter path check when dealing with a windows path but run from Unix can yield different results then when run from Windows.

#55 Updated by Greg Shah over 11 years ago

and the value is 0 which means null character

The character 0 is a valid character. Yes, it is also called the ASCII NUL character. But this has nothing to do with the null return value of readLine(). That null value is a String object whose reference has been set to the "null" literal.

Normally, I would expect that if readLine() returned null, that read() would return -1, which is an invalid character (when dealing with a return value whole size is at least 16 bits). For this reason, -1 is used to identify a condition in which no valid character can be returned (like EOF). 0 is a valid character...

I didn't manage reproduce it anymore with the same testcase , though it did happen before to fail when checking just readLine().
In conclusion it seems to be safer to use
readFile.read() then add the read characters unless they are not null to the constructed line.
A problem would be the new line recognition which in Unix is LF and windows is CR + LF.

Please use the readLine() approach as I have suggested. If you find a recreate we will look at it further.

By using readLine() there should be no newline matching issue.

Is it safe to consider that if the code is converted with a certain OS specific it will also only be run in that OS?
I'm asking because of the quoter path check when dealing with a windows path but run from Unix can yield different results then when run from Windows.

Good question. Normally, the answer is that commands are platform specific. However, even in the normal case there are exceptions:

- Many qualified filenames for UNIX and Linux are interchangeable (they work on either platform).
- A simple reference to the basename (and no .EXE extension) will work on both Windows and Linux/UNIX so long as the program is in the current directory or can be found in the path.

In this quoter case, we have a new kind of exception: our replacement code is Java based, so it will run on any platform. So long as our matching code will can parse out the various cases (regardless of the platform on which it is running), then we should always be able to remove it and replace it with our Java launch.

For quoter, the search/replace should handle all possible Linux/UNIX and Windows cases and do the replacement no matter what it finds.

#56 Updated by Costin Savin over 11 years ago

Coming back to the case of converting:

OS-COMMAND "c:\my stupid space \laden path\quoter.exe" -C 45-50 whatever > output.txt

Only now I managed to run the conversion for this case on a Windows system machine the p2j.cfg.xml set for windows while testing path validation on windows.
Unfortunately the conversion result is the same as the one from unix "\" is eliminated from the output.

ProcessOps.launch(new String[]
            {
               "c:my stupid space laden pathquoter.exe",
               "-C",
               "2-3,4-20,22-26",
               "../../uast/quoter/quoter_util_column.d",
               ">../../uast/quoter/quoter_util_test_column_incorrect_range.q" 
            }, true, true);

The conversion also has the same result when using non quoted path: OS-COMMAND c:\dir_a\dir_b\quoter.exe -C 45-50 whatever > output.txt

#57 Updated by Greg Shah over 11 years ago

Yes, Eugenie found this as well. He is working on it.

In the meantime, you should do 2 things:

1. Make sure this case works on Windows. It was my understanding that it does. But I would like to hear your confirmation.

2. Manually modify the converted Java code to add back the missing backslashes (actually, it will need \\ double backslashes for every one in the original 4GL string). Then you can test your implementation to confirm that it handles these cases.

#58 Updated by Costin Savin over 11 years ago

1.Already tested by copying the quoter executable in a path containing spaces and it works.

2. Working on it, my question is: when the conversion it's fixed it will transform "\"
to "\\" always, right?

#59 Updated by Greg Shah over 11 years ago

2. Working on it, my question is: when the conversion it's fixed it will transform "\" 
to "\\" always, right?

Yes.

#60 Updated by Costin Savin over 11 years ago

Should also windows network paths be detected as paths for quoter?
I'm having a little trouble with the regular expression for windows paths.

#61 Updated by Eugenie Lyzenko over 11 years ago

Should also windows network paths be detected as paths for quoter?
I'm having a little trouble with the regular expression for windows paths.

FYI: The fix I'm working on for path string on Windows will be able to handle the network pathes like this: \\WINSRV\Directory\On\Server

#62 Updated by Greg Shah over 11 years ago

Costin: yes, we should handle paths on network shares.

Are we making this too difficult? The basic matching seems to me to be this:

Match "quoter" on a case-insensitive basis, which should be at the end of a string, except if it has .exe, which should also be allowed.

The quoter text must be preceded by a valid file separator character (/ or \) with no other text in between.

All pathing text before that is fine.

Is there ever a case where there is other text in the string, before the \quoter or /quoter? Because it seems that the parsing process should already have properly split the command into its own string. Is that not the case?

#63 Updated by Costin Savin over 11 years ago

I've tried to cover that, here is an example for the latest pattern I've gotten today:

String quoterPath = "C:\\Program Files\\";
String regex = "([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+)\\((?:[^\\/:*?"+'"'+"<>|\r\n]+\\)*)";
//(where ([a-z]:|\\\\[a-z0-9_.$]+\\[a-z0-9_.$]+) is the drive letter or network node)
// and ((?:[^\\/:*?"+'"'+"<>|\r\n]+\\) matches the folder path made from a non-capturing group of
//characters that are not \ / " < > | C.R. or newline. The sequence can repeat 0 or many times.
Pattern win = Pattern.compile(regex,Pattern.CASE_INSENSITIVE);
boolean isWinPath = win.matcher(quoterPath).matches();

This should theoretically work but doesn't match correctly the paths

#64 Updated by Greg Shah over 11 years ago

Is there anything wrong with my assumption that the path and the executable name will be in the same string AND that there will be nothing else in that string?

If I am correct, then there is no need to ever do the complete regex search like you are trying. We simply won't care about anything except that the string ENDs with / or \ and quoter and an optional .exe.

#65 Updated by Costin Savin over 11 years ago

That would be much easier since I already have these checks in place before doing the pattern check which is supposed to see if the part in front of Quoter is really a valid path, unix or windows depending on separator( even if it ends with / or \ it might not be a path ex: Hunter/Gatherer?/Quoter).
If this check is not necessary I will go further without it.

#66 Updated by Costin Savin over 11 years ago

There is one case of error message which might not be able to be reproduced exactly as in Progress. When we pass the value of a character variable with a length greater than 1 to -D delimiter option as opposed to passing the text directly ( in the case of text it says the length can't be longer than 1 , but when we try with value of variable no error is thrown instead the first character is taken as delimiter.

#67 Updated by Greg Shah over 11 years ago

Please show an example here that reproduces the problem. And show the error output from the 4GL.

#68 Updated by Greg Shah over 11 years ago

If path validation is not necessary I could start preparing the proposed update

I don't know the answer to this. The core question which you must answer is this: is there EVER a case where VALID 4GL code can appear in converted code where the String[] passed to the ControlFlowOps.launch() method results an ANY of the following conditions:

1. The qualified quoter command is split across more than 1 element in the array, as in:

new String[] { "c:\\something ", "crazy\\splits\\the\command\\quoter.exe", ... }

2. The element that contains the qualified quoter command also contains preceding text that is not part of the qualified command.

new String[] { "garbage | c:\\some\\path\\quoter.exe", ... }

3. The element that contains the qualified quoter command also contains following text that is not part of the qualified command.

new String[] { "c:\\some\\path\\quoter.exe garbage", ... }

If none of these cases are possible, then there is no need for path matching.

Do remember, that the quoter command matching is not limited to only the first element of the String[]. My examples above don't show this, but the following is certainly possible:

new String[] { "some_program", "|" c:\\some\\path\\quoter.exe", ... }

#69 Updated by Costin Savin over 11 years ago

1. If a quoter command will be split in this way in p2j it will also not work in Progress. Checked.

2. This doesn't seem to be a valid case if the command is supposed to also be functional (considering the garbage part does nothing)

3. This case will not be detected as quoter case in p2j or Progress.

It seems that checking if the path is valid (in the case of path to quoter detected) should not be required if the we expect normal functional code.

Do remember, that the quoter command matching is not limited to only the first element of the String[]. My examples above don't show this, but the following is certainly possible:

  new String[] { "some_program", "|" c:\\some\\path\\quoter.exe", ... }

That's a good point! When I started I assumed that only the first parameter can be the quoter,
I'm working on it.

#70 Updated by Costin Savin over 11 years ago

QUOTER UTILITY DESCRIPTION

The quoter utility will put the elements identified and extracted from the
input source between double quotes (in the case where a double quote is found
inside it will be doubled) the result being sent to output which can be a
stream or a file.

The quoter utility can be called with the following parameters:

Input is always read from System.in. This may be specified as a
file or as a stream.
Delimiter option (optional): -D followed by the
delimiter character.
When this option is used the input is read line by line and each
line will be split using the specified delimiter before being sent
to output.
Ex: OS-COMMAND QUOTER /home/table.csv -D , >/home/table.q
Delimiter and column options can never be used together!
Column range option (optional) : -C followed by the column
ranges.
The column ranges must be separated in the form start-end
and separated by: ",". No spaces must be present!
This option will read input line by line and if presented with
valid column ranges will split the line into columns from the
starting position of the column range to the end position of the
column range (Ex of column range: 1-4,5-15,... the first column
will start from the first character up to the 4'th character included).
Usage Ex:
OS-COMMAND QUOTER /home/table.csv -C 1-2,4-20,22-26 >/home/table.q
Delimiter and column options can never be used together!
Output is always written to System.out. This may be specified as a
file or as a stream.

Error Handling.
When errors are encountered quoter will send the error message to the
output, then quit.
The following errors messages will be output:

Unknown option: + option :
This error is sent to output when quoter is called with an
option different from -C or -D.
Quoter only accepts one input file
This error is sent to output when there are more than one
arguments that are not options or options attributes.
The -c option should be followed by a space and then a list of
column ranges, separated by commas
This error is sent to output when -c option doesn't have a
value after the option.
The -d option should be followed by a space and then a single
character
This error is sent to output when -d option doesn't have a value
after the option or that value is not a single character.
Invalid range: + invalid range.
This error is sent when an invalid range is detected within the
column ranges: This can happen because of the starting index being
greater than the end index or, one or both of the indexes are not
a number.
Unable to read: + input file.
This error is sent when the input file cannot be read (doesn't
exist or the user doesn't have read permission).
Incompatible options: -c and -d
This error is sent when quoter is used with both column and
delimiter options.

#71 Updated by Greg Shah over 11 years ago

The documentation is pretty good. Some comments:

The input file (optional in case we use output stream).

I think this should read: "An optional filename from which to read OR if not specified, input is read from System.in."

Output file (optional when we have input through): > followed
by the filename
This is actually a command redirect of the quoter output to the
specific file.

This should not be stated in this manner. Simply say: "Output is always written to System.out."

#72 Updated by Greg Shah over 11 years ago

Assuming you have handled all the known issues, you can prepare/upload the update. Make sure the documentation you have created (with the edits I mentioned) is placed in the JavaDoc.

#73 Updated by Costin Savin over 11 years ago

implemented logic to check for quoter case in all the cases, however there are cases like:

def var ch as char.
ch = "quoter -C 1-2,4-20,39-80 ../../uast/quoter/quoter_util_column.d >../../uast/quoter/quoter_util_one_line.q".
os-command no-wait value(ch).
/*or*/
os-command no-wait "quoter -C 1-2,4-20,39-80 ../../uast/quoter/quoter_util_column.d >../../uast/quoter/quoter_util_one_line_str.q".

The conversion result will be:

character ch = new character("");
... 
ch.assign("quoter -C 1-2,4-20,39-80 ../../uast/quoter/quoter_util_column.d >../../uast/quoter/quoter_util_one_line.q".");
ProcessOps.launch(new String[]
 {
    (ch).toStringMessage()
 }, false, false);

 /*or*/
ProcessOps.launch(new String[]
{
   "quoter -C 1-2,4-20,39-80 ../../uast/quoter/quoter_util_column.d >../../uast/quoter/quoter_util_one_line_str.q" 
}

This means there will be only one argument which contains the complete quoter command.
In the current logic the argument will not be identified as quoter command because the quoter keyword is not at the end of the expression and it doesn't end in ".exe".
If the logic is changed to detect such cases it might also detect false quoter cases as valid.

Maybe the quoter script solution is more safe for cases like this?

#74 Updated by Greg Shah over 11 years ago

I know of these cases, but the current ProcessDaemon code does not support them. You don't need to support it at this time.

I'm glad you did enough testing to find the cases. Good work.

#75 Updated by Costin Savin over 11 years ago

File cs_upd20121213a.zip added

Added proposed update and edited the quoter description post .

#76 Updated by Costin Savin over 11 years ago

discovered a bug will add new update once fixed.

#77 Updated by Costin Savin over 11 years ago

File cs_upd20121214a.zip added

Submitted testcases to bazaar added proposed update (revised and re-tested)

#78 Updated by Costin Savin over 11 years ago

Status changed from WIP to Review

#79 Updated by Greg Shah over 11 years ago

Feedback, Part 1:

ProcessDaemon:

1. The file needs its copyright date updated.

2. The changes in prepareCommandLine() are not a bad first pass. However, there is a better way. The code is too duplicative. The old code did add the cmdlist⁰ to the StringBuffer outside of the for loop. That approach eliminated the need to conditionally call sb.append(' '). However, the new approach duplicates the core logic and we can reduce the code greatly with the only cost being to make the sb.append(' ') conditional. Of course, the sb.append(' ') can also be done outside of the if/else block instead of doing it in both the if and else branches. Change the code to loop from 0 instead of 1 and make the other necessary changes.

Quoter:

3. Remove the JPRM column in the header (our standard is to use the header version that does not have that column in all new files).

4. Update the date in the 001 history entry to be more accurate (20121214).

5. Add a blank line after the header and before the package statement.

6. This JavaDoc should be different.

 * inside it will be doubled) the result being sent to output which may be a 
 * stream or a file.

should be:

 * inside it will be doubled) the result being sent to standard output.

The reason is that from the Quoter process' perspective, it always writes to the STDOUT stream. If that stream is redirected to a file, it is not anything that the Quoter process knows or cares about. So there should not be any comment about output to files, otherwise readers may be confused.

This means that this:

 *   <li> Output is always written to System.out. This may be specified as a
 *        file or as a stream.

should also be changed to this:

 *   <li> Output is always written to System.out.

And this can be removed completely:

 *   <li> Output file (optional when we have input through): <b>></b> followed
 *        by the filename
 *        <p> This is actually a command redirect of the quoter output to the 
 *        specific file.

7. This text suggests that the column-end position really ends the column at column-end + 1. Is that correct? So the example's 1st column (1-4) is from characters 1 through and including 5 and the 2nd column (5-15) would start at the 6th character and go through and include character 16? That doesn't make sense. Please make it clear if the start and end columns are inclusive or not. And I assume this is all 1-based offsets? Put an explicit statement in about that.

 *        <p> This option will read input line by line and if presented with
 *        valid column ranges will split the line into columns from the 
 *        starting position of the column range to the end position of the 
 *        column range (Ex of column range: <b>1-4,5-15,...</b> the first 
 *        column will start from the first character up to the 5'th character 
 *        included).

8. What happens if there is neither a delimiter or column range option specified? You should document what happens by default.

9. Please remove the extra blank line at the beginning of the class:

public class Quoter
{  

   /** Error for invalid column range. */
   private static final String INVALID_RANGE_ERR = "Invalid range: ";

should be:

public class Quoter
{  
   /** Error for invalid column range. */
   private static final String INVALID_RANGE_ERR = "Invalid range: ";

10. In getColumnRange(), this part seems to be over-complicated:

               // check for invalid collumn ranges
               if (startCol > endCol)
               {
                  if (destFile != null)
                  {
                     try
                     {
                        // output error to destination file
                        destFile.write(INVALID_RANGE_ERR + ranges[i]);
                        destFile.close();
                        // Invalid range - Throw exception to mark end
                        throw new Exception();
                     }
                     catch (IOException e)
                     {
                        throw e;
                     }
                  }
               }

Why not this approach:

               // check for invalid column ranges
               if (startCol > endCol && destFile != null)
               {
                  // output error to destination file
                  destFile.write(INVALID_RANGE_ERR + ranges[i]);
                  destFile.close();

                  // Invalid range - Throw exception to mark end
                  throw new Exception();
               }

I see no value in catching and re-throwing IOException.

11. Extra blank lines at the beginning and end of methods should be removed:

   private static boolean isOption(String string)
   {

      return string.startsWith("-");
   }

should be:

   private static boolean isOption(String string)
   {
      return string.startsWith("-");
   }

12. In quote(), please move the definitions of quote and doubleQuote to be constants.

         String quote = "" + '"';
         String doubleQuote = quote + quote;

should be:

   /** The quote character used to delimit output strings. */
   private static final String QUOTE = "\"";

   /** The escaped quote character (by doubling it) used inside output strings. */
   private static final String QUOTE_2X = QUOTE + QUOTE;

13. With the change above, the quote() method should be rewritten like this:

   private static String quote(String quoted)
   {
      if (quoted != null)
      {
         String quote = "" + '"';
         String doubleQuote = quote + quote;
         String result = '"' + quoted.replaceAll(quote, doubleQuote) + '"';
         return result;
      }
      else
      {
         return null;
      }
   }

   private static String quote(String quoted)
   {
      return (quoted != null) ? '"' + quoted.replaceAll(QUOTE, QUOTE_2X) + '"'
                              : null;
   }

This is much simpler to read, in my opinion. Wherever less code is possible AND reducing the code does not make the result more cryptic, then we should use the version with less code.

I have to temporarily stop right now. I'll post Part 2 a bit later.

#80 Updated by Greg Shah over 11 years ago

Feedback, Part 2:

14. I think that preProcessQuoter() needs to be renamed. It leaves the impression that it is just doing the front-end of the task, when in fact it is doing everything. Please just call it process().

15. In all cases where an error is printed, the code closes the output stream and then exits the application. In the case of the 2 errors that happen in getColumnRange(), they transfer out via exceptions, but otherwise it is all very consistent. There some some problems:

- in all of these cases, if the input is already open, it never gets closed properly before the app exits
- the code is duplicative, for example it always closes the output stream, so there are many locations where the output stream is closed

I would prefer the closing of resources (the input and output streams) to always be done in the same method in which they were opened. For deeply nested errors, it is OK to use exceptions to unwind all the processing.

SO: main() should handle the closing of streams AND that closing code should only be in 1 place (if reasonably possible).
AND: The called code that can fail, should be in a try/catch/finally so that exceptions can unwind to that point.
AND: The more deeply nested code generally SHOULD NOT catch these exceptions since the idea is to unwind all the way out.
AND: The resource closing code should be in a finally block to ensure it always gets executed.

      try
      {
         process(inputReader, outputWriter, delimiter, colRanges);
      }

      catch (Exception e)
      {
         // nothing to do
      }

      finally
      {
         if (inputReader != null)
         {
            tryClose(inputReader);
         }

         tryClose(outputWriter);
      }

You would add this method:

   private static void tryClose(Closeable stream)
   {
      try
      {
         stream.close();
      }

      catch (IOException ioe)
      {
         // ignore, this is a best efforts close
      }
   }

This makes the close code cleaner.

I would expect error locations to be much cleaner too:

      if (delimiter != null && colRange != null)
      {
         // We cannot have both column and delimiter options
         try
         {
            outputWriter.write(INCOMPATIBLE_OPTIONS_ERR);
            outputWriter.close();
         }
         catch (IOException e)
         {
            return;
         }
         return;
      }

      if (delimiter != null && colRange != null)
      {
         // We cannot have both column and delimiter options
         outputWriter.write(INCOMPATIBLE_OPTIONS_ERR);
         throw new Exception();
      }

If write() causes an IOException, then so what? It will unwind just the same way that our newly thrown exception will unwind.

Normally, we would also create a special exception class to be thrown, but since we are only doing this to unwind and we don't care to differentially catch the exception, it isn't needed here.

16. This method seems cryptic and overcomplicated:

   public static boolean isQuoterCommand(String command)
   {
      String commandUpper = command.toUpperCase();
      if (commandUpper.contains(QUOTER))
      {
         // The simple case
         if (commandUpper.equals(QUOTER))
         {
            return true;
         }
         else
         {
            int quoterStartPos = commandUpper.lastIndexOf(QUOTER);
            if (quoterStartPos == 0)
            {
               String extension = commandUpper.substring(QUOTER.length());
               // check for extension case
               if (extension.equals(".EXE")) 
               {
                  return true;
               }
            }
            else
            {
               // check if it is a path
               char pathSep = commandUpper.charAt(quoterStartPos -1);
               if (pathSep =='/' || pathSep =='\\')
               {
                   if(commandUpper.substring(quoterStartPos + 
                                             QUOTER.length()).equals("") ||
                      commandUpper.substring(quoterStartPos + 
                                             QUOTER.length()).equals(".EXE"))
                  {
                     return true;
                  }
               }
            }
         }
      }
      // all the other case will return false
      return false;
   }

Here is a version that I believe is functionally the same, but it is much simpler:

   public static boolean isQuoterCommand(String command)
   {
      // case insensitive matching
      command = command.toUpperCase();

      // the only possible matches must always end with one of these two values
      if (command.endsWith(QUOTER) || command.endsWith(QUOTER_EXE))
      {
         int start = command.lastIndexOf(QUOTER);

         if (start == 0)
         {
            // no pathing or trash in front is a match
            return true;
         }
         else
         {
            // something is in front, check if it is a path
            char pathSep = command.charAt(start - 1);

            if (pathSep =='/' || pathSep =='\\')
            {
               return true;
            }

            // at this point there must be trash in front, this isn't valid
         }
      }

      // all the other case will return false
      return false;
   }

It assumes that there is this additional member variable:

   private static final String QUOTER_EXE = "QUOTER.EXE";

Does this make sense? Do you see how I am trying to take a simplest possible path through the detection, to keep the logic small, readable and to avoid duplication of code? By doing this, it is easier to ensure that the code is correct. It is easier to maintain it over time, especially for someone that did not write it.

#81 Updated by Costin Savin over 11 years ago

7. The text description is wrong it's 4't included not 5'th included. Corrected

#82 Updated by Costin Savin over 11 years ago

File cs_upd20121217a.zip added

I understand , the final code should be be fully documented, respect the coding standards entirely , tested and brought to the simplest form to enhance readability :).
Added proposed update and retested it.
Besides the feedback changes I also changed the parameter parsing part to a more simplified version.

#83 Updated by Greg Shah over 11 years ago

Feedback:

1. The text "This may be specified as a file or as a stream." in the javadoc for can be removed. Stating that we always write to System.out is sufficient to explain everything. Saying that it can be specified as a file or stream is confusing because it suggests there is a parameter, when there is no parameter.

2. The main class javadoc needs to mention that there is an (optional?) parameter to specify the input filename. This should explain if the filename can be relative and/or absolute. It should also explain what happens when this is not specified.

3. Per the coding standards, please put spaces on both sides of binary operators.

Line 233:

      if (delimiter != null && delimiter.length()>1 && errorMessage==null)

should be:

      if (delimiter != null && delimiter.length() > 1 && errorMessage == null)

Line 242:

      if (errorMessage!=null)

should be:

      if (errorMessage != null)

It makes the code easier to read.

4. When you have code like this:

         try
         {
            outputWriter.write(errorMessage);
            outputWriter.close();
            return;
         }
         catch (IOException e)
         {
            // nothing to be done but return
            return;
         }

You should not duplicate the code that is going to execute anyway. Just code it like this:

         try
         {
            outputWriter.write(errorMessage);
            outputWriter.close();
         }
         catch (IOException e)
         {
            // ignore
         }

         return;

It is less code. AND it makes clear to the reader that no matter what happens, the code will be exiting.

5. The destFile.close(); on lines 566 and 578 should be deleted. They are not needed anymore.

6. The code at lines 576 - 580 is indented too much.

7. Please remember that blank lines should not be placed at the end of methods.

      finally
      {
         if (inputReader != null)
         {
            tryClose(inputReader);
         }

         tryClose(outputWriter);
      }
                                     <---  this line should not be here
   }

This can also be seen in tryClose() on line 683.

8. To confirm: everything has passed functional testing?

#84 Updated by Costin Savin over 11 years ago

8. Yes ran all the tests and checked results it all looks good.
Will also test again after making the changes from review.

#85 Updated by Costin Savin over 11 years ago

File cs_upd20121218a.zip added

Added proposed update.
Also found a bug regarding the error message when input file does not exist and fixed it, the other tests results are ok.

#86 Updated by Greg Shah over 11 years ago

This is the wrong update (it is the same one that goes with #1591). Please upload the proper update (probably it is the cs_upd20121218b.zip?).

#87 Updated by Costin Savin over 11 years ago

File cs_upd20121219a.zip added

Actually it's the cs_upd20121219a.zip mistaken between the two quoter folders.

#88 Updated by Greg Shah over 11 years ago

File deleted (~~cs_upd20121218a.zip~~)

#89 Updated by Greg Shah over 11 years ago

Status changed from Review to Test

I am OK with the proposed code.

Please work with Eugenie to put this through full runtime testing using lightning staging. No conversion testing is needed.

Eugenie: Costin doesn't have access to lightning. Eventually, sometime in January, we will have the devsrv01 ready and everyone can do their own regression testing, but until then I appreciate your help. Constantin may be using staging right now, so coordinate with him.

FYI, I deleted the cs_upd20121218a.zip since it did not contain any code that was associated with this task.

#90 Updated by Eugenie Lyzenko over 11 years ago

Did you get a chance to work on it?

Costin,

Have you updates ready to test in lightning? Is it only cs_upd20121231a? Let me know please what is the archive file list to apply and I rebuild staging and start the tests.

#91 Updated by Costin Savin over 11 years ago

The latest update for this tast is cs_upd20121219a.zip the cs_upd20121231a? is for the other quoter (built-in) #1591. Sorry I forgot to also add the task number

#92 Updated by Eugenie Lyzenko over 11 years ago

Is there inside the update some code that can affect data sorting algirythm inside report generated? I have one issue with report comparison for TC-ITEM-MASTER-104 test. Going to retest the staging afain now.

BTW, the CTRL-C tests have been completed without any regression.

#93 Updated by Greg Shah over 11 years ago

It is unlikely, but yes, it is possible. Please show the lines of output in the report that are different.

#94 Updated by Eugenie Lyzenko over 11 years ago

The current report(The failing line is second in the figures below):

...
                0001GX37        0A0003JF2       97396 OSHA RED PAINT 16 OZ SP            CN PGC                   ____________

                0001H8KJ        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HBG5        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001FXY0        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001GZN1        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HCBR        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HHY1        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001J150        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________
...

The expected report:

...
                0001GX37        0A0003JF2       97396 OSHA RED PAINT 16 OZ SP            CN PGC                   ____________

                0001FXY0        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001GZN1        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001H8KJ        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HBG5        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HCBR        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001HHY1        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________

                0001J150        0A0003KAS       ACEFNC-1 FOAM-N-CLEANER                  EA PGC                   ____________
...

#95 Updated by Greg Shah over 11 years ago

Strange. Can you reproduce the problem manually?

#96 Updated by Eugenie Lyzenko over 11 years ago

I have started another harness cycle about a hour ago. We will know today if this issue be repeated.

#97 Updated by Eugenie Lyzenko over 11 years ago

The testing results have been uploaded into shared GCD drive: 20130107_*.zip 20130108_*.zip(4 files). The all tests can be considered as passed now taking into account the cumilative results. I'm stopping the servers to enable promotion.

#98 Updated by Greg Shah over 11 years ago

Costin: please check in the code to bzr and distribute the update to the team.

I won't be promoting this version of staging. We have many pending updates and I plan to batch them up.

#99 Updated by Greg Shah over 11 years ago

Status changed from Test to Closed

#100 Updated by Greg Shah over 7 years ago

Target version changed from Milestone 7 to Runtime Support for Server Features

Also available in: Atom PDF

Project

General

Profile

FWD » Core Development » Base Language

Issues

Custom queries

Feature #1611

implement QUOTER utility

History

#1 Updated by Greg Shah over 11 years ago

#2 Updated by Costin Savin over 11 years ago

#3 Updated by Costin Savin over 11 years ago

#4 Updated by Greg Shah over 11 years ago

#5 Updated by Costin Savin over 11 years ago

#6 Updated by Costin Savin over 11 years ago

#7 Updated by Greg Shah over 11 years ago

#8 Updated by Costin Savin over 11 years ago

#9 Updated by Greg Shah over 11 years ago

#10 Updated by Costin Savin over 11 years ago

#11 Updated by Costin Savin over 11 years ago

#12 Updated by Greg Shah over 11 years ago

#13 Updated by Greg Shah over 11 years ago

#14 Updated by Costin Savin over 11 years ago

#15 Updated by Greg Shah over 11 years ago

#16 Updated by Costin Savin over 11 years ago

#17 Updated by Costin Savin over 11 years ago

#18 Updated by Greg Shah over 11 years ago

#19 Updated by Costin Savin over 11 years ago

#20 Updated by Greg Shah over 11 years ago

#21 Updated by Costin Savin over 11 years ago

#22 Updated by Greg Shah over 11 years ago

#23 Updated by Costin Savin over 11 years ago

#24 Updated by Costin Savin over 11 years ago

#25 Updated by Greg Shah over 11 years ago

#26 Updated by Constantin Asofiei over 11 years ago

#27 Updated by Costin Savin over 11 years ago

#28 Updated by Greg Shah over 11 years ago

#29 Updated by Constantin Asofiei over 11 years ago

#30 Updated by Costin Savin over 11 years ago

#31 Updated by Greg Shah over 11 years ago

#32 Updated by Greg Shah over 11 years ago

#33 Updated by Costin Savin over 11 years ago

#34 Updated by Costin Savin over 11 years ago

#35 Updated by Greg Shah over 11 years ago

#36 Updated by Constantin Asofiei over 11 years ago

#37 Updated by Greg Shah over 11 years ago

#38 Updated by Costin Savin over 11 years ago

#39 Updated by Eugenie Lyzenko over 11 years ago

#40 Updated by Greg Shah over 11 years ago

#41 Updated by Costin Savin over 11 years ago

#42 Updated by Greg Shah over 11 years ago

#43 Updated by Costin Savin over 11 years ago

#44 Updated by Greg Shah over 11 years ago

#45 Updated by Costin Savin over 11 years ago

#46 Updated by Greg Shah over 11 years ago

#47 Updated by Costin Savin over 11 years ago

#48 Updated by Greg Shah over 11 years ago

#49 Updated by Costin Savin over 11 years ago

#50 Updated by Greg Shah over 11 years ago

#51 Updated by Costin Savin over 11 years ago

#52 Updated by Greg Shah over 11 years ago

#53 Updated by Costin Savin over 11 years ago

#54 Updated by Costin Savin over 11 years ago

#55 Updated by Greg Shah over 11 years ago

#56 Updated by Costin Savin over 11 years ago

#57 Updated by Greg Shah over 11 years ago

#58 Updated by Costin Savin over 11 years ago

#59 Updated by Greg Shah over 11 years ago

#60 Updated by Costin Savin over 11 years ago

#61 Updated by Eugenie Lyzenko over 11 years ago

#62 Updated by Greg Shah over 11 years ago

#63 Updated by Costin Savin over 11 years ago

#64 Updated by Greg Shah over 11 years ago

#65 Updated by Costin Savin over 11 years ago

#66 Updated by Costin Savin over 11 years ago

#67 Updated by Greg Shah over 11 years ago

#68 Updated by Greg Shah over 11 years ago

#69 Updated by Costin Savin over 11 years ago

#70 Updated by Costin Savin over 11 years ago

#71 Updated by Greg Shah over 11 years ago