Call Graph Analyzer¶

FWD provides tools to analyze the various linkages between call sites and their internal or external targets. This is accomplished by creating a graph in which the call sites and targets are represented as nodes (vertices), while the edges between the graph nodes represent the linkage between a call site and a target. This graph is backed by the 3rd-party Titan DB, and data can be extracted either via the standard reports provided by FWD or by writing your own reporting tools.

At this time the following standard reports are available:

Program Dependencies
Missing External Procedures List
Dead Files List (both procedures and includes)
External Targets List
Ambiguous Programs List
Schema Trigger Procedures

As Titan DB is an implementation of the Blueprints graph database APIs, custom reporting tools should stay within the boundaries of the Blueprints APIs. This hides the actual graph database back-end and would allow an easy switch from Titan DB to another graph database implementation in the future.

At the time of this writing, the main graph database libraries in use have the following versions:

Library Name	Version
Titan DB	0.4.2
Blueprints	2.4.0
Lucene	4.4.0
Akiban Persistit	3.3.0

This chapter will describe how the call graph can be configured and executed, what the graph database structure looks like and how you can extract data via standard or custom reporting.

Call Graph Configuration and Connection¶

The graph database is backed by Titan DB, with a largely hard-coded configuration which must be duplicated when connecting to the graph database from outside FWD. This configuration covers the graph storage and an index storage named search:

the storage backend (storage.backend property) is hard-coded to the Persistit implementation.
the storage directory (storage.directory property) defaults to the ./callgraph folder and can be configured via the callgraph-db-folder parameter from p2j.cfg.xml.
the index backend (storage.index.search.backend property) is hard-coded to the Lucene implementation.
the index storage directory (storage.index.search.directory property) is hard-coded to the <storage.directory>/searchindex folder, where <storage.directory> is the location of the graph storage folder.

The parameters which can be changed by the user are configured in the p2j.cfg.xml file and are described by the following table:

Parameter Name	Default Value	Description
`callgraph-db-folder`	`./callgraph/`	The storage folder.
`basepath`	`./`	The folder in which to search the code set and the include files.
`graph-node-filter`	`null`	In callgraph mode, the pattern engine will use this filter to determine if each walked node should or should not be visited. It must specify a fully qualified class name (including package) for a class that implements the `GraphNodeFilter` interface. This class must have a default constructor (no parameters) so that instances of the class can be created by the pattern engine.
`include-spec`	`*.[iI]`	A shell matching pattern, describing the files which are included by the preprocessor.
`case-sensitive`	`false`	Flag indicating if we are on a case-sensitive file system.
`rootlist`	`n/a`	Needs to be specified. This will point to an XML file, defining the list of entry points (by file or folder). See the Defining the Entry Point Programs section of this chapter, for details about the structure of this file.

Only one connection is allowed to the graph database at a time. To establish a connection, use the com.goldencode.graph.GraphDB.obtainGraphDB(configuration) API. It takes as its parameter an org.apache.commons.configuration.Configuration instance holding the graph database configuration, and returns a com.tinkerpop.blueprints.Graph instance backed by Titan DB. The default properties loaded into the configuration parameter are (following a property file structure):

storage.backend=persistit
storage.directory=./callgraph/
storage.index.search.backend=lucene
storage.index.search.directory=./callgraph/searchindex
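A client connecting from outside FWD has to reproduce the same four properties. The following is a minimal sketch, written in Python for brevity, of how they derive from the storage folder. The function name `build_graph_config` is hypothetical, and a real client would load the values into an org.apache.commons.configuration.Configuration instance rather than a dictionary.

```python
def build_graph_config(storage_dir="./callgraph/"):
    """Return the graph database properties described above as a plain dict."""
    # The search index always lives in <storage.directory>/searchindex.
    index_dir = storage_dir.rstrip("/") + "/searchindex"
    return {
        "storage.backend": "persistit",             # hard-coded in FWD
        "storage.directory": storage_dir,           # callgraph-db-folder
        "storage.index.search.backend": "lucene",   # hard-coded in FWD
        "storage.index.search.directory": index_dir,
    }


if __name__ == "__main__":
    for name, value in build_graph_config().items():
        print(f"{name}={value}")
```

Only the storage folder is a parameter here, mirroring the fact that callgraph-db-folder is the only one of these values exposed in p2j.cfg.xml.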

Once the com.tinkerpop.blueprints.Graph instance is created, the createIndex and createUniqueIndex APIs of the com.goldencode.graph.GraphDB class are used to create indexes. Indexes can not be created if a vertex or edge property already exists, thus these APIs will be a no-op if there is already an index created for the specified property. The following indexes created by FWD are all added to the search index namespace. To ensure the index for a string property is used, search either for an exact match or for a prefix match. Generic regex or suffix matching is not supported by the Lucene index; such searches iterate over all vertices or edges, drastically reducing performance.

Indexed Property	Property Type	Index Type	Applies to Node Type	Description
`filename`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices associated with a physical file (external program, include file, etc).
`reverse-filename`	`String`	`UNIQUE`	`Vertex`	As Titan DB doesn't support indexed suffix search, this allows indexed prefix searches, by keeping the reversed `filename` in this property. When searching, reversing the pattern and matching for a prefix in the reversed string is the same as matching the pattern as a suffix in the non-reversed string.
`port-type`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices associated with a SOAP Port Type.
`schemaname`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices which have an associated schema name.
`command`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices which have an associated command, to be passed to the OS.
`connect-string`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices which have an associated connection, for a server (to listen for incoming) or for a client (when connecting to) remote connections.
`com-target`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices which have an associated Automation Object.
`dde-target`	`String`	`UNIQUE`	`Vertex`	The property is added to vertices which have an associated DDE server.
`node-type`	`Integer`	`SORT`	`Vertex`	Added to all vertices, represents the node type.
`node-id`	`Long`	`SORT`	`Vertex`	Added to all vertices, represents the internal node ID (from the AST or computed). The (node-type, node-id) pair must make a unique index, which currently can not be enforced at the DB level.
`call-site-id`	`Long`	`SORT`	`Edge`	For all edges, represents the AST ID of the call site. Usually, this is the same as the `node-id` of the outgoing vertex; else, this is the AST ID of a child (direct or not) of the AST represented by the outgoing vertex.
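The reverse-filename trick from the table above can be illustrated without a graph database. The following Python sketch uses plain dictionaries as stand-ins for vertices; the helper names and the sample file names are hypothetical, and the linear scan only demonstrates the string logic, not the indexed lookup that Titan DB would perform.

```python
def suffix_as_reversed_prefix(suffix):
    """Reverse a suffix pattern so it can be matched as a prefix."""
    return suffix[::-1]


def find_by_suffix(vertices, suffix):
    """Return the filenames ending in `suffix`, testing only a prefix
    of the reverse-filename property."""
    prefix = suffix_as_reversed_prefix(suffix)
    return [v["filename"] for v in vertices
            if v["reverse-filename"].startswith(prefix)]


# Hypothetical vertices, each storing the filename and its reversed form.
names = ["./abl/order/create.p", "./abl/order/update.p", "./abl/inc/common.i"]
vertices = [{"filename": n, "reverse-filename": n[::-1]} for n in names]

if __name__ == "__main__":
    print(find_by_suffix(vertices, "/update.p"))
```

Because a prefix test on the reversed string is equivalent to a suffix test on the original, a query of this shape can be served by an index that only supports exact and prefix matches.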

Specifying the Entry Point Programs¶

Building the callgraph requires the list of entry point (root) programs which are accessible from the outside world. This list must include all programs that are directly executed:

programs executed by a user, a batch process or via the appserver
programs determined dynamically (i.e. saved in a database table and picked from it)
programs run from the command line or from shell scripts
programs targeted by schema triggers
etc

All such "top-level" programs must be specified as entry points.

The root programs have a special meaning in the graph database: only the initial root list and all reachable external programs (e.g. via a RUN external-program statement) will be processed during callgraph generation. External programs not reachable from any of the root programs will not be processed (they can be declared "dead").

The root list can be specified either as arguments to the CallGraphGenerator tool, or via a special XML file, specified by the rootlist parameter in p2j.cfg.xml. This XML file is composed from a root XML element named roots, which has one or more XML elements named node. Each node element can specify either a file name or a folder name, relative to the basepath folder:

if the filename attribute is set, it must specify the name of an AST file associated with an external program.
if the folder attribute is set, then the pattern attribute must also be present: it is a shell matching pattern selecting the AST files to be included from the specified folder. The optional recursive attribute can be set to true, so that the pattern will be applied to the folder and any subfolders, recursively.

Following is an example of how this file can look, assuming the basepath is the ./abl/ folder:

<?xml version="1.0"?>
<roots>
  <node filename="./abl/some/application/folder/some-external-program.p.ast" />
  <node folder="./abl/another/application/folder/" pattern="*.ast" recursive="true"/>
</roots>

This example adds to the root list the ./abl/some/application/folder/some-external-program.p and all programs from the ./abl/another/application/folder/ and any subfolders.
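To make the semantics of the filename, folder, pattern and recursive attributes concrete, here is a minimal Python sketch that expands such a file into a list of AST files. It is not the FWD implementation: the function name `expand_roots` and the `home` parameter (the folder against which names are resolved) are assumptions made for illustration.

```python
import fnmatch
import os
import xml.etree.ElementTree as ET


def expand_roots(xml_text, home="."):
    """Return the sorted list of AST files selected by a <roots> document."""
    selected = set()
    for node in ET.fromstring(xml_text).findall("node"):
        filename = node.get("filename")
        if filename is not None:
            # A single AST file, listed explicitly.
            selected.add(os.path.normpath(os.path.join(home, filename)))
            continue
        pattern = node.get("pattern")
        if pattern is None:
            raise ValueError("a folder node requires a pattern attribute")
        folder = os.path.join(home, node.get("folder"))
        recursive = node.get("recursive") == "true"
        for current, subdirs, files in os.walk(folder):
            for name in fnmatch.filter(files, pattern):
                selected.add(os.path.normpath(os.path.join(current, name)))
            if not recursive:
                subdirs.clear()  # stop os.walk from descending further
    return sorted(selected)
```

Collecting the results in a set means a program matched by several node elements is listed only once.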

If you want to include all external programs as entry-points, use a node with the folder name set to the basepath value, and with the recursive attribute set to true, as in (assuming basepath=./abl in p2j.cfg.xml):

<?xml version="1.0"?>
<roots>
  <node folder="./abl" pattern="*.ast" recursive="true"/>
</roots>

This ensures the entire code set is processed by the callgraph and all the external targets are determined, regardless of whether an external program is dead or not. Thus, you might decide to run the callgraph twice and generate two different databases: one with the entire code set added, to resolve all external targets, and one with the explicit list of entry points, to resolve the dead files.

Generating the Call Graph¶

Call graph generation must always be done using the ASTs from the front-end conversion phase, with any post-parse fixups already applied. With this constraint, there are two modes of generating the callgraph: automatically (via the ConversionDriver), and explicitly (via the CallGraphGenerator tool).

Using ConversionDriver¶

The ConversionDriver allows callgraph generation by specifying the F3 mode for the Front-end phase. This produces the same results as running the CallGraphGenerator tool in normal mode. To run the conversion and generate the callgraph (assuming the current folder is the project home folder), the following command can be used:

java -classpath p2j/lib/p2j.jar com.goldencode.p2j.convert.ConversionDriver -Sd2 F3 . "*.p"

When F3 mode is specified, the front-end phase will also generate the callgraph. If needed, the middle and code backend phases can also be specified, so that full conversion is performed.

Using CallGraphGenerator¶

Assuming the ConversionDriver was run in only F3 or F2 mode (i.e. the ASTs are the ones generated by the parser and with any post-parse fixups applied), the com.goldencode.p2j.uast.CallGraphGenerator tool can be used to generate the callgraph, avoiding the overhead of re-parsing the sources.

This cannot be used if the middle or code backend has been executed because the ASTs are heavily modified during the actual conversion process and these modifications render the ASTs unsuitable for the graph analysis.

This tool allows generating the callgraph both from scratch or incrementally, to resolve ambiguities without a full run. The syntax of this tool is:

java -classpath p2j/lib/p2j.jar \
   com.goldencode.p2j.uast.CallGraphGenerator [-Dn] [-u] [root_nodes...]

where:

-Dn sets the pattern engine debug level where n is the numeric setting of the level (default is 1):

none (0)
status (1)
debug (2)
trace (3)

-u sets the callgraph processing in update mode, where only the ambiguous nodes and their newly-resolved targets are processed. The targets specified in the root_nodes list are ensured to be updated and must already exist in the callgraph.

[root_nodes...] is one or more root node filenames to process (which must be valid relative or absolute names based on the current directory). Each filename must specify an existing AST file associated with a legacy external procedure. If no root nodes are listed on the command line and we are not in update mode, then the project's root node list will be used. If we are in update mode, then the explicit list of programs will be added to the list of ambiguous external programs.
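When the tool is driven from a script, the options described above can be assembled programmatically. The following Python sketch only builds the argument list; the function name `generator_command` is hypothetical, and the classpath is the one used in the syntax shown earlier, which may differ in your installation.

```python
def generator_command(debug_level=1, update=False, root_nodes=()):
    """Build the argument list for the CallGraphGenerator command line."""
    if debug_level not in (0, 1, 2, 3):
        raise ValueError("debug level must be 0 (none) to 3 (trace)")
    cmd = [
        "java", "-classpath", "p2j/lib/p2j.jar",
        "com.goldencode.p2j.uast.CallGraphGenerator",
        f"-D{debug_level}",
    ]
    if update:
        cmd.append("-u")      # update mode: process only ambiguous nodes
    cmd.extend(root_nodes)    # optional explicit AST file names
    return cmd


if __name__ == "__main__":
    print(" ".join(generator_command(2, root_nodes=["./abl/start.p.ast"])))
```

The resulting list can be passed to `subprocess.run` from the project home folder.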

This section will describe how the call graph processing works when run in non-update mode; for the update mode, see the Resolving the Ambiguous Call-Sites section of this chapter.

In non-update mode, the first part consists of loading the entire code set into the graph database, ensuring that unreachable external programs or include files are represented in the graph. This loading is done in two phases:

Using the include-spec parameter (which defaults to *.[iI]), all the include files matching the specification are listed from the basepath folder, and for each a special node is added to the graph.
Using the *.ast pattern, all the external programs (found in the basepath folder) are processed. This includes creating a special node in the graph and also generating the sub-graph associated with the include linkages, by processing the associated pphints file (preprocessor hints). When creating the include sub-graph, if the include file is not found in the graph, a new node is created for it and a warning is logged to STDOUT.

The second part of call graph processing requires creating a graph node for each defined schema trigger and resolving the linkage between the schema trigger and the target external program. The target external programs, even if determined as existing in the initial code-set, will not be automatically specified as entry points.

The final part consists of applying the call graph processing rules (which process the call-sites and add linkages between each call-site and its target) to the root programs (the entry points) and to all programs reachable from them, in batches:

the rules are applied to the initial root list and the set of first-level reachable programs is determined.
the rules are applied to the set of first-level reachable programs and the set of second-level reachable programs is determined.
the rules are applied to the set of the second-level reachable programs and the set of third-level reachable programs is determined.
and so on, until no more reachable programs are found OR all reachable programs were already processed.
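The batching described above is a breadth-first traversal of the program graph. The following Python sketch shows the idea under simplifying assumptions: the `targets_of` dictionary stands in for the result of applying the rules to a program, and the program names are hypothetical.

```python
def process_in_batches(roots, targets_of):
    """Return the batches of programs in the order they would be processed."""
    processed = set()
    batch = sorted(set(roots))
    batches = []
    while batch:
        batches.append(batch)
        processed.update(batch)
        reachable = set()
        for program in batch:
            # Applying the rules to a program yields the programs it links to.
            reachable.update(targets_of.get(program, ()))
        # Only programs not processed in an earlier batch form the next level.
        batch = sorted(reachable - processed)
    return batches


# Hypothetical linkages; dead.p is never reached from the root list.
calls = {
    "menu.p": ["order.p", "report.p"],
    "order.p": ["report.p", "menu.p"],
    "report.p": ["print.p"],
    "dead.p": ["print.p"],
}

if __name__ == "__main__":
    print(process_in_batches(["menu.p"], calls))
```

The loop ends when a level contributes no unprocessed programs, which covers both stopping conditions: no more reachable programs are found, or all of them were already processed.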

The callgraph processing rules are split into standard and customer-specific rules. The customer-specific rules must reside in a customer_specific_call_graph.rules file, which needs to be accessible via the patpath configuration parameter. For each external program, the standard rules are applied first, followed by the customer-specific rule-set. This allows the customer to write custom rules to automate the disambiguation of ambiguous call-sites and/or to adjust any linkages generated by the standard rules.

After the callgraph is generated, a GraphML representation of the graph database is saved in the callgraph.graphml file, in the project home. GraphML is a standard file format for representing graph networks. This file can be viewed with tools like yEd or read with other applications. Describing the GraphML structure is outside the scope of this document.

Graph Structure¶

The graph database annotates the nodes and edges by setting specific properties, depending on the node and the linkage between the edge's outbound and inbound node. Each node added to the DB has these standard properties:

- a node-type property holding an Integer value, identifying its type. The node's type will always be one of the token types defined by the com.goldencode.p2j.uast.ProgressParserTokenTypes interface.

- a node-id property holding a Long value, which is unique for the given node-type. If the node-type has an associated AST node, then this will be the AST's ID. Else, its value will be computed such that the (node-type, node-id) pair will always be unique.

- a node-key property holding a String value, which specifies the name of the property holding additional information for that node.
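The node-key property acts as an indirection: its value names the property that carries the descriptive data of the node. A minimal Python sketch, using a dictionary as a stand-in for a vertex (the numeric values below are hypothetical and do not correspond to real token types or AST IDs):

```python
def node_details(node):
    """Return (node-type, node-id, descriptive value) for a graph node."""
    key = node["node-key"]  # name of the property holding the extra data
    return node["node-type"], node["node-id"], node[key]


# Hypothetical include-file node; 17 stands in for a real token type value.
include_node = {
    "node-type": 17,
    "node-id": 4294967297,
    "node-key": "filename",
    "filename": "./abl/inc/common.i",
}

if __name__ == "__main__":
    print(node_details(include_node))
```

Reporting code written this way does not need to know in advance whether a node is described by filename, schemaname, command or another property.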

The nodes can be split into two sets: nodes associated with legacy code and nodes associated with external targets, outside of the legacy code.

The links created between the graph nodes are uni-directional and represent a dependency between the outbound and the inbound node. Each edge has at least these properties:

call-site-key, holding an Integer value identifying the call-site type (this is not the same as the call-site's token type). The edge's label will be set depending on the call-site-key value, and will always be a string value.
call-site-id, holding a Long value identifying the AST id where the call-site originates.
column and line, each one holding an Integer value identifying the location of the call-site in the outbound file. These location values are relative to the *.cache file (the fully preprocessed source code). They do not necessarily reference the original (unexpanded) external procedure file unless the preprocessor is not used for the file up to that referenced point.

If a call-site is determined to be ambiguous because it uses a runtime-evaluated expression whose target can not be determined at conversion time, its associated edge is given a dynamic_expr=true property. In addition, a link is created between the call-site's associated graph node and the ambiguous node; this link is labeled “ambiguous” and configured in the same way as the other edges. Details about how to disambiguate these call-sites can be found in the Resolving the Ambiguous Call-Sites section of this chapter.

For call-sites which are determined to be originating from an included file, a special include-location edge will be created for this call-site, pointing to the include file. This is an indication of the origin of this call-site, but unfortunately the exact location in the include file (before the include-file was preprocessed and merged into the external procedure) can not be determined at this time.

Nodes Associated with Legacy Code¶

The nodes which are associated with legacy code will have their node-type property set to one of these ProgressParserTokenTypes constants:

INCLUDE, to identify a physical include file. For this node, the node-key property is set to filename, which in turn specifies the name of this include file, relative to project home.
EXTERNAL_PROCEDURE, to identify an existing external program, part of the legacy source code. Similar to the INCLUDE node, its node-key property is set to filename, which in turn specifies the name of this external program, relative to project home.
TABLE, to identify a table from the permanent schema. Only tables which have a defined schema trigger will have an associated node in the graph database. For this node, its node-key property is set to schemaname, which in turn specifies the full schema name of this table.
The token type for AST nodes that represent call-sites which have an associated node added to the graph database.

It's worth mentioning that there is an AMBIGUOUS “super-node” to which all call-sites determined to be ambiguous will link. This allows marking ambiguous call-sites, even if the call-site has no associated node in the graph database (thus only a link between the EXTERNAL_PROCEDURE and the AMBIGUOUS node can be added).

For these nodes, the following properties are added:

An entry-point=true property, in case the EXTERNAL_PROCEDURE node is determined to be in the root list programs.
column and line, which represents the location of this AST in the .cache file.
text, with the AST's text
type, with the AST's original token type. Do not confuse with node-type.
All the AST's annotations are copied from the AST into node properties.

Nodes Associated with External Targets¶

The external targets are nodes which have no representation in the legacy source code, but which are linked from the source code (via invocation, connection, or other mode). For each node representing an external target, an external=true property is automatically set. Following is the list of all possible external targets:

MISSING_EXTERNAL_PROCEDURE is the same as EXTERNAL_PROCEDURE, except this refers to a 4GL program which can not be found in the code-set.
PORT_TYPE represents the target of a RUN port-type SET hport ON SERVER h statement; its associated node-key is port-type, which in turn specifies the name of this WSDL port type.
LIBRARY_PROCEDURE represents the target of a RUN library-ref statement; its associated node-key is filename, which in turn specifies the name of this external library.
NATIVE_PROCEDURE represents the library name from a PROCEDURE ... EXTERNAL definition; its associated node-key is filename, which in turn specifies the name of the native library and the procedure's name (the entry-point name in the library), using the native-lib:entry-point syntax.
NATIVE_PROCESS represents the target of a process launching statement, like UNIX, BTOS, INPUT THROUGH, etc; its associated node-key is command, which in turn specifies the command to launch this native process. If the OS shell is to be launched, then the used command is hard-coded to the “OS_SHELL” string.
NETWORK_CONNECTION represents the target of a CONNECT or ENABLE-CONNECTIONS method call; its associated node-key is connect-string, which in turn specifies the method's connection string.
COM_OBJECT represents the target of an automation object being created via a CREATE automation-object statement; its associated node-key is com-target, which in turn specifies details about the created automation object.
DDE_SERVER represents the target of a DDE INITIATE statement; its associated node-key is dde-target, which in turn specifies details about the DDE server.
OCX_CONTROL represents the target of a loadControls method call; its associated node-key is filename, which in turn specifies the name of the frame control file.

Linkage Types (a.k.a. Edge Description)¶

Each time a linkage is determined between a call-site and a target, an edge is created. The properties and the property values set at the edge depend on the call-site's type. To disambiguate between more than one form of a certain 4GL statement (i.e. RUN filename and RUN library-ref) special token types were added. Each time the call-site is determined to be ambiguous and the target(s) can not be disambiguated from the hints, then the call-site is linked with the AMBIGUOUS node, with the edge configured the same way as the unambiguous case.

Depending on the call-site-key values, the possible edges are:

call-site-key value (ProgressParserTokenTypes constant)	Edge Label	Description
`INCLUDES`	`"include-location"`	Target node types: `INCLUDE` Associated 4GL statements: {<include-file>} Details: When a new external program is added to the callgraph, its associated `.pphints` file is loaded and dependencies are added between the external program and the first-level includes, between the first-level includes and second-level includes, and so on. Each edge has these properties: • `start-column` and `start-line`, identifying the start position of this include file in the generated `.cache` file. The `column` and `line` properties have the same values as these ones. • `end-column` and `end-line`, identifying the end position of this include file in the generated `.cache` file.
`RUN_FILENAME` `RUN_FILENAME_ON_SERVER` `RUN_VALUE` `RUN_VALUE_ON_SERVER`	`"RUN_EXTERNAL_PROGRAM"`	Target node types: `EXTERNAL_PROCEDURE` `MISSING_EXTERNAL_PROCEDURE` `AMBIGUOUS` Associated 4GL statements: RUN { <filename> \| VALUE (...) } [ ON SERVER ]. Details: Edges are created between each statement and the associated target. If the external program is missing, then a new `MISSING_EXTERNAL_PROCEDURE` node will be created.
`RUN_LIBRARY_REF` `RUN_LIBRARY_REF_ON_SERVER`	`"RUN_LIBRARY_REF"`	Target node types: `LIBRARY_PROCEDURE` `AMBIGUOUS` Associated 4GL statements: RUN library-ref.
`RUN_PORT_TYPE_ON_SERVER` `RUN_PORT_TYPE_VALUE_ON_SERVER`	`"RUN_PORT_TYPE"`	Target node types: `PORT_TYPE` `AMBIGUOUS` Associated 4GL statements: RUN { <port-type> \| VALUE(...) } SET h ON SERVER <web-server-handle>.
`KW_EXTERN`	`"NATIVE_PROCEDURE"`	Target node types: `NATIVE_PROCEDURE` Associated 4GL statements: PROCEDURE EXTERNAL "<library-name>".
`KW_BTOS` `KW_DOS` `KW_OS2` `KW_UNIX` `KW_VMS` `KW_OS_CMD` `KW_CTOS` `INPUT_THRU` `INPUT_OUTPUT_THRU` `OUTPUT_THRU`	`"NATIVE_PROCESS"`	Target node types: `NATIVE_PROCESS` `AMBIGUOUS` Associated 4GL statements: { BTOS \| DOS \| OS2 \| UNIX \| VMS \| OS-COMMAND \| CTOS \| INPUT THROUGH \| OUTPUT THROUGH \| INPUT-OUTPUT THROUGH } [ <command-string> \| VALUE(...) ]
`KW_TAB_TRG`	`"TABLE_TRIGGER"`	Target node types: `EXTERNAL_PROCEDURE` `MISSING_EXTERNAL_PROCEDURE` Associated 4GL statements: ADD TABLE ... TABLE-TRIGGER ... PROCEDURE "<procedure-name>". Details: A `trigger-type` property is added to the edge, holding the token type of the table's trigger.

Resolving the Ambiguous Call-Sites¶

Call-sites for which the target is determined at runtime need to be disambiguated by providing the target name(s) in the UAST hints file. This is because the source code does not have enough information to resolve the target. The hint name will be prefixed with the call-site-key and terminated with a _%d suffix, where %d is replaced with the 0-based index of this match in the .cache file. If the call-site should have no targets, then add an empty string-array hint. After they are disambiguated, each edge will be set a hint-id property, specifying which hint was used to determine the linkage between the call-site and the target; if the hint contains multiple targets, then the edge's hint-id property will end with the target index in this array.

The following table describes the hint names for the possible ambiguous call-sites; in this table, the # suffix will be replaced with a 0-based index.

call-site-key value (ProgressParserTokenTypes constant)	Hint Name
`RUN_VALUE`	`RUN_VALUE_#`
`RUN_VALUE_ON_SERVER`	`RUN_VALUE_ON_SERVER_#`
`RUN_PORT_TYPE_VALUE_ON_SERVER`	`RUN_PORT_TYPE_VALUE_ON_SERVER_#`
`KW_BTOS`	`KW_BTOS_#`
`KW_DOS`	`KW_DOS_#`
`KW_OS2`	`KW_OS2_#`
`KW_UNIX`	`KW_UNIX_#`
`KW_VMS`	`KW_VMS_#`
`KW_OS_CMD`	`KW_OS_CMD_#`
`KW_CTOS`	`KW_CTOS_#`
`INPUT_THRU`	`INPUT_THRU_#`
`INPUT_OUTPUT_THRU`	`INPUT_OUTPUT_THRU_#`
`OUTPUT_THRU`	`OUTPUT_THRU_#`
`KW_CONN`	`KW_CONN_#`
`KW_ENABLE_C`	`KW_ENABLE_C_#`
`CREATE_OBJECT`	`CREATE_OBJECT_#`
`DDE_INITIATE`	`DDE_INITIATE_#`
`KW_LOADCTRL`	`KW_LOADCTRL_#`
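The naming scheme in the table above can be sketched in Python as follows. This assumes that the 0-based index is counted separately for each call-site-key, in the order the call-sites appear in the .cache file; that counting rule is an assumption of this sketch, and the hint names emitted by the standard reports remain the authoritative source. The function name `assign_hint_names` is hypothetical.

```python
from collections import defaultdict


def assign_hint_names(call_site_keys):
    """Pair each ambiguous call-site key with a hint name, in file order.

    Assumption: the index is counted per call-site-key.
    """
    counters = defaultdict(int)
    names = []
    for key in call_site_keys:
        names.append(f"{key}_{counters[key]}")
        counters[key] += 1
    return names


if __name__ == "__main__":
    print(assign_hint_names(["RUN_VALUE", "KW_UNIX", "RUN_VALUE"]))
```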

To ease resolving the ambiguous call-sites, the hint name for each call-site is automatically provided by the standard reporting tools. See the Standard Reports → Ambiguous call-sites section of this chapter for details about how the ambiguous call-sites are reported and how to read the hint name. The code-set should be considered ambiguous as long as ambiguous call-sites are reported.

The structure of a hint file is described in the Conversion Hints chapter of the Conversion Handbook. For callgraph usage, each hint must be of string or string[] datatype, as in the following example:

<?xml version="1.0"?>

<hints>
   <!-- multiple targets for a RUN VALUE statement -->
   <uast name="RUN_VALUE_0" datatype="string[]">
      <array-val value="missing.p" />
      <array-val value="another-program.p" />
   </uast>

   <!-- single target for a RUN VALUE statement -->
   <uast name="RUN_VALUE_1" datatype="string" value="some-other-file.p"/>

   <!-- RUN VALUE statement with no targets (i.e. targets an internal procedure which
        is not part of the call-graph at this time) -->
   <uast name="RUN_VALUE_2" datatype="string[]" />

</hints>
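To check a hint file before re-running the callgraph, the targets it declares can be listed with a few lines of Python. This is only a sketch for the string and string[] hints shown above: the function name `read_callgraph_hints` is hypothetical, and other hint types described in the Conversion Hints chapter are not handled.

```python
import xml.etree.ElementTree as ET


def read_callgraph_hints(xml_text):
    """Map each hint name to its list of targets (possibly empty)."""
    hints = {}
    for uast in ET.fromstring(xml_text).findall("uast"):
        if uast.get("datatype") == "string[]":
            # Zero or more <array-val> children, one per target.
            targets = [item.get("value") for item in uast.findall("array-val")]
        else:
            # A plain string hint carries its single target in "value".
            targets = [uast.get("value")]
        hints[uast.get("name")] = targets
    return hints
```

An empty list in the result corresponds to a call-site that was explicitly declared to have no targets.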

Once hints are added for one or more ambiguous call-sites, the callgraph can be re-generated from scratch or processed in update mode. In update mode, only the ambiguous call-sites plus the explicit list of programs passed as arguments to the CallGraphGenerator tool will be processed. The syntax of this tool, when run in update mode, is:

java -classpath p2j/lib/p2j.jar com.goldencode.p2j.uast.CallGraphGenerator [-Dn] -u [root_nodes...]

where:

-Dn sets the pattern engine debug level where n is the numeric setting of the level (default is 1):

none (0)
status (1)
debug (2)
trace (3)

[root_nodes...] is one or more root node filenames to explicitly process (which must be valid relative or absolute names based on the current directory). Each filename must specify an existing AST file associated with a legacy external procedure.

The update mode will bypass the code-set and schema-processing phases, and will just apply the callgraph processing rules until no more new external programs are linked. If an existing hint is changed, the change will not be picked up by the CallGraphGenerator tool when run in update mode. In these cases, the entire call graph must be re-generated from scratch.

Standard Reports¶

The standard reports provide insight into the linkages and dependencies between the graph nodes. Unless explicitly mentioned, only the direct (i.e. first-level) linkages are reported for a call-site. Each report produces a text file whose format and contents are described in the following sections.

The callgraph/reports pipeline is the main driver of the standard reports. To run it, use the following command, while in the project home:

java -classpath p2j/build/lib/p2j.jar \
   com.goldencode.p2j.pattern.PatternEngine callgraph/reports <basepath> "*.ast"

where <basepath> needs to be replaced with the basepath parameter from p2j.cfg.xml.

All the generated files will have the lst extension and will be placed in the project home.

The first output when generating reports will be the set of all token types loaded into the graph. Statistics will be output to STDOUT, under this form:

Using these node-types for reporting:
<token-name> : <node-count>

This phase will also check the integrity of the (node-type, node-id) unique constraint. If this constraint does not hold (i.e. the node count for a certain node-type is not the same as the number of unique node-id values for this node-type), a warning will be output to STDOUT.
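The integrity check compares, for each node-type, the number of nodes with the number of distinct node-id values. A minimal Python sketch of that comparison, operating on (node-type, node-id) pairs rather than on the graph database itself (the function name `find_duplicate_ids` and the sample values are hypothetical):

```python
from collections import defaultdict


def find_duplicate_ids(nodes):
    """Return the node-types whose node count differs from the number of
    unique node-id values, i.e. those violating the constraint."""
    counts = defaultdict(int)
    unique_ids = defaultdict(set)
    for node_type, node_id in nodes:
        counts[node_type] += 1
        unique_ids[node_type].add(node_id)
    return sorted(t for t in counts if counts[t] != len(unique_ids[t]))


if __name__ == "__main__":
    # Node-type 2 holds two nodes sharing node-id 10.
    print(find_duplicate_ids([(1, 10), (1, 11), (2, 10), (2, 10)]))
```

The same node-id may legitimately appear under different node-types, which is why the counts are kept per type.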

When running the reports, the PatternEngine will walk the ASTs for each external program. For this reason and to improve performance, each report will process only ASTs with their token type in the set of token-types loaded into the graph database, avoiding unnecessary database hits for the ASTs which don't have a graph database counterpart.

View Program Dependencies¶

The generated file is named all_dependencies.lst and is generated by the callgraph/list_dependencies rule-set. It provides the list of all external procedures, which have one or more associated call-sites. The structure of this file is:

for each external procedure, show its name on a single line using a Filename: %s format, where %s is replaced with the project-relative name of this file.
each external procedure name is followed by one or more lines having this format:

<ast-id> [line:column] <include-file-name> | <node-type> | <target>

where:

<ast-id> is the call-site's AST id
the line and column represent the location of this call-site in the .cache file.
<include-file-name> is optional, and represents the name of the include file from where this call-site originates.
<node-type> is the target's node-type, as described in the Graph Structure section of this chapter.
<target> is the string representation of call-site's target, depending on the target node's type. For ambiguous call-sites, the target will represent the <hint-name>, as emitted in the ambiguous_files.lst file.
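When post-processing this report, each detail line can be split into its fields. The Python sketch below follows the format given above; the exact whitespace handling is an assumption, and the sample lines in the usage example are made up for illustration.

```python
import re

# <ast-id> [line:column] <include-file-name> | <node-type> | <target>
LINE = re.compile(
    r"^(\d+)\s+\[(\d+):(\d+)\]\s*(.*?)\s*\|\s*(\S+)\s*\|\s*(.*)$"
)


def parse_dependency_line(text):
    """Split one detail line into its fields; return None for other lines
    (such as the 'Filename: ...' headers)."""
    match = LINE.match(text.strip())
    if match is None:
        return None
    ast_id, line, column, include, node_type, target = match.groups()
    return {
        "ast_id": int(ast_id),
        "line": int(line),
        "column": int(column),
        "include": include or None,  # the include file name is optional
        "node_type": node_type,
        "target": target,
    }


if __name__ == "__main__":
    print(parse_dependency_line(
        "12884901999 [42:5] inc/common.i | EXTERNAL_PROCEDURE | ./abl/order/update.p"))
    print(parse_dependency_line("7 [3:1] | AMBIGUOUS | RUN_VALUE_0"))
```

Only the first two pipe characters are treated as separators, so a target that itself contains a pipe (for instance a shell command) is kept whole.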

Missing External Programs¶

This report is generated by the callgraph/list_missing rule-set and provides a list of all external procedures which link to missing external programs. The generated file is named missing_procedures.lst and its structure is:

for each external program having call-sites linking to missing external procedures, show its name on a single line using a Filename: %s format, where %s is replaced with the project-relative name of this file.
each external procedure name is followed by one or more lines having this format:

<ast-id> [line:column] <include-file-name> | <missing-external-procedure-name>

where:

<ast-id> is the call-site's AST id
the line and column represent the location of this call-site in the .cache file.
<include-file-name> is optional, and represents the name of the include file from where this call-site originates.

The file ends with the All missing procedures line, followed by an alphabetically sorted list of all external procedure names which could not be resolved.
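The trailing summary can be extracted without parsing the per-program sections. The sketch below is a hedged illustration: it assumes the marker line starts with the text All missing procedures and that every non-empty line after it holds exactly one procedure name. The function name and sample content are hypothetical.

```python
def missing_procedure_names(lines, marker="All missing procedures"):
    """Return the names listed after the summary marker line."""
    names = []
    in_summary = False
    for raw in lines:
        text = raw.strip()
        if in_summary:
            if text:
                names.append(text)
        elif text.startswith(marker):
            in_summary = True
    return names


# Hypothetical report content, for illustration only.
sample = [
    "Filename: src/main.p",
    "12884901999 [10:5] | lib/gone.p",
    "",
    "All missing procedures",
    "lib/gone.p",
    "lib/lost.p",
]
names = missing_procedure_names(sample)
```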

Dead Files¶

This report is generated by the callgraph/list_dead_files rule-set and provides a list of dead files, composed of the unreferenced external programs and include files. The generated file is named dead_files.lst and it starts with the sorted list of all external procedures which are neither registered as entry-points in the rootlist XML file nor passed as arguments to the CallGraphGenerator tool, and to which no call-site links.

The file ends with an Unused include files: line, followed by the list of all include files which are not referenced by any external procedure. Use the include-spec configuration parameter (which defaults to *.[iI]) to specify the shell pattern used to load all physical include files into the graph database.
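To see which file names the default include-spec pattern selects, the pattern can be tried against a few names. This sketch uses Python's fnmatch module as a stand-in for shell pattern matching. FWD's own matcher may differ in details, and the candidate file names are hypothetical.

```python
from fnmatch import fnmatchcase

INCLUDE_SPEC = "*.[iI]"  # default value of include-spec, as stated above


def is_include_file(filename, spec=INCLUDE_SPEC):
    """Case-sensitive shell-style match of a file name against the spec."""
    return fnmatchcase(filename, spec)


# Hypothetical file names, for illustration only.
candidates = ["defs.i", "SHARED.I", "main.p", "defs.inc"]
matched = [name for name in candidates if is_include_file(name)]
print(matched)
```

With the default pattern only names ending in .i or .I are selected, so include files using another extension would need a different include-spec value to be considered by the report.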

External Targets List¶

This report is generated by the callgraph/list_external_targets rule-set and provides a list of all call-sites which link to external targets, as defined in the Graph Structure section of this chapter. The generated file is named external_targets.lst and the structure of this file is:

for each external procedure which has call-sites linking to external targets, show its name on a single line using a Filename: %s format, where %s is replaced with the project-relative name of this file.
each external procedure name is followed by one or more lines having this format:

<ast-id> [line:column] <include-file-name> | <edge-label> | <call-site-key> | <external-target>

where:

<ast-id> is the call-site's AST id
the line and column represent the location of this call-site in the .cache file.
<include-file-name> is optional, and represents the name of the include file from where this call-site originates.
<edge-label> is the label of the edge from the call-site to the external-target node, as described in the Graph Structure section of this chapter.
<call-site-key> is a property set at the edge, as documented in the Graph Structure section of this chapter.
<external-target> is the target itself, e.g. a command to start a native process or a network connection string, depending on the call-site.

The file ends with the All external targets line, followed by an alphabetically sorted list of external targets by their type, with each line having as the first word the value of the node's node-key property (e.g. command, connect-string, filename).
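Because each summary line starts with the node-key value, the summary can be grouped by target type with a simple split on the first whitespace. This is a minimal sketch under that assumption. The function name and sample lines are hypothetical, and the exact separator between the key and the target is not specified here.

```python
from collections import defaultdict


def group_targets_by_key(summary_lines):
    """Group summary lines by their first word (the node-key value)."""
    groups = defaultdict(list)
    for raw in summary_lines:
        parts = raw.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        groups[key].append(value)
    return dict(groups)


# Hypothetical summary lines, for illustration only.
sample = [
    "command ls -l /tmp",
    "command sort",
    "filename /tmp/out.txt",
]
grouped = group_targets_by_key(sample)
```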

Ambiguous Programs¶

This report is generated by the callgraph/list_ambiguous rule-set and provides a list of call-sites which need to be disambiguated, as their target is determined at runtime. The disambiguation is done via hints provided in the external-procedure's hint file. The generated file is named ambiguous_files.lst and the structure of this file is:

for each external procedure having ambiguous call-sites, show its name on a single line using a Filename: %s format, where %s is replaced with the project-relative name of this file.
each external procedure name is followed by one or more lines having this format:

<ast-id> [line:column] <include-file-name> | <hint-name>

where:

<ast-id> is the call-site's AST id
the line and column represent the location of this call-site in the .cache file.
<include-file-name> is optional, and represents the name of the include file from where this call-site originates.
<hint-name> is the name of the hint which needs to be provided, to disambiguate this call-site.

The <hint-name> provided in this file can be used in the associated UAST hint file, to disambiguate this call-site, as specified in the Resolving the Ambiguous Call-Sites section of this chapter.

Schema Trigger Procedures¶

The problems related to schema trigger procedures are reported by the callgraph/verify_schema_triggers rule-set. The rule-set checks each procedure associated with a schema trigger, and each defined schema trigger, for consistency. Inconsistencies are listed in the schema_trigger_procedures.lst file.

Each problematic external program will be reported on a single line using a Filename: %s format, where %s is replaced with the project-relative name of this file. Following this line, the possible problems related to this external program are reported:

When an external procedure is defined as a schema trigger and the target table has a trigger with the specified type but not for this program, a Table <schemaname> has no <trigger-type> trigger for this program. line will be reported.
When an external procedure is defined as a schema trigger and the target table has no trigger with the specified type, a Table <schemaname> has no triggers line will be reported.
When an external procedure is not defined as a schema trigger and the procedure is registered as the target for one or more schema triggers, a Not defined as a trigger, but linked to these table triggers: line will be reported, followed by one or more <trigger-type> : <schemaname> lines, with details about the triggers which target this external program.
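The three cases above can be summarized as a small decision procedure. The sketch below is an illustration of the rules as worded in this section, not the logic of the verify_schema_triggers rule-set itself. The data model (a (table, trigger-type) pair for the procedure and a list of (table, trigger-type, program) tuples for the schema), the function name, and the table and program names are all hypothetical.

```python
def check_program(program, defined_trigger, table_triggers):
    """Return the report lines for one external program.

    defined_trigger: (table, trigger_type) if the program is defined as a
                     schema trigger, otherwise None.
    table_triggers:  list of (table, trigger_type, program) tuples taken
                     from the schema.
    """
    messages = []
    if defined_trigger is not None:
        table, trigger_type = defined_trigger
        same_type = [p for (t, ty, p) in table_triggers
                     if t == table and ty == trigger_type]
        if not same_type:
            # The table has no trigger of the specified type.
            messages.append(f"Table {table} has no triggers")
        elif program not in same_type:
            # A trigger of that type exists, but targets another program.
            messages.append(
                f"Table {table} has no {trigger_type} trigger for this program.")
    else:
        linked = [(ty, t) for (t, ty, p) in table_triggers if p == program]
        if linked:
            messages.append(
                "Not defined as a trigger, but linked to these table triggers:")
            messages.extend(f"{ty} : {t}" for ty, t in linked)
    return messages


# Hypothetical schema triggers, for illustration only.
triggers = [
    ("customer", "WRITE", "wcust.p"),
    ("order", "DELETE", "helper.p"),
]
```

A program which is defined as a trigger and is also the registered target of a matching table trigger yields no lines, so it would not appear in the report.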
