SQLFlow: PL/SQL Multi-Diagrammatic Source Code Visualization

Samir Tartir
Department of Computer Science, University of Georgia
Athens, Georgia 30602, USA
Email: [email protected]

Ayman Issa
University of the West of England (UWE)
Coldharbour Lane, Frenchay, Bristol BS16 1QY, United Kingdom
Email: [email protected]

ABSTRACT: A major problem in software maintenance is the lack of well-documented source code in software applications. This has led to serious difficulties in software maintenance and evolution, in particular for developers who are faced with the task of fixing or modifying a piece of code they never even knew existed. Database triggers and procedures, which are parts of almost every application, are typical examples of badly documented software components, being the hidden back end of software applications. Source code visualization is one of the techniques most heavily used in the literature to familiarize software developers with new source code and to compensate for the knowledge-sharing problem. Therefore, a new PL/SQL flowcharting reverse engineering tool, SQLFlow, has been designed and developed. SQLFlow is a three-tier architecture tool that surpasses the currently available flowcharting tools in its multi-diagrammatic and source code metrics extraction capabilities. Finally, future work is planned to integrate SQLFlow with the UML modelling standards in the software industry.

KEYWORDS: PL/SQL, Flowchart, Reverse engineering, legacy systems.

1. INTRODUCTION

Databases (DBs) form a part of almost every computer application that is currently in use [11]. A huge effort is usually put into designing and building a database and its underlying stored packages, procedures, and triggers. As a part of a larger system, databases are usually called the "back end" of the system, and this has usually led to insufficient attention being paid to documenting the code stored within them, in contrast to "front end" code, which is usually better documented [10].

In many cases, there is no documentation of the project source code, and the people who originally wrote the code no longer work with the company. This situation might cause problems for programmers who are faced with the task of fixing or modifying database code they never even knew existed.

The result of such a situation may be the introduction of more problems than the ones the change was intended to fix, or the introduction of new problems into other code that was working perfectly before. Therefore, educating programmers about the behavior of the target code before they start modifying it is not an easy task. Source code visualization is one of the techniques most heavily used in the literature to overcome this lack of documentation and the consequent knowledge-sharing problem [7, 9].

Therefore, this research aims at investigating the current database source code visualization tools and identifying the main open issues for research. Consequently, the widely used PL/SQL database language has been selected as the target of a flowcharting reverse engineering prototype tool called "SQLFlow". SQLFlow has been designed, implemented, and tested to parse the target PL/SQL code stored in database procedures, analyze its structure, and render its flow in a visual flowchart.

Section 2 presents a brief summary of the most well-known flowcharting tools. Section 3 discusses the high-level architecture and the underlying components of SQLFlow. Section 4 details the validation and verification of SQLFlow. Section 5 contrasts the features of SQLFlow against the available literature. Finally, conclusions and an outline of future research work are presented in Section 6.

2. COMMERCIAL FLOWCHARTING TOOLS

On the professional level, Code Visual to Flowchart and Visustin [1, 3] are the most well-known and commonly used commercial flowcharting tools in the literature. Both tools support source code visualization for several programming languages, such as C, C++, C#, VB, Java, and many more. However, they share the same limitation: they support very few DB scripting languages compared to the number of high-level programming languages they support.

The underlying operational principle of both tools is identical. They rely on the user to specify the physical location of the target source code to analyze and visualize. However, they differ in the way they present their output and in how they integrate with external systems such as Visio and MS Word [1, 3]. In this respect, Visustin appears to offer stronger integration with external systems.


Unfortunately, both tools concentrate on flowchart-based program visualization and ignore the other useful diagrams that might be of interest to users in the different stages of the software development life cycle (SDLC). Moreover, they also disregard the role of several complexity metrics that could be extracted from the source code, which are highly beneficial to the different stakeholders in the different stages of the SDLC.

On the other hand, several academic attempts [7] have been cited in the literature as prototype tools that support more types of diagrams, but not metrics. Kita et al. [7] is a typical example of a prototype tool that aims at educating novice Java programmers by generating flow paths for their target Java files.

This raises the need for a new source code visualization tool that overcomes the above major drawbacks of the most well-known flowcharting tools. Correspondingly, SQLFlow has been designed to generate not only the flowchart of the target Oracle PL/SQL source code, but also its flow graph. In addition, several metrics are extracted from the source code to facilitate the subsequent analysis and testing phases.

3. SYSTEM ARCHITECTURE AND COMPONENTS

SQLFlow is a lightweight MS Visual Basic 6.0 tool designed to read an input file extracted from an Oracle PL/SQL database procedure and generate the corresponding flowchart and flow graph that reflect its operational flow.

A model-view-controller (MVC) three-tier architecture [10] has been adopted in the SQLFlow implementation to separate the concerns of the different system components. The first tier, the view tier, is represented by a graphical user interface (GUI) Builder component that is responsible for analyzing the results of parsing the source code and generating the corresponding diagrams. The middle tier, the controller tier, holds the parsing and analyzing business logic component. The last tier, the persistence model tier, is represented by the file management component. Figure 1 depicts the system architecture. Sections 3.1 and 3.2 discuss the details of the controller and view components, respectively.

Figure 1: SQLFlow Architecture.

3.1 SOURCE CODE PARSING COMPONENT

The main responsibility of the parser is to find the lexical tokens of the source code and identify the tokens that change the flow of control. Therefore, it goes through the input PL/SQL source code file, line by line, and identifies the keywords that determine its execution flow. For example, it recognizes a "Begin" keyword as the start of a new block. Similarly, an "If" keyword starts a branching statement that should be closed by an "End If" or "Else" keyword.

Currently, three types of statements are recognized: sequential, conditional, and repetition. While accessing each token, the parser dynamically invokes the GUI builder component, which handles whatever is required to reflect the token's action on the diagram. Figure 2 presents the pseudo code of the parser component's workflow.

    For each line in the input file
      Read current word;
      Case Word
        Case "--": single-line comment: ignore;
        Case "/*": multiple-line comment: find closing and ignore;
        Case "BEGIN": Signal a new program block;
        Case "DECLARE": ignore and step to the next "BEGIN";
        Case "EXCEPTION": Signal an exception block;
        Case "ELSE": Signal else;
        Case "IF": Find the corresponding "THEN" and signal an "IF" block;
        Case "LOOP": Signal a "LOOP" block;
        Case "FOR": Find the corresponding "LOOP" statement and signal a "FOR" block;
        Case "WHILE": Find the corresponding "LOOP" statement and signal a "WHILE" block;
        Case "END": Get next word
          Case ";": Close last "BEGIN" block;
          Case "LOOP": Close last "LOOP", "FOR", or "WHILE" block;
          Case "IF": Close last "IF" block;
        Case "WHEN": Find the corresponding "THEN" and signal a "WHEN" block;
        Case "RAISE": Find the corresponding ";" and signal a statement;
        Otherwise ("one-word;", "SQL", "one-word...;", or "...:=...;"): Get next word
          Case SQL: SQL statement: find the corresponding ";" and signal a statement;
          Case Assignment: Assignment statement: signal an assignment statement;
          Otherwise: Procedure call: signal a procedure call statement;

Figure 2: Parser Component Pseudo Code.
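The keyword dispatch of Figure 2 can be illustrated with a short Python sketch. This is not the SQLFlow implementation (which is written in Visual Basic 6.0); the function names, the signal names, and the `SQL_WORDS` set are assumptions made for illustration. The sketch works on whitespace-separated tokens, ignores string literals that contain ";" or comment markers, and does not handle keywords outside Figure 2 (for example ELSIF).

```python
import re
from typing import Iterator, List, Tuple

# Hypothetical set of SQL statement starters; the paper does not list them.
SQL_WORDS = {"SELECT", "INSERT", "UPDATE", "DELETE", "COMMIT", "ROLLBACK"}


def strip_comments(source: str) -> str:
    """Drop /* ... */ and -- comments (string literals are not considered)."""
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.S)
    return re.sub(r"--[^\n]*", " ", source)


def take_until(tokens: List[str], start: int, stop: str) -> Tuple[str, int]:
    """Join tokens from `start` up to `stop`; return text and index past `stop`."""
    end = start
    while end < len(tokens) and tokens[end].upper() != stop:
        end += 1
    return " ".join(tokens[start:end]), end + 1


def parse_signals(source: str) -> Iterator[Tuple[str, str]]:
    """Yield (signal, text) pairs following the cases of Figure 2."""
    tokens = re.findall(r";|[^\s;]+", strip_comments(source))
    i = 0
    while i < len(tokens):
        word = tokens[i].upper()
        if word == "DECLARE":
            # Ignore declarations and step to the next BEGIN.
            while i < len(tokens) and tokens[i].upper() != "BEGIN":
                i += 1
        elif word in ("BEGIN", "EXCEPTION", "ELSE", "LOOP"):
            i += 1
            yield word, ""
        elif word in ("IF", "WHEN"):
            text, i = take_until(tokens, i + 1, "THEN")
            yield word, text
        elif word in ("FOR", "WHILE"):
            text, i = take_until(tokens, i + 1, "LOOP")
            yield word, text
        elif word == "END":
            nxt = tokens[i + 1].upper() if i + 1 < len(tokens) else ";"
            kind = {"LOOP": "END LOOP", "IF": "END IF"}.get(nxt, "END")
            _, i = take_until(tokens, i, ";")
            yield kind, ""
        elif word == ";":
            i += 1  # stray terminator
        else:
            text, i = take_until(tokens, i, ";")
            if word == "RAISE":
                yield "RAISE", text
            elif word in SQL_WORDS:
                yield "SQL", text
            elif ":=" in text:
                yield "ASSIGNMENT", text
            else:
                yield "CALL", text
```

Because `parse_signals` is a generator, a consumer such as a diagram builder can react to each signal as soon as it is produced, which mirrors the way the parser is described as invoking the GUI builder while it reads the source.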

3.2 GUI BUILDER COMPONENT

The main responsibility of the GUI builder component is to handle the actual generation of the flowchart and flow graph diagrams. It "listens" to the signals sent by the parser and processes each signal appropriately to draw the shapes that reflect the action of the signal. For example, if the signal is for a loop, the drawer draws the loop parallelogram shape and keeps a note of an open loop that needs an "end loop;" statement to close it. Figure 3 summarizes the parser signals caught by the GUI builder and its corresponding reactions.

    "BEGIN":       Draw a shape with the word "Begin".
    "EXCEPTION":   Draw a rectangle with the word "Exception".
    "IF":          Draw a diamond containing the condition.
    "ELSE":        Draw another branch for the last "IF" diamond and move
                   drawing to that branch.
    "LOOP":        Draw a parallelogram with the word "Loop".
    "FOR":         Draw a parallelogram with the loop condition.
    "WHILE":       Draw a parallelogram with the loop condition.
    "END":         Draw a rectangle that closes the last "BEGIN" and draw a
                   line between the "BEGIN" and "END" shapes.
    "END LOOP":    Draw a rectangle that closes the last "LOOP" and draw a
                   line between the "LOOP" parallelogram and the "END"
                   rectangle.
    "END IF":      Draw a rectangle that closes the last "IF" and draw a line
                   between the "IF" diamond and the "END" rectangle.
    "WHEN":        Draw a diamond containing the condition.
    "RAISE":       Draw a rectangle with the raise statement.
    "SQL":         Draw a rectangle with the SQL statement.
    "Assignment":  Draw a rectangle with the assignment statement.
    "CALL":        Draw a rectangle with the call.

Figure 3: GUI Builder Caught Signals and its Corresponding Reaction.

Hence, the GUI Builder builds the required diagrams dynamically, showing each shape as it is added to the diagram. This makes it easier to follow the progress of the parsing process and to trace the generated diagrams back to their source code fragments.

4. SQLFLOW VALIDATION AND EVALUATION

Several DB applications have been targeted to validate and verify SQLFlow. An iterative validation and verification process has been adopted to evaluate the generated flowcharts, flow graphs, and source code metrics. The intention is to prioritize SQLFlow functionalities so that the core functionality, source code parsing and analysis, is delivered first. The supporting functionalities, GUI and file management, are then delivered in subsequent iterations. Further, focusing on validating and verifying one component at a time has resulted in cleaner and more structured source code, which facilitates future reuse and integration with other systems. In addition, functionality delivered in earlier iterations tends to receive more testing with every new system version [5].

Figures 4, 5, and 6 present a sample PL/SQL source code and its corresponding flowchart and flow graph, respectively, generated using SQLFlow. The top right corner of the flowchart and flow graph diagrams is allocated to the resulting source code metrics. The flowchart-related metrics [4] are: maximum nesting depth, total number of SQL statements, total number of conditional statements, and total number of repetition statements. The flow graph metrics are: cyclomatic complexity [8], number of regions, and number of predicates.

A typical SDLC consists of five main phases: requirements analysis, design, implementation, testing, and deployment. The extracted source code metrics are intended to serve as historical data to guide and inform current and future software development projects. Current projects could utilize the extracted metrics, e.g. cyclomatic complexity and the number of conditional statements, to inform the upcoming testing, deployment, and maintenance phases, while future projects could combine the extracted metrics with effort and productivity information for reuse in estimating new software development projects.

5. SQLFLOW FEATURES COMPARISON

Comparing SQLFlow features to the available commercial and academic flowcharting tools [1, 3, 7], it has been found that SQLFlow surpasses them from several perspectives. In particular, the multi-diagrammatic and metrics extraction features appear to be unique to SQLFlow. However, further development is needed to extend SQLFlow scalability to handle large volumes of DB triggers and procedures. Table 1 summarizes the results of contrasting the different flowcharting tools with SQLFlow.
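The flow graph metrics named in Section 4 can be illustrated with McCabe's formula [8], V(G) = E - N + 2 for a connected flow graph with a single entry and a single exit. The Python sketch below is an illustration only and is not taken from SQLFlow; the function name, the adjacency-list representation, and the node labels are assumptions. For a planar flow graph the number of regions equals V(G), and when every predicate node is a two-way decision V(G) also equals the number of predicates plus one.

```python
from typing import Dict, List


def flow_graph_metrics(graph: Dict[str, List[str]]) -> Dict[str, int]:
    """Compute flow graph metrics from an adjacency list (node -> successors).

    Assumes a connected, planar graph with one entry and one exit node.
    """
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)  # include nodes that have no outgoing edge
    edges = sum(len(successors) for successors in graph.values())
    # A predicate node is a node from which control can branch.
    predicates = sum(1 for successors in graph.values() if len(successors) >= 2)
    cyclomatic = edges - len(nodes) + 2  # McCabe: V(G) = E - N + 2
    return {
        "cyclomatic_complexity": cyclomatic,
        "regions": cyclomatic,  # equal to V(G) under the planarity assumption
        "predicates": predicates,
    }


# Hypothetical flow graph: an IF/ELSE followed by a loop.
EXAMPLE = {
    "start": ["if"],
    "if": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["loop"],
    "loop": ["body", "end"],
    "body": ["loop"],
}
```

For `EXAMPLE`, the graph has 8 nodes and 9 edges, so V(G) = 9 - 8 + 2 = 3, which agrees with the two predicate nodes ("if" and "loop") plus one.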

    declare
      a number; b number; c number; d number; e number;
      f number; g number; h number; i number; j number;
    begin
      Proc1;
      Proc2(a, b, c);
      raise form_trigger_failure;
      if a=1 or(b=2 and(c=3 and(d=4 or(e=5 and f=6)or g=7)and h=8)or(i=9 and j=10)) then
        message('Condition met');
      else
        message('Condition Failed');
      end if;

      FOR K IN 1..54 LOOP
        MESSAGE(I);
      END LOOP;
    end;

Figure 4: Sample PL/SQL Source Code.

Figure 6: Sample PL/SQL Source Code Flow Graph.

Table 1: Flowcharting Tools Features Comparison.

    Feature                                Code Visual    Visustin   SQLFlow
                                           to Flowchart
    -------------------------------------  -------------  ---------  -------
    Support Function and Procedure Call    No             Yes        Yes
    Provide Source Code Metrics            No             No         Yes
    Support Error Detection                No             No         Yes
    Operating System Limitations           No             Yes        Yes
    Performance                            High           High       High
    Generated Diagrams Clarity             High           High       High
    Support Multiple Diagrams Generation   No             No         Yes
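The flowchart-related metrics listed in Section 4 could, in principle, be derived from the stream of parser signals shown in Figure 3. The Python sketch below is one possible way to do so and is not SQLFlow's actual method. The function name is hypothetical, and two choices are assumptions: nesting depth counts only open IF and loop blocks (not BEGIN blocks), and both "IF" and "WHEN" signals count as conditional statements because both are drawn as diamonds.

```python
from collections import Counter
from typing import Dict, Iterable

# Signals that open or close a nested control structure (assumed grouping).
OPENERS = {"IF", "LOOP", "FOR", "WHILE"}
CLOSERS = {"END IF", "END LOOP"}


def flowchart_metrics(signals: Iterable[str]) -> Dict[str, int]:
    """Count statement kinds and track the deepest IF/loop nesting."""
    counts: Counter = Counter()
    depth = max_depth = 0
    for kind in signals:
        counts[kind] += 1
        if kind in OPENERS:
            depth += 1
            max_depth = max(max_depth, depth)
        elif kind in CLOSERS:
            depth = max(depth - 1, 0)  # tolerate unbalanced input
    return {
        "max_nesting_depth": max_depth,
        "sql_statements": counts["SQL"],
        "conditional_statements": counts["IF"] + counts["WHEN"],
        "repetition_statements": counts["LOOP"] + counts["FOR"] + counts["WHILE"],
    }
```

Under these assumptions, the signal sequence BEGIN, IF, FOR, SQL, END LOOP, END IF, END would give a maximum nesting depth of 2, one SQL statement, one conditional statement, and one repetition statement.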

6. CONCLUSION AND FUTURE WORK

A new PL/SQL flowcharting tool, SQLFlow, has been designed, built, and presented. It aims at facilitating knowledge sharing between the members of a software development team. It also eases the burden of understanding and maintaining existing legacy systems.

SQLFlow is a lightweight three-tier architecture tool that separates the concerns of its diverse components. Contrasting SQLFlow against other professional and academic flowcharting tools, it has been concluded that SQLFlow surpasses them through its multi-diagrammatic and metrics extraction features. However, further development is needed to extend SQLFlow scalability.

Figure 5: PL/SQL Source Code Sample Flowchart.

Future work is planned to upgrade SQLFlow to visualize the source code using standard UML [2] activity diagrams. This paves the way to integrating SQLFlow into a UML design tool, such as Rational Rose, making the PL/SQL code a part of the formal model of the application.

Finally, applying SQLFlow to several legacy DB systems will result in building a large-volume, multi-purpose software metrics repository that could be utilized to inform the diverse SDLC phases [6], e.g. the cost estimation and planning phase, of prospective software development projects.

7. ACKNOWLEDGEMENT

Samir Tartir would like to express his gratitude to Prof. Talib Al-Sari for his useful feedback during SQLFlow development.

REFERENCES

[1] Aivosto, (2003). Visustin Flow Chart Generator [online]. Aivosto.com. Available from: http://www.aivosto.com/visustin.html [Accessed 21-5-2005].
[2] Booch, G., Rumbaugh, J. and Jacobson, I., (1999). The Unified Modeling Language User Guide. Reading, Mass. Harlow: Addison-Wesley.
[3] Fatesoft, (1997). Code Visual to Flowchart [online]. USA. Available from: http://www.fatesoft.com/s2f/ [Accessed 21-5-2005].
[4] Fenton, N. and Neil, M., (2000). Software Metrics: Roadmap. in International Conference on Software Engineering. New York, NY, USA: ACM Press, pp.357-370.
[5] Harrold, M., (2000). Testing: a Roadmap. in International Conference on Software Engineering. New York, NY, USA: IEEE-CS: Computer Society, pp.61-72.
[6] Issa, A., Odeh, M., and Coward, D., (2005). Using Use Case Models to Generate Object Points. in Kokol, P. Proceedings of the IASTED International Conference on Software Engineering. Austria: ACTA Press, pp.468-473.
[7] Kita, Y., Kawasoe, T., and Katayama, T., (2005). Prototype Of An Automatic Visualization Tool For Java To Educate Novice Programmers. in Kokol, P. Proceedings of the IASTED International Conference on Software Engineering. Austria: ACTA Press, pp.307-312.
[8] McCabe, T., (1976). A Complexity Measure. IEEE Transactions on Software Engineering, 2 (4), pp.308-320.
[9] Quatrani, T., (2000). Visual Modeling With Rational Rose 2000 and UML. Rev. ed. Boston, Mass. London: Addison Wesley.
[10] Sommerville, I., (2001). Software Engineering. 6th ed. Harlow, England; New York: Addison-Wesley.
[11] Sommerville, I. and Kotonya, G., (1998). Requirements Engineering: Processes and Techniques. Chichester, New York: J. Wiley.
