Stimulus Structures and Mental Representations in Expert Comprehension of Computer Programs
By: Nancy Pennington
Presented by: Michael John Decker

Author & Journal
• Nancy Pennington - No internet presence
• Journal: Cognitive Psychology - Vol. 19, pp. 295-341 (1987)
• 2013 JIF: 3.571
• Paper Citations: 525

Research Question
• Which mental model best describes how an expert programmer builds up knowledge of a program?
  • When studying the code?
  • When performing maintenance activities?

Motivation
• Studying program comprehension means studying the role of particular kinds of knowledge in cognitive skill domains
• An estimated 50% of professional programmers' time is spent on program maintenance
  • Comprehension is a significant part of maintenance
• A better understanding of how knowledge is obtained and applied can be used to raise productivity and lower maintenance costs

Computer Program Text Abstractions
• Goal hierarchy - functional achievements of the program
• Data flow - transformations applied to data objects
• Control flow - sequences of program actions and the passage of control between them
• Conditionalized actions - representation specifying states of the program and the actions they invoke

Programming Knowledge
• Text Structure Knowledge
• Plan Knowledge

Text Structure Knowledge
• Decomposition of a program's text into text structure units
• Segmentation of statements into phrase-like groupings that are combined into higher-order groupings
• Syntactic markings provide clues to the boundaries of segments
• Segmentation reflects the control structure of the program
• Relation to abstractions:
  • control flow information - easy to obtain
  • data flow and goal hierarchy - difficult to obtain when they span text structure units
  • conditionalized actions - difficult to obtain

Plan Knowledge
• Patterns of program instructions that combine to accomplish some function
  • e.g. bubble-sort, swap
• Recognition of patterns implementing known plans
• Plans are activated by partial pattern matching
• The resulting representation reflects the data flow structure of the program, indexed by program functions
• Relation to abstractions:
  • data flow and goal hierarchy - easy to obtain
  • control flow and conditionalized actions - difficult to obtain

Study One: Details
• 80 expert programmers (40 COBOL, 40 FORTRAN)
• 8 program segments: 15 lines each, comprehensible, and with distinct Text Structure and Plan units
• 6 comprehension questions per segment that tested knowledge related to specific abstractions
• 22-item recognition test list (4 targets per segment, separated into two sets)
  • constructed from triples consisting of a target item, a Text Structure prime, and a Plan prime
• Responses and response times were collected

Study One: Design
• 2 (language) x 4 (orders) x 2 (subject groups within language) x 2 (prime types) x 2 (sets of target items)
• Language, order of presentation, and subject group are between-participant factors; subject group, prime type, and target-item set form a 2 x 2 x 2 Latin square

Study One Results: Recognition
[Figure: recognition results for Text Structure (TS) and Plan primes, COBOL and FORTRAN, shown by target set (1, 2) and subject group (1, 2); vertical axis ticks run from 2.1 to 3]
• Response times were consistently quicker for Text Structure units than for Plan units
• Accuracy was not significantly different

Study One Results: Comprehension
[Figure: comprehension results (10% to 50%) by question category (Operations, Control Flow, Data Flow, State, Function) for FORTRAN and COBOL]
• Error rates were lowest for program operations and control flow (Text Structure Knowledge)
• FORTRAN programmers did better on control flow and COBOL programmers did better on data flow

Study Two: Details
• 40 expert programmers (20 COBOL, 20 FORTRAN): the best and worst performers from the previous study
• 200-line program that computes specifications for industrial plant designs
• Part I: 45 min. spent studying the program, followed by a summary and comprehension questions
• Part II: 30 min. modification activity, followed by a summary and comprehension questions
• 40 questions (10 control flow, 10 data flow, 10 goal hierarchy, 10 conditionalized actions)
  • Divided into two matching sets of 20 (question correspondence between sets)

Study Two: Design
• 2 (language) x 2 (previous performance) x 2 (talk/no talk) x 2 (comprehension tests) x 4 (question categories)
• Language, previous performance, and talk/no talk are between-participant factors; tests and question categories are within-participant factors

Study Two Part I Results: Comprehension
[Figure: "Information after Study Session", results (0% to 60%) by question category (Control Flow, Data Flow, State, Function)]
• Results comparable to the first study
• Control flow has the lowest error rate, indicating a Text Structure model

Study Two Part I Results: Summaries
• Classified by type: procedural, data flow, function statements
• 57% procedural, 30% data flow, 13% function
• Indicates Text Structure Knowledge

Study Two Part II Results: Comprehension
[Figure: two panels, "Information after Modification Session, No Talk" and "Information after Modification Session, Talk", results (0% to 60%) by question category (Control Flow, Data Flow, State, Function)]
• Data flow and function have the lowest error rates, indicating Plan Knowledge
• Participants who were allowed to talk show a larger disparity and a stronger trend toward Plan Knowledge

Conclusions
• Text Structure Knowledge: plays the initial organizing role in memory for programs (Study One, Study Two Part I)
• Plan Knowledge: comes into play in later stages of program comprehension under appropriate task conditions

Future Work
• Find evidence for why and how the shift between Text Structure Knowledge and Plan Knowledge occurs
  • Speculated explanation: a situation model
• Evidence in these studies comes from post-task questions and analysis; more recent technology (eye-tracking, EEG) could possibly be used to measure and identify how the mental model is built
• Look further at the effect of language on the mental model

Ending Remarks
• Overall, well-thought-out studies that controlled for a number of variables (e.g. language)
• The finding that the initial model differs from the later model indicates a task-related shift, which sets the stage for later research
• The second study lacked data on summaries after the modification task and a corresponding comparison
• A 200-line program is no longer of moderate length (if it ever was) and is still easy to understand, especially in the time allotted. How would this apply to a moderate-length project now?

References
• [Pennington'87] Pennington, N. (1987), "Stimulus Structures and Mental Representations in Expert Comprehension of Computer Programs", Cognitive Psychology, vol. 19, pp. 295-341.
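
Appendix: Illustrative Example
The four abstractions on the "Computer Program Text Abstractions" slide, and the bubble-sort and swap plans named on the "Plan Knowledge" slide, can be made concrete with a small annotated program. The sketch below is not taken from Pennington's study materials (those were COBOL and FORTRAN segments); the function names and the comments mapping lines to abstractions are assumptions made for illustration only.

```python
def swap(items, i, j):
    """Plan knowledge: the familiar 'swap' plan, exchanging two elements."""
    items[i], items[j] = items[j], items[i]


def bubble_sort(values):
    """Goal hierarchy: the top-level goal is 'return the values in ascending order'."""
    # Data flow: the input is copied into `result`, which is then transformed.
    result = list(values)
    # Control flow: the outer loop makes one pass per position to be settled.
    for end in range(len(result) - 1, 0, -1):
        # Control flow: the inner loop scans the still-unsorted prefix.
        for i in range(end):
            # Conditionalized action: in the state "adjacent pair out of
            # order", the action "swap the pair" is invoked.
            if result[i] > result[i + 1]:
                swap(result, i, i + 1)
    return result


if __name__ == "__main__":
    print(bubble_sort([5, 2, 4, 1]))
```

Reading the loop nesting and indentation yields the control flow directly, which corresponds to the Text Structure view. Recognizing the compare-and-swap pattern as a sorting plan yields the function and data flow, which corresponds to the Plan view.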
