Decentralized Software Architecture Discovery in Distributed Systems

4400 University Drive MS#4A5 Department of Computer Science Fairfax, VA 22030-4444 USA George Mason University Technical Reports http://cs.gmu.edu/ 703-993-1530 Decentralized Software Architecture Discovery in Distributed Systems Jason Porter Daniel A. Menascé Hassan Gomaa [email protected] [email protected] [email protected] Technical Report GMU-CS-TR-2016-2 Abstract and composition. The architecture helps in the under- standing of complex systems, supports reuse at both the Software architecture discovery plays an increasingly component and architectural level, indicates the major important role in the evolution, maintenance, and run- components to be developed and their relationships and time self-adaptation of modern software systems whose constraints, exposes changeability of the system as well architecture may have become outdated or may not have as allowing the verification and validation of the target previously existed. However, current approaches to system at a high level [22, 56]. architecture discovery take a centralized approach, in An area in which software architecture has been very which the process is carried out from a single location. influential is that of self-adaptive systems, i.e., systems This proves inadequate in the case of large distributed that are capable of self-configuration, self-optimization, systems which, due to size, consist of nodes that are self-healing and self-protection, also called self-* or disparately located and are highly dynamic in nature. autonomic systems [25]. In architecture-based self- This report presents DeSARM: Decentralized Software adaptation, components dynamically change in order to Architecture discoveRy Mechanism, a completely de- continuously adhere to architectural specifications and centralized and automated approach for software archi- system goals. This approach has proven to be very pop- tecture discovery of distributed systems based on gos- ular and many groups have used it as the foundation siping and message tracing. Through message tracing, for their work [59]. In a previous work [37], Menasce´ et the technique is able to identify important architectural al. developed a framework for a self-architecting soft- characteristics such as components and connectors, in ware system (SASSY), which was designed to automate addition to synchronous and asynchronous communi- the architectural decision making process for service- cation patterns. Furthermore, through its use of gossip- oriented systems in the face of quality-of-service (QoS) ing, it exhibits the properties of scalability, global consis- trade-offs. The framework automatically generates, at tency among participating nodes, self-organization, and runtime, candidate software architectures and selects the resiliency to failures. The report discusses DeSARM’s one that best serves stakeholder-defined scenario-based architecture and detailed design and demonstrates its QoS goals [14, 35]. Adaptation decisions are made based properties through an analysis of small and large-scale on changes in the system’s operational environment that experiments. affect these goals [19, 18]. A new architecture is then generated and the system is reconfigured to a new QoS optimized state. 1 Introduction Whereas with SASSY the current (i.e., before adaptation) architecture is assumed to be known, this research Software architecture—the high level structures of a soft- considers the case in which the architecture is unknown ware system including a collection of components, con- and needs to be discovered at run-time. It is not uncom- nectors and constraints—plays an increasingly critical mon for the architecture to be unknown in large-scale role in the design and development of any large com- distributed systems because the system’s structure is of- plex software system. These artifacts are needed to rea- ten dynamic due to churn, where nodes randomly join son about the system. In general, software architecture and leave the network and may fail. acts as a bridge between requirements and implemen- Software architecture erosion may occur when the tation and provides a blueprint for system construction prescriptive or intended software architecture departs 1 from the descriptive or implemented software architec- tion 4 describes the architecture of DeSARM, our archi- ture due to the system’s evolution over time [10]. As tecture discovery method. Section 5 presents the details one can imagine, such discrepancies can affect adapta- of running experiments with DeSARM. Finally, Section 6 tion decisions at run-time where complete knowledge presents some concluding remarks and discusses possi- of the current architecture is imperative. Besides ero- ble future work. sion, another factor that would require the retrieval of the software architecture at run-time is when there is no prescriptive architecture available for the system. This 2 Related Work often occurs when the system is either developed with- out an explicit architecture or design documents may The columns of Table 1 categorize software architec- be lost [22]. Such issues are a motivation for software ture discovery techniques into either centralized or dis- architecture discovery. Software architecture discovery tributed and the rows categorize software applications as involves the methods, techniques and processes used to either centralized or distributed. A centralized architec- uncover a software system’s architecture from available ture discovery method is one in which the discovery pro- information [38]. This is the focus of this report. cess, which includes data gathering and processing, is Currently, most architecture discovery techniques rely done from a single location. Conversely, in a distributed on input obtained from implementation level artifacts discovery method, data gathering and processing is ac- to reconstruct the software architecture [39]. Software complished across multiple locations. We discuss here maintenance and evolution requires architecture discov- prior work that uses centralized discovery methods ap- ery when the original architecture has eroded [16]. In our plied to both centralized and distributed applications, work we do architecture discovery for architecture-based as well as distributed discovery methods applied to dis- adaptation purposes and perform discovery through a tributed applications, which is the focus of our report. decentralized analysis of message flows between com- Clearly, a distributed method applied to a centralized ponents in a distributed system. application is not applicable. A software architecture has structure and behavior, where structure captures components and how Architecture Discovery Method they are connected, and behavior captures interactions Application Centralized Distributed among components. This report focuses on software Centralized N/A architecture structure discovery but also addresses be- Distributed havioral discovery of the architecture such as inter- component communication patterns (i.e., synchronous, Table 1: application structure and architecture discovery asynchronous, single or multiple destination). We re- method cover the software architecture in a decentralized man- ner by keeping message logs in each node and dissemi- Software architecture discovery approaches can be fur- nating message interaction information between nodes ther classified into dynamic, static, and hybrid, i.e., com- through the use of gossip exchanges. Once convergence bining both dynamic and static analysis, as described in is achieved, each component will have a global view of what follows. the architecture. Israr et al. [23] describe SAMEtech, a dynamic ap- This report makes the following key contributions: proach for automating the discovery of architecture models and layered performance models from message trace 1. DeSARM: Decentralized Software Architecture dis- information. DiscoTect [53] uses a set of pattern recog- coveRy Mechanism, a completely decentralized and nizers and knowledge of the architectural style being automated method for software architecture discov- implemented to map low-level system events into high- ery of distributed systems using gossiping and mes- level architecturally meaningful events. Bojic and Vela- sage tracing. The information dissemination and sevic [6] use test cases that cover relevant use-cases, and fast convergence capability of gossiping aids each concept analysis to group system entities together that component in deriving a view of the architecture. implement similar functionality. Vasconcelos et al. [58] 2. Demonstration that the software architecture dis- use specified use-cases to generate execution traces from covery mechanism exhibits the following properties: which interaction patterns are identified using pattern self-healing, self-organizing, global consistency, and detection in order to define architectural elements. Es- scalability. These properties relate to the architec- fahani et al [13] take a dynamic approach to recovering ture discovery process not to the failure recovery of the architectural model of a distributed application by the application system. generating a component interaction model using data mining. The approach works by collecting system ex- The rest of this report is organized as follows. Section 2 ecution traces at run-time, then uses association rule discusses related work. Section 3 provides a problem mining to infer a probabilistic model of the component description along with some basic assumptions. Sec- interactions taking place. 2 A number of

Load more