Data Engineering for Data Analytics: a Classification of the Issues, And

Total Page:16

File Type:pdf, Size:1020Kb

Data Engineering for Data Analytics: a Classification of the Issues, And Data Engineering for Data Analytics: A Classification of the Issues, and Case Studies Alfredo Nazabal1, Christopher K.I. Williams12, Giovanni Colavizza3∗ , Camila Rangel Smith1, and Angus Williams* 1The Alan Turing Institute, London, UK 2School of Informatics, University of Edinburgh, UK 3Media Studies Department, University of Amsterdam, The Netherlands April 28, 2020 Abstract approaches; observations can be considered anomalous according to different criteria; or features can be trans- Consider the situation where a data analyst wishes to formed in different ways. carry out an analysis on a given dataset. It is widely Furthermore, most publicly available datasets have al- recognized that most of the analyst's time will be taken ready undergone some form of pre-processing. While up with data engineering tasks such as acquiring, under- providing a clean version of the data is extremely help- standing, cleaning and preparing the data. In this paper ful from a modelling perspective, data engineering re- we provide a description and classification of such tasks searchers suffer from a limited availability to public messy into high-levels groups, namely data organization, data datasets. This leads to researchers addressing some of quality and feature engineering. We also make available the problems by synthetically corrupting clean datasets four datasets and example analyses that exhibit a wide according to the data wrangling problem to be solved. variety of these problems, to help encourage the develop- However, such synthetic corruption is often not sufficient ment of tools and techniques to help reduce this burden to capture the wide variety of corruption processes ex- and push forward research towards the automation or isting in real-world datasets. All of this variability in semi-automation of the data engineering process. terms of data engineering issues and possible solutions, and the lack of public messy datasets and their final 1 Introduction cleaned version makes attempting to automate this prob- lem extremely challenging. A large portion of the life of a data scientist is spent ac- In this paper we first provide a classification of data en- quiring, understanding, interpreting, and preparing data gineering problems appearing in messy datasets when a for analysis, which we collectively term data engineering1. data scientist faces an analytical task, and give practical This can be time-consuming and laborious, for example examples of each of them. We have identified three high- [1] estimate that these tasks constitute up to 80% of the level groups of problems: Data Organization issues arXiv:2004.12929v1 [cs.DB] 27 Apr 2020 effort in a data mining project. Every data scientist faces (DO), related to obtaining the best data representation data engineering challenges when carrying out an anal- for the task to be solved, Data Quality issues (DQ), re- ysis on a new dataset, but this work is often not fully lated to cleaning corrupted entries in the data, and Fea- detailed in the final write-up of their analysis. ture Engineering (FE) issues, related to the creation Our focus is on a task-driven analysis scenario, where of derived features for the analytical task at hand. Addi- a data scientist wants to perform an analysis on a given tionally, we have further divided the DO and DQ groups dataset, and they want to obtain a representation of the according to the nature of data wrangling problem they data in a format that can be used as input to their face. Under Data Organization we include data parsing preferred machine learning (ML) method. Interestingly, (DP), data dictionary (DD), data integration (DI) and there is no unique or correct way of wrangling a messy data transformation (DT). Under Data Quality we in- dataset. For example, features and observations with clude canonicalization (CA), missing data (MD), anoma- missing values can be deleted or imputed using different lies (AN) and non-stationarity (NS). Providing a classi- fication of wrangling challenges not only pushes research ∗ The work described in this paper was carried out when Gio- in these individual fields, but also helps to advance the vanni Colavizza and Angus Williams were with the Alan Turing Institute automation or semi-automation of the whole data engi- 1Another term often used is data wrangling. neering process. 1 A second contribution of the paper is to make available a sample of anonymized medical records from a num- four example messy datasets, each with an associated an- ber of London hospitals, including demographic data, alytical task. The analyses were carried out by data sci- drug dosage data, and physiological time-series measure- entists at the Alan Turing Institute (in some cases repli- ments [4], together with publicly available data clean- cating steps that were taken for published analyses). In ing scripts. Our chosen analytical challenge is to predict the appendices we describe the cleaning operations they which patients die in the first 100 hours from admission needed to perform in order to obtain a version of the data to the hospital. See Section 5.3. that could be used for the analysis task at hand. These provide practical examples of the issues identified in our Ofcom Consumer Broadband Performance classification, and give an insight of what constitutes a dataset: The Consumer Broadband Performance practical pipeline during a data wrangling problem. dataset contains annual surveys of consumer broadband The structure of the paper is as follows: Section 2 in- speeds and other data, commissioned and published by 3 troduces the four cases studies that will provide examples Ofcom , and available as a spreadsheet for each year [5]. throughout the paper. Section 3 describes our classifica- The data engineering challenge here is primarily one of tion of data engineering challenges, broken down under matching common features between the data for any two the headings of Data Organization, Data Quality and years, as the names, positions, and encodings change Feature Engineering. Section 4 discusses related work. every year. There was no given analytical challenge with Section 5 provide details of the data engineering and mod- the data as published, so we have chosen to build a elling steps carried out for each of the four datasets, and model to predict the characteristics of a region according our conclusions appear in section 6. to the properties of the broadband connections (see Section 5.4). 2 Overview of Case Studies Table 1: List of wrangling issues encountered in the We have identified four case studies, drawing data from a datasets. variety of domains and having a variety of formats: plant Data Data Feature measurements, household electricity consumption, health Organization Quality Engineering records, and government survey data. We refer to Sec- Dataset DP DD DI DT CA MD AN NS FE tion 5 for detailed descriptions of the data engineering Tundra ••• ••• • challenges and modelling approaches of each dataset. We HES ••• •• • also provide a GitHub repository with the datasets and 2 CleanEHR ••• ••• • the wrangling processes. Table 1 shows an overview of Broadband ••• •• • the wrangling challenges present in each of our use cases. Note that the actual challenges that an analyst encoun- ters depend on the particular analysis that is undertaken, which in turn will depend on the particular analytical task being addressed. 3 Classification of Wrangling Challenges Tundra Traits dataset: The Tundra Traits data con- sists of measurements of the physical characteristics of When data analysts want to employ their preferred ma- shrubs in the Arctic tundra, as well as the data wran- chine learning models, they generally need the data to be gling scripts that were used to produce a \clean" version formatted in a single table. Although data engineering of the dataset [2]. The analytical challenge entails build- issues extend to other forms of data such as images, text, ing a model to gauge the effect of temperature and pre- time series or graph data, in this work we take the view cipitation on specific shrub traits related to plant growth that the data analyst wishes to obtain a clean regular ta- (see Section 5.1). ble that contains the information needed to perform the desired analytical task. To define notation, a table is as- Household Electricity Survey (HES) dataset: The sumed to consist of n rows and D columns. Each row is Household Electricity Survey (HES) data contains time- an example or instance, while each column is a feature series measurements of the electricity use of domestic ap- or attribute, and attributes can take on a set of different pliances, as well as a report on data cleaning by Cam- values. bridge Architectural Research [3]. Our chosen analytical challenge is to predict the energy consumption of a house- We mainly focus on two large groups of data wran- hold given its appliance profile (see Section 5.2). gling issues: those related with organizing the data and those related with improving the quality of the data. We Critical Care Health Informatics Collaborative also include a third group on feature engineering, where (CleanEHR) dataset: The CleanEHR data contains the separation between wrangling issues and modelling choices starts to become blurred. In Table 3 at the end 2https://github.com/alan-turing-institute/aida-data- engineering-issues 3The UK Office of Communications. 2 of this section we provide a summary of all the data wran- separator results in different structured tables with three gling problems addressed in this work. Notice that this columns). classification should not be considered as a sequential pro- cess. Depending on the dataset and the analytical task 3.1.2 Data Dictionary some challenges are necessary while others are not, and the order of solving these problems can be variable. The term Data Dictionary refers to understanding the contents of the data and translating that knowledge into additional metadata. Ideally a dataset should be 3.1 Data Organization described by a data dictionary or metadata repository When data scientists start to work on a specific problem which gives information such as the meaning and type with real data, the first problem they face is obtaining of each attribute in a table, see e.g.
Recommended publications
  • Open Source Used in Influx1.8 Influx 1.9
    Open Source Used In Influx1.8 Influx 1.9 Cisco Systems, Inc. www.cisco.com Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco website at www.cisco.com/go/offices. Text Part Number: 78EE117C99-1178791953 Open Source Used In Influx1.8 Influx 1.9 1 This document contains licenses and notices for open source software used in this product. With respect to the free/open source software listed in this document, if you have any questions or wish to receive a copy of any source code to which you may be entitled under the applicable free/open source license(s) (such as the GNU Lesser/General Public License), please contact us at [email protected]. In your requests please include the following reference number 78EE117C99-1178791953 Contents 1.1 golang-protobuf-extensions v1.0.1 1.1.1 Available under license 1.2 prometheus-client v0.2.0 1.2.1 Available under license 1.3 gopkg.in-asn1-ber v1.0.0-20170511165959-379148ca0225 1.3.1 Available under license 1.4 influxdata-raft-boltdb v0.0.0-20210323121340-465fcd3eb4d8 1.4.1 Available under license 1.5 fwd v1.1.1 1.5.1 Available under license 1.6 jaeger-client-go v2.23.0+incompatible 1.6.1 Available under license 1.7 golang-genproto v0.0.0-20210122163508-8081c04a3579 1.7.1 Available under license 1.8 influxdata-roaring v0.4.13-0.20180809181101-fc520f41fab6 1.8.1 Available under license 1.9 influxdata-flux v0.113.0 1.9.1 Available under license 1.10 apache-arrow-go-arrow v0.0.0-20200923215132-ac86123a3f01 1.10.1 Available under
    [Show full text]
  • Buildbot Documentation Release 1.6.0
    Buildbot Documentation Release 1.6.0 Brian Warner Nov 17, 2018 Contents 1 Buildbot Tutorial 3 1.1 First Run.................................................3 1.2 First Buildbot run with Docker......................................6 1.3 A Quick Tour...............................................9 1.4 Further Reading............................................. 17 2 Buildbot Manual 23 2.1 Introduction............................................... 23 2.2 Installation................................................ 29 2.3 Concepts................................................. 41 2.4 Secret Management........................................... 50 2.5 Configuration............................................... 53 2.6 Customization.............................................. 251 2.7 Command-line Tool........................................... 278 2.8 Resources................................................. 289 2.9 Optimization............................................... 289 2.10 Plugin Infrastructure in Buildbot..................................... 289 2.11 Deployment............................................... 290 2.12 Upgrading................................................ 292 3 Buildbot Development 305 3.1 Development Quick-start......................................... 305 3.2 General Documents........................................... 307 3.3 APIs................................................... 391 3.4 Python3 compatibility.......................................... 484 3.5 Classes.................................................
    [Show full text]
  • Gerrit J.J. Van Den Burg, Phd London, UK | Email: [email protected] | Web: Gertjanvandenburg.Com
    Gerrit J.J. van den Burg, PhD London, UK | Email: [email protected] | Web: gertjanvandenburg.com Summary I am a research scientist with a PhD in machine learning and 8+ years of experience in academic research. I am currently looking to transition to industry to work on complex, large-scale problems that can have a positive real-world impact. I have extensive experience with machine learning modeling, algorithm design, and software engineering in Python, C, and R. My goal is to use my expertise and technical skills to address ongoing research challenges in machine learning and AI. Research Experience Postdoctoral Researcher — The Alan Turing Institute, UK 2018–2021 • Introduced a memorization score for probabilistic deep generative models and showed that neural networks can remember part of their input data, which has important implications for data privacy • Created a method for structure detection in textual data files that improved on the Python builtin method by 21%. Developed this into a Python package that has received over 600,000 downloads • Developed a robust Bayesian matrix factorization algorithm for time series modeling and forecasting that improved imputation error up to 60% while maintaining competitive runtime • Established the first benchmark dataset for change point detection on general real-world time series and determined the best performing methods, with consequences for research and practice • Collaborated with other researchers to design a unified interface for AI-powered data cleaning tools Doctoral Researcher
    [Show full text]
  • QUARTERLY CHECK-IN Technology (Services) TECH GOAL QUADRANT
    QUARTERLY CHECK-IN Technology (Services) TECH GOAL QUADRANT C Features that we build to improve our technology A Foundation level goals offering B Features we build for others D Modernization, renewal and tech debt goals The goals in each team pack are annotated using this scheme illustrate the broad trends in our priorities Agenda ● CTO Team ● Research and Data ● Design Research ● Performance ● Release Engineering ● Security ● Technical Operations Photos (left to right) Technology (Services) CTO July 2017 quarterly check-in All content is © Wikimedia Foundation & available under CC BY-SA 4.0, unless noted otherwise. CTO Team ● Victoria Coleman - Chief Technology Officer ● Joel Aufrecht - Program Manager (Technology) ● Lani Goto - Project Assistant ● Megan Neisler - Senior Project Coordinator ● Sarah Rodlund - Senior Project Coordinator ● Kevin Smith - Program Manager (Engineering) Photos (left to right) CHECK IN TEAM/DEPT PROGRAM WIKIMEDIA FOUNDATION July 2017 CTO 4.5 [LINK] ANNUAL PLAN GOAL: expand and strengthen our technical communities What is your objective / Who are you working with? What impact / deliverables are you expecting? workflow? Program 4: Technical LAST QUARTER community building (none) Outcome 5: Organize Wikimedia Developer Summit NEXT QUARTER Objective 1: Developer Technical Collaboration Decide on event location, dates, theme, deadlines, etc. Summit web page and publicize the information published four months before the event (B) STATUS: OBJECTIVE IN PROGRESS Technology (Services) Research and Data July, 2017 quarterly
    [Show full text]
  • VES Home Welcome to the VNF Event Stream (VES) Project Home
    VES Home Welcome to the VNF Event Stream (VES) Project Home This project was approved May 31, 2016 based upon the VNF Event Stream project proposal. In the meantime the project evolved and VES is not only used by VNF but also by PNF (physical network functions). However, the term "VES" is established and will be kept. Next to OPNFV and ONAP also O-RAN, O-RAN-SC and 3GPP are using VES. The term "xNF" refers to the combination of virtual network functions and physical network functions. Project description: Objective: This project will develop OPNFV platform support for VNF and PNF event streams, in a common model and format intended for use by Service Providers (SPs), e.g. in managing xNF health and lifecycle. The project’s goal is to enable a significant reduction in the effort to develop and integrate xNF telemetry-related data into automated xNF management systems, by promoting convergence toward a common event stream format and collection system. The VES doc source, code, and tests are available at: OPNFV github (generally updated with 30 minutes of merged commits) OPNFV gitweb To clone from the OPNFV repo, see the instructions at the Gerrit project page Powerpoint intro to the project: OPNVF VES.pptx. A demo of the project (vHello_VES Demo) was first presented at OpenStack Barcelona (2016), and updated for the OPNFV Summit 2017 (VES ONAP demo - see below for more info). The following diagram illustrates the concept and scope for the VES project, which includes: From ONAP a Common Event Data Model for the “VNF Event Stream”, with report "domains" covering e.g.
    [Show full text]
  • Metrics for Gerrit Code Reviews
    SPLST'15 Metrics for Gerrit code reviews Samuel Lehtonen and Timo Poranen University of Tampere, School of Information Sciences, Tampere, Finland [email protected],[email protected] Abstract. Code reviews are a widely accepted best practice in mod- ern software development. To enable easier and more agile code reviews, tools like Gerrit have been developed. Gerrit provides a framework for conducting reviews online, with no need for meetings or mailing lists. However, even with the help of tools like Gerrit, following and monitoring the review process becomes increasingly hard, when tens or even hun- dreds of code changes are uploaded daily. To make monitoring the review process easier, we propose a set of metrics to be used with Gerrit code review. The focus is on providing an insight to velocity and quality of code reviews, by measuring different review activities based on data, au- tomatically extracted from Gerrit. When automated, the measurements enable easy monitoring of code reviews, which help in establishing new best practices and improved review process. Keywords: Code quality; Code reviews; Gerrit; Metrics; 1 Introduction Code reviews are a widely used quality assurance practice in software engineer- ing, where developers read and assess each other's code before it is integrated into the codebase or deployed into production. Main motivations for reviews are to detect software defects and to improve code quality while sharing knowledge among developers. Reviews were originally introduced by Fagan [4] already in 1970's. The original, formal type of code inspections are still used in many com- panies, but has been often replaced with more modern types of reviews, where the review is not tied to place or time.
    [Show full text]
  • Fashion Terminology Today Describe Your Heritage Collections with an Eye on the Future
    Fashion Terminology Today Describe your heritage collections with an eye on the future Ykje Wildenborg MoMu – Fashion Museum of the Province of Antwerp, Belgium Europeana Fashion, Modemuze Abstract: This article was written for ‘non-techy people’, or people with a basic knowledge of information technology, interested in preparing their fashion heritage metadata for publication online. Publishing fashion heritage on the web brings about the undisputed need for a shared vocabulary, especially when merged. This is not only a question of multilingualism. Between collections and even within collections different words have been used to describe, for example, the same types of objects, materials or techniques. In professional language: the data often is “unclean”. Linked Data is the name of a development in information technology that could prove useful for fashion collecting institutions. It means that the descriptions of collections, in a computer readable format, have a structure that is extremely easy for the device to read. As alien as it may sound, Linked Data practices are already used by the data departments of larger museums, companies and governmental institutions around the world. It eliminates the need for translation or actual changing of the content of databases. It only concerns ‘labeling’ of terms in databases with an identifier. With this in mind, MoMu, the fashion museum of Antwerp, Belgium, is carrying out a termi- nology project in Flanders and the Netherlands, in order to motivate institutions to accomplish the task of labeling their terms. This article concludes with some of the experiences of this adventure, but firstly elucidates the context of the situation.
    [Show full text]
  • Full-Graph-Limited-Mvn-Deps.Pdf
    org.jboss.cl.jboss-cl-2.0.9.GA org.jboss.cl.jboss-cl-parent-2.2.1.GA org.jboss.cl.jboss-classloader-N/A org.jboss.cl.jboss-classloading-vfs-N/A org.jboss.cl.jboss-classloading-N/A org.primefaces.extensions.master-pom-1.0.0 org.sonatype.mercury.mercury-mp3-1.0-alpha-1 org.primefaces.themes.overcast-${primefaces.theme.version} org.primefaces.themes.dark-hive-${primefaces.theme.version}org.primefaces.themes.humanity-${primefaces.theme.version}org.primefaces.themes.le-frog-${primefaces.theme.version} org.primefaces.themes.south-street-${primefaces.theme.version}org.primefaces.themes.sunny-${primefaces.theme.version}org.primefaces.themes.hot-sneaks-${primefaces.theme.version}org.primefaces.themes.cupertino-${primefaces.theme.version} org.primefaces.themes.trontastic-${primefaces.theme.version}org.primefaces.themes.excite-bike-${primefaces.theme.version} org.apache.maven.mercury.mercury-external-N/A org.primefaces.themes.redmond-${primefaces.theme.version}org.primefaces.themes.afterwork-${primefaces.theme.version}org.primefaces.themes.glass-x-${primefaces.theme.version}org.primefaces.themes.home-${primefaces.theme.version} org.primefaces.themes.black-tie-${primefaces.theme.version}org.primefaces.themes.eggplant-${primefaces.theme.version} org.apache.maven.mercury.mercury-repo-remote-m2-N/Aorg.apache.maven.mercury.mercury-md-sat-N/A org.primefaces.themes.ui-lightness-${primefaces.theme.version}org.primefaces.themes.midnight-${primefaces.theme.version}org.primefaces.themes.mint-choc-${primefaces.theme.version}org.primefaces.themes.afternoon-${primefaces.theme.version}org.primefaces.themes.dot-luv-${primefaces.theme.version}org.primefaces.themes.smoothness-${primefaces.theme.version}org.primefaces.themes.swanky-purse-${primefaces.theme.version}
    [Show full text]
  • Arxiv:1910.06663V1 [Cs.PF] 15 Oct 2019
    AI Benchmark: All About Deep Learning on Smartphones in 2019 Andrey Ignatov Radu Timofte Andrei Kulik ETH Zurich ETH Zurich Google Research [email protected] [email protected] [email protected] Seungsoo Yang Ke Wang Felix Baum Max Wu Samsung, Inc. Huawei, Inc. Qualcomm, Inc. MediaTek, Inc. [email protected] [email protected] [email protected] [email protected] Lirong Xu Luc Van Gool∗ Unisoc, Inc. ETH Zurich [email protected] [email protected] Abstract compact models as they were running at best on devices with a single-core 600 MHz Arm CPU and 8-128 MB of The performance of mobile AI accelerators has been evolv- RAM. The situation changed after 2010, when mobile de- ing rapidly in the past two years, nearly doubling with each vices started to get multi-core processors, as well as power- new generation of SoCs. The current 4th generation of mo- ful GPUs, DSPs and NPUs, well suitable for machine and bile NPUs is already approaching the results of CUDA- deep learning tasks. At the same time, there was a fast de- compatible Nvidia graphics cards presented not long ago, velopment of the deep learning field, with numerous novel which together with the increased capabilities of mobile approaches and models that were achieving a fundamentally deep learning frameworks makes it possible to run com- new level of performance for many practical tasks, such as plex and deep AI models on mobile devices. In this pa- image classification, photo and speech processing, neural per, we evaluate the performance and compare the results of language understanding, etc.
    [Show full text]
  • A Dataset of Vulnerable Code Changes of the Chromium OS Project
    A Dataset of Vulnerable Code Changes of the Chromium OS project Rajshakhar Paul, Asif Kamal Turzo, Amiangshu Bosu Department of Computer Science Wayne State University Detroit, Michigan, USA fr.paul, asifkamal, [email protected] Abstract—This paper presents a an empirically built and of vulnerabilities that were escaped during code reviews, validated dataset of code reviews from the Chromium OS project we use the Monorail-based bug tracking system hosted at that either identified or missed security vulnerabilities. The https://bugs.chromium.org/. We searched in the bug tracking dataset includes total 890 vulnerable code changes categorized based on the CWE specification and is publicly available at: system to identify a list of security defects for the Chromium 1 https://zenodo.org/record/4539891 OS project . We used the following five-step approach to build Index Terms—security, code review, dataset, vulnerability this dataset. (Step I) Custom search: We used a custom search (i.e., I. INTRODUCTION (Type=Bug-Security status:Fixed OS=Chrome), Peer code review is an effective and well-recommended to filter security defects for the Chromium OS projects practice to identify security vulnerabilities during the pre- with the status as ‘Fixed’. Our search result identified total release stages [2]. However, despite practicing mandatory code 591 security defects. We exported the list of defects as a reviews, many Open Source Software (OSS) projects are still comma-separated values( i.e., CSV) file, where each issue is encountering a large number of post-release security vulner- associated with a unique ID. abilities, as some security defects remain undetected during (Step II) Identifying vulnerability fixing commit: The Mono- code reviews.
    [Show full text]
  • Git and Gerrit in Action and Lessons Learned Along the Path to Distributed Version Control
    Git and Gerrit in Action And lessons learned along the path to distributed version control Chris Aniszczyk (Red Hat) Principal Software Engineer [email protected] http://aniszczyk.org About Me I've been using and hacking open source for ~12 years - contribute{d} to Gentoo Linux, Fedora Linux, Eclipse Hack on Eclipse, Git and other things at Red Hat Member of the Eclipse Board of Directors Member in the Eclipse Architecture Council I like to run! (2 mins short of Boston qualifying ;/) Co-author of RCP Book (www.eclipsercp.org) An Introduction to Git and Gerrit | © 2011 by Chris Aniszczyk Agenda History of Version Control (VCS) The Rise of Distributed Version Control (DVCS) Code Review with Git and Gerrit Lessons Learned at Eclipse moving to a DVCS Conclusion Q&A An Introduction to Git and Gerrit | © 2011 by Chris Aniszczyk Version Control Version Control Systems manage change “The only constant is change” (Heraclitus) An Introduction to Git and Gerrit | © 2011 by Chris Aniszczyk Why Version Control? VCS became essential to software development because: They allow teams to collaborate They manage change and allow for inspection They track ownership They track evolution of changes They allow for branching They allow for continuous integration An Introduction to Git and Gerrit | © 2011 by Chris Aniszczyk Version Control: The Ancients 1972 – Source Code Control System (SCCS) Born out of Bell Labs, based on interleaved deltas No open source implementations as far as I know 1982 – Revision Control System (RCS) Released as an alternative to SCCS
    [Show full text]
  • CAFCR: a Multi-View Method for Embedded Systems Architecting; Balancing Genericity and Specificity
    CAFCR: A Multi-view Method for Embedded Systems Architecting; Balancing Genericity and Specificity trecon = tfilter(nraw-x ,nraw-y) + t ) + nraw-x * ( tfft(nraw-y)+ col-overhead ny * ( tfft(nraw-x) + trow-overhead ) + tcorrections (nx ,ny) + tcontrol-overhead Gerrit Muller ii This page will not be present in the final thesis version: 2.9 status: concept date: August 21, 2020 iii CAFCR: A Multi-view Method for Embedded Systems Architecting; Balancing Genericity and Specificity Proefschrift ter verkrijging van de graad van doctor aan de Technische Universiteit Delft, op gezag van de Rector Magnificus prof.dr.ir. J.T. Fokkema, voorzitter van het College voor Promoties, in het openbaar te verdedigen op maandag 7 juni 2004 om 13:00 uur door Gerrit Jan MULLER doctorandus in de natuurkunde geboren te Amsterdam iv Dit proefschrift is goedgekeurd door de promotor: Prof.dr W.G. Vree Samenstelling promotiecommissie: Rector Magnificus voorzitter Prof.dr. W.G. Vree Technische Universiteit Delft, promotor Prof.dr. M. Rem Technische Universiteit Eindhoven Prof.dr.ir. P.A. Kroes Technische Universiteit Delft Prof.dr.ir. H.J. Sips Technische Universiteit Delft Prof.dr. R. Wagenaar Technische Universiteit Delft Prof.dr.ing. D.K. Hammer Technische Universiteit Eindhoven Prof.dr. P.H. Hartel Universiteit Twente Prof.dr. M. Rem heeft als begeleider in belangrijke mate aan de totstandkoming van het proefschrift bijgedragen. ISBN 90-5639-120-2 Keywords: Systems Architecture, System design, Systems Engineering These investigations were supported by Philips Research Laboratories, and the Embedded Systems Institute, both in Eindhoven. Cover Photograph: René Stout http://www.rwstout.com/ Copyright c 2004 by G.J.
    [Show full text]