International Journal of Advanced Research and Publications

ISSN: 2456-9992

Using Data Analytics To Extract Knowledge From Middle-Of-Life Product Data

Fatima-Zahra Abou Eddahab, Imre Horváth

Delft University of Technology, Industrial Design Engineering, Landbergstraat 15, 2628 CE Delft, Netherlands, [email protected]

Delft University of Technology, Industrial Design Engineering, Landbergstraat 15, 2628 CE Delft, Netherlands, [email protected]

Abstract: Data analytics needs dedicated tools, which are becoming increasingly complex. In this paper, we summarize the results of our literature research, conducted with special attention to existing and potential future tools. Attention is paid mainly to processing big data, rather than to effective semantic processing of middle-of-life data (MoLD). The issue of efficiently handling big MoLD of specific characteristics has not yet been addressed by the commercialized tools and methods. Another major limitation of the current tools is that they are typically not capable of adapting themselves to designers' needs and of supporting knowledge/experience reusability in multiple design tasks. MoLD require quasi-real-life handling due to their nature and their direct feedback relationships to the operation process and the environment of products. Nowadays, products are equipped with smart capabilities, which offers new opportunities for exploiting MoLD. The knowledge aggregated in this study will be used in the development of a toolbox, which (i) integrates various tools under a unified interface, (ii) implements various smart and semantics-orientated functions, and (iii) facilitates data transformations in context by the practicing designers themselves.

Keywords: Data analytics, middle-of-life data, analytics tools, application practices, smart analytics toolbox.

1. Introduction

1.1. Setting the stage
The rapid rise of emerging economies and the deployment of information technologies have led to remarkable changes in the lifecycle of products and services. Because of the rapid development of hardware and software, focusing on the exploration of data became ubiquitous [1]. Companies need to adjust their operations to the influences of rapidly evolving markets and to better manage the lifecycle of their products. Towards these goals, they have to combine (i) static process information with dynamic ones, (ii) product information with process and resource information, and (iii) human aspect information with business information throughout the entire product lifecycle [2]. However, as reflected by the related literature, most of the past efforts were dedicated to the methodological and computational support of begin-of-life and end-of-life models and activities. Fewer efforts were made to exploit middle-of-life data (MoLD) and to create value/knowledge based on this kind of data. Thanks to new information technologies such as sensors, RFID and smart tags, the chunks of information conveyed by the middle-of-life (MoL) phase of product usage can now be identified, tracked and collected [3]. Our background research was stimulated by the observation that there seems to be a lack of tools to support decision-making in design and servicing using MoLD. This was placed into the position of a concrete research phenomenon to be studied [4]. The above decision was underpinned by the following argumentation. Effective statistical and semantic processing of MoLD is not only an academic challenge, but also a useful asset for the industry [5], since logistics, operation and maintenance activities are located in the MoL phase [6]. It is important for product development and production companies to see how their products are used by different customers, in different application contexts and under different circumstances. This may provide insights for them concerning how to transform use patterns into specific product enhancements, and how to avoid deficiencies which may occur under circumstances that were not completely known or specified in the development phase of their products. MoLD can be obtained in multiple ways. They can be aggregated by field observations and interrogations of the users, by studying failure log files and maintenance reports, or from relevant web resources such as social media and user forums. Alternatively, they can be elicited directly from products by sensors or self-registrations. The last-mentioned approaches are getting more popular as products advance from traditional self-standing products, through network-linked advanced products, to awareness- and reasoning-enabled smart products [7]. However, due to the dynamic change of sensor data, the large volumes of data aggregated over time, and the unknown nature of data patterns, it is unfortunately not straightforward to perform effective processing using the existing traditional techniques [8]. Feeding structured MoLD and use patterns back to product designers is an insufficiently addressed issue [9]. The key challenge is to find ways to effectively use data analytics techniques in purposeful combinations, depending on the application contexts and the specific objectives of product designers [10].

1.2. Reasoning model and contents
We completed a comprehensive literature study in two phases. Referred to as 'shallow exploration', the first phase was conducted to identify the most relevant domains of knowledge for the study. Based on a wide range of keywords, we tried to develop a topographic landscape of the related publications. This topographic landscape was supposed not only to show the distribution (the clusters of the keyword-related publications), but also the peaks and plains of the clusters. Figure 1 shows the clustering of the

Volume 4 Issue 1, January 2020 1 www.ijarp.org

outcome of the keyword-based mapping of the related literature. The graphical image was built using VOSviewer based on the concerned articles. Figure 1 shows not only the neighboring (semantically related) keywords, but also the distances between them as they appear in the literature. The different colors indicate the frequency of occurrence of the keywords (i.e. the formation of peaks): the most frequently occurring keywords are shown in red (and dark colors), and the less frequent ones in green (and light blue). The visual representation generated by the software application allowed us to see that four major clusters of papers could be identified. In a kind of transitive ordering, these are: (i) changes in the nature of data, (ii) approaches of transformation of data, (iii) tools and packages for data analytics, and (iv) design applications of data analytics. These were used as descriptors of the main domains of interest in the detailed literature study, in addition to the key phrase 'evolution of data analytics support of product design'. Called 'deep exploration', the second phase of our study used various sources, such as subscription-based and open access journals, conference proceedings, web repositories, and professional publications, to collect several hundreds of relevant publications. The findings made it possible to define further relevant key terms on a third level (not shown in Figure 1).

Figure 2: Connectivity graph of the chosen family of keywords

The pieces of information obtained from the quantitative part of the literature study were used to develop a reasoning model for the qualitative part of the study. This part focused on the interpretation of the findings and disclosed semantic relationships. The reasoning model, which was also used in structuring the contents of this paper, is shown in Figure 3. Indicated are only the first and second level key terms while, as mentioned above, the study was actually done with key terms of the third decomposition level.
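The co-occurrence mapping that underlies the keyword figures can be illustrated with a short sketch. The code below is our own minimal re-implementation of the counting principle, not the computation actually performed by VOSviewer, and the document list is invented for illustration: it counts, for every keyword pair, the number of publications in which both occur, which is what the line thicknesses of a connectivity graph encode.

```python
from collections import Counter
from itertools import combinations

# Each inner list stands for the keyword set of one publication (invented data).
documents = [
    ["big data", "data analytics", "product design"],
    ["big data", "data transformation"],
    ["big data", "data analytics", "data transformation"],
    ["data analytics", "product design"],
]

def co_occurrence(docs):
    """Count, for every keyword pair, how many documents contain both."""
    counts = Counter()
    for keywords in docs:
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

edges = co_occurrence(documents)
# In a connectivity graph, each entry becomes a line whose thickness is the count.
print(edges[("big data", "data analytics")])  # 2
print(edges[("data analytics", "data transformation")])  # 1
```

VOSviewer additionally normalizes such raw counts (by default with association strength) before computing the layout; the sketch stops at the raw co-occurrence frequencies.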

Figure 3: Reasoning model of the literature research

Figure 1: Citation of the chosen family of keywords in the literature

The second phase was also used to provide a quantitative characterization of the interrelationships among the key terms belonging to the same cluster. Figure 2 shows the interrelationships found. If two terms are used in the same document, then there is a line between them, and the thickness of the line indicates how frequently they co-occur. In other words, the thick lines refer to combinations of terms that appear in multiple papers, whereas the thin ones refer to combinations that rarely appear in the studied publications. It can be observed in the connectivity diagram shown in Figure 2 that the thickest lines are between the above four key terms, a fact that underlines their significance and relatedness. In addition, the diagram casts light not only on the complexity of the completed study, but also indicates which key terms could not be studied separately due to their interconnections claimed by the authors of the studied publications.

The considered papers were published in different periods of time, ranging from the mid-fifties until today. An important observation was that the concepts identified by the first-level key terms are in an implicative relationship with each other. Namely, if the nature of data changes, then it entails a change in the approaches of data transformation, which in turn implies the need for different data transformation methods and tools. These enablers can provide support for a broader range of existing applications and can facilitate new data analytics applications with regard to enabling product enhancement and innovation by design. The investigation concerning the changes in the nature of data was focused mainly on product-related use, maintenance and service data, and on data describing the conditions and behavior of products.

1.3. Organization of the paper
In the rest of this paper, we provide details about the state of the art in the broad field of data analytics methods and tools that support extracting product development knowledge from middle-of-life product data. In Section 2, we investigate the essence and trend of changes from product-associated data (referred to as functional data, or small data, in other publications) to big data. In Section 3, we provide an overview of the various data transformation actions and


techniques, and discuss the accompanying challenges. In Section 4, we summarize the findings related to existing commercial and academic data analytics software tools and discuss how they can be improved according to the literature. In Section 5, various applications of data analytics are discussed, including the major application domains of various big data analytics approaches and the challenges that have already been recognized and addressed. The last section combines the findings and the lessons learnt in the above four domains and discusses their implications. In this section we present our conclusions regarding what needs to be done to find a resolution for the unearthed issues and deficiencies. Based on these, we sketch up the concept of a novel data analytics enabler, called the data analytics toolbox (DAT), which is intended to go beyond the functionality and services of the currently available commercialized integrated data analytics software packages. The implementation of this toolbox is the focus of our follow-up research.

2. Overview and analysis of the changes in the nature of data

2.1. Nature of data as a domain of research interest
Analysis of the trends and changes in the nature of data implies the need for a historical perspective. As a result of the third industrial revolution, which established the scientific and technological domain of electronics and involved it in the automation of industrialized processes at the beginning of the 1950s, analogue system signals, in addition to alphanumeric data, have also been carriers of 'data'. The fourth industrial revolution, which has been culminating since the mid-1980s, introduced digital computing and syntactic data processing not only in industrial contexts, but also in creative and executive processes. The current fifth industrial revolution, often referred to as the revolution of intelligence, shifted the attention to various formal and tacit forms of knowledge, and to knowledge engineering and semantic knowledge processing. This is an indispensable step considering the objective of current-day product design and production to offer smart, cognizant and even intelligent artefactual systems for society. These are deemed to be the most fundamental (generic and global) trends of change that can at all be identified in the nature of data. However, in addition to these, the literature also reflects many specific and local changes in the nature of the data captured and processed, in particular in the field of realization of new products and value-adding services. In the above context, data is a set of qualitative and/or quantitative values of variables [11]. It may concern the behavior, the status, or the function of a system and can be in different formats (symbols, texts, numbers, figures, etc.) [12], [13]. Before and at the beginning of the fourth industrial revolution, data were typically limited in terms of their amount (up to a couple of petabytes, as a maximum), which is nowadays indicated by the term 'small data'. Due to the technological advancements in digital data processing, the use of multiple digital devices, and complex sensor and effector networks, the possibility of capturing and processing data has drastically changed. Tremendous amounts of digital data records are generated that form continuous data streams. This 'new generation' of data, also called 'big data', may run up to exabyte scales [14]. It is no longer regarded as static or stable, but as dynamic and recomposable, and as such, it is a raw material for business exploitation and a crucial input that can be used to create new forms of economic assets and values [15]. The motivation to exploit mass data has emerged as a new research field called big data science. Big data analytics (BDA) has come in as a promising, minute-wise evolving methodology to retrieve knowledge from massive data streams and repositories [16]. Though the overall research objective of our study was to find and analyze scientific and professional publications that discuss the recent changes and trends in the nature of data, the actually conducted review was restricted to data associated with monitoring the real-life use of products and services in operation, as well as to those obtained via user feedback on social media, due to the abundance of these kinds of data. On the other hand, this scoping of the study made it possible to derive highly relevant conclusions in the narrow context of our research. The practical objective was formulated so as to study the kind of data that makes it possible (i) to extend product and service lifespans, and (ii) to optimize the use of the necessary resources all along their lifecycle. Here, the term 'lifecycle' refers to all the phases of the product and service life, from the beginning of life (BOL), where the product is designed and realized, through the middle-of-life (MOL) phases, where the product is available on the market and used by the customer, until the end-of-life (EOL) phases, where the product is dismissed or revamped [17].

2.2. Major findings
In addition to the enormous changes in the amount of elicited and processed data, the move from natural, concrete and unstructured data to (purposefully) created, abstract and highly organized data structures is the most important change. This is both the outcome of and the stimulant for advanced technologies. As discussed in [18], everyday abstract data types can play a significant role in software development. Different data models and data modeling techniques (that are central to information systems) have been proposed that provide a basis for specific technological solutions for database design and for realizing data-intensive applications [19]. Data modeling is the kernel activity in data management and processing, which imposes constructs and structures on data and elicits the meaning of individual or structured data. Technically, data modeling is the process of (i) discretizing physical variations of data for accurate representation, (ii) modeling sets of data by data structures, and (iii) generating data constructs for effective processing by available software [20]. Data modeling is the basis of all data processing tasks and forms an explicit part of many of them, supporting the capturing and understanding of the relationships and meaning of data [21]. Data modeling creates a simplified structure and representation of data, which can be used as the starting point of analytics, reasoning, and simulation [22]. Data modeling also supports the schemata design of databases and data warehouses/repositories [23]. The logical structures imposed by data modeling support human understanding, and the maintenance and extension of data structures [24]. Data models represent the structures and integrity of elements of data (Spyns et al., 2002). Their semantics usually constitute an informal agreement between developers and users of data models (Meersman, 1999). For the reason that databases became critical components of information systems, the success of projects became dependent on the accuracy of data models (Ramakrishnan & Gehrke, 2000) (Siau et al., 2013). The


huge amount of data has made data modeling a critical and vital issue for the survival of companies (Verhoef et al., 2003). Over the past decades of digital data processing, the fundamental concepts and general principles of data modeling have gone through an evolution (Goodchild, 1996). There are seven modern data modeling approaches identified in the literature, namely: (i) relational, (ii) semantic, (iii) entity-relationship (ER), (iv) extensions of ER (EER), (v) object-oriented (OO), (vi) statistical, and (vii) data metamodeling (Ter Bekke, 1992) (Ter Bekke, 1995) (Ter Bekke, 1997). In spite of the fact that many pitfalls have been discovered in the relational theory, relational data models have become widely accepted. They offer mathematical foundations and a simple user-level paradigm, while semantic models offer flexible structuring capabilities and explicit data constraints (Siau et al., 2013). Semantic modeling applies various abstractions (such as classification, instantiation, aggregation, decomposition, generalization, and specialization) in order to capture meanings by data structures in complex situations. Semantic integrity is typically guaranteed by making use of the fundamental type-attribute relationship. Object-oriented data modeling captures conceptual entities as objects [24]. OO modeling has emerged as an alternative to the traditional entity-relationship orientated modeling technique, based on the premise that the resulting OO data models are easier to use and understand [25]. It has been experienced that understanding OO data models is indeed faster than understanding EER models, for both simple and complex problems. However, many of the other claims concerning the superiority of OO data modeling (e.g. perceived ease of use) are not empirically verified. OO data models are typically used in computing areas such as programming, analysis, design, and database management [26]. Despite the huge number of research projects in semantic and OO data modeling, the relational model is still the predominant one used in industrial databases [27]. ER and EER modeling are popular as tools for conceptual data modeling [28]. For the purpose of exploratory data analysis (of continuous data), various parametric and nonparametric data models have been proposed, based on estimating the quantile functions and density quantile functions [29], [30]. The trends of change concern not only the sources and amount (size) of digital data, but also the arrangement (structure) of data (Figure 4). Unstructured data do not support formal analyses, traditional database management [31], or the application of pattern searching methods [32]. The papers related to the nature-of-data cluster indicate that individuals, industry, and science face the challenge of dealing with large datasets as a result of the proliferation and ubiquity of high-throughput computing technologies and internet connectivity. The main difficulty is not in the technical handling of large amounts of data, but in mining and extracting valuable information and knowledge from them [33]. Decades ago, data was characterized by three characteristics (volume, velocity, and variety), because these lend themselves to advanced, complex, and predictive business analysis and insights [34], [35]. Recently, these have been complemented with three more characteristics (value, veracity and viability) [36], [37]. In order to deal with scalability and affordability, the literature suggests that big data requires optimized data warehouses and cloud computing [38]. The outcome of the literature review showed that managing and gaining insights from the produced big data is a challenge and a key to competitive advantage [39]. It offers substantial value to organizations who decided to adopt it, but poses a number of challenges for the realization of such added value [40].

Figure 4: Change in the characteristics of regular and massive data sets (designed after [36], [37])

In the context of this paper, big data is collected during all product lifecycle stages: (i) beginning-of-life (BoL), (ii) middle-of-life (MoL), and (iii) end-of-life (EoL) [17]. Researchers showed that it is important to focus on MoLD, which primarily, but not exclusively, include use, service and maintenance data [3], [41]. It provides failure data, performance data, age data of the product, operating environment data, usage intensity data, maintenance reports, and refund and replacement data [42]. These data allow the observation of the condition and the behavior of products during the usage phase [13]. Processing MoLD may also help manufacturers if they face communication challenges and do not have well-established processes to obtain and utilize feedback from their customers [43]. Furthermore, the acquisition of MoLD creates opportunities for and encourages a lifecycle-orientated approach to incremental product design that evaluates and enhances all products and services on a continuous basis [44]. In other words, MoLD can be transformed into knowledge that enables perpetual and long-term design improvement, as well as product innovation and planning. Collecting product information during the MoL and EoL phases allows a product itself or product operations to be improved in various ways, such as the improvement of design and the optimization of maintenance operations [45]. A recognized difficulty related to MoLD is that the related elicitation activities should be executed outside the companies, typically with an intense involvement of both the products and the end users. If the elicitation of product-related information is interventional, then it may lead to operational inefficiencies [3]. Since conventional information systems used in the definition of products and services cannot handle MoLD and the related knowledge, the need for dedicated data analytics approaches and tools has been recognized also by the developers of product lifecycle management systems. Nevertheless, in the current industrial practice the potentialities offered by MoLD analytics are seldom utilized sufficiently by product and service developers. The gradually increasing smart behavior of products has been recognized as a key development in collecting and feeding back data and information to designers with regard to modes of use and operation [3].

2.3. Lessons learnt
The study of the literature showed that the notion of 'data' and the nature of data, as well as the types of knowledge digitally processed, have rapidly and remarkably changed


over a relatively short time. What is typical nowadays is to produce and process complex data structures, rather than only data constructs or individual data, as contents for data warehouses and integrated databases [46]. We are moving away from the so-called 'situated aspect data' (sorted according to type, location and meaning) to big data, which cannot, however, be interpreted easily from the point of view of its structure and semantics, because these patterns are often deeply hidden in the flow of data, which does not hint directly at the semantic meanings of the various bodies of data [47], [48]. One reason why digital data have become diverse is that engineering and technical data have been combined with social data [49]. At the same time, it has also become tendentious to produce qualitative data together with quantitative data. It is a daily routine nowadays. Processing lifetime data of products and related processes means that descriptive, prescriptive, predictive and operational data should be managed concurrently. This has resulted in a data engineering challenge for the developers of the current data processing tools [50]. There has emerged an increased need for modifying or redesigning them in order to be able to process hidden data patterns and mixed semantics [51]. The above changes in the nature of data explain why a new data science is in a formation stage and why a lot of novel methods and tools have been or are being developed in the field of managing data technologies. Data modeling plays an important role in both structuring data, and capturing and eliciting the meaning of (structured) data. Typically, three types of data models are used in information systems: (i) conceptual data models (data requirements models), (ii) logical data models (data constructs/structures models), and (iii) physical data models (database/warehouse access models) [52]. These models are used in both business-oriented and system-related data processing. The major issues are related to (i) the inflexibility of current data models, (ii) the multiplicity of imposing structures and capturing meanings, (iii) the limited reusability of data models, (iv) the insufficiency of data modeling standards, (v) the lack of stereotyped data interfaces, (vi) the management of the relational complexity of data models, and (vii) the uncoordinated development of information systems [53]. Though these issues were identified almost 15 years ago, the majority of them are still acute. A clear distinction has been drawn between data modeling and data analysis. Data modeling is the activity of using a set of tools and techniques to aggregate, organize, relate, represent and store data, whereas data analysis is interpreted as the activity of merging data from multiple data sources using a set of methods and tools to gain insight from the data and to analyze trends to help make better decisions [54]. An impenetrable range of data modeling/representation and data analytics tools has been developed. This issue will be addressed in Section 4. At this point it is important to mention that, as with all data collected during the whole lifecycle of a product, MoL data also need to be defined, collected and processed, and then they can be used. However, there is a difference with regard to the execution of these four steps.
 MoLD are associated with certain phases of the product life cycle. Therefore, first the phases with which the MoLD are associated should be captured. These phases can be as broad as logistics, operation, and maintenance, which produce a wide variety of MoLD, e.g. run-time performance data, failure data, data about the aging of a product, changes in the operating environment, usage intensity, completed maintenance task data, refund and replacement data, etc. Due to their nature (state and time dependence), these need specific data processing and interpretation approaches.
 Collecting data is determined by where the MoLD can be found and collected. In this sense, MoLD may have many sources, the outcomes of which should typically be combined before processing. For instance, MoLD can be aggregated by field observations, interrogations of users, studying failure log files and maintenance reports, or from relevant web sources such as social media and user forums. Alternatively, they can also be elicited directly from products by sensors or self-registrations.
 Processing MoLD depends on the opportunities of computational management, for instance, on the pre-processing, processing and post-processing procedures (detailed in the next section). Though these steps are common, what makes them challenging is the fact that the complexity of data should be addressed, the data are to be processed in real time, and there is a time and context dependence to be dealt with. The existing data processing tools are not yet equipped with capabilities for these purposes.
 Using MoLD can be considered from multiple aspects and application cases. This raises different requirements on the output of the data analytics tools. For instance, data collected during the use of a product may deliver insights on consumers' behavior and preferences, but also on the frequency of using a product, the convenience of the interaction, or the potential misuses of a product. This means that the goals of data processing are strongly articulated by the objectives and manners of using a product.

3. Overview and analysis of steps and techniques of data transformation

3.1 Interpretation and mapping of data transformation approaches
The term 'data transformation' (DT) has a broader and a narrower meaning. In the narrower meaning, DT is the process of converting data and information from one format to another, usually from the format of a source system into the required format of a destination system [55]. The typical statistical transformations are: (i) logarithmic, (ii) square root, (iii) square, (iv) cube root, and (v) reciprocal transformations. In the broader meaning, it refers to all data processing activities that can introduce a change in the state, representation and/or meaning of data [56]. That is, data transformation is the process by which data in a dataset are transformed, or changed, during data cleaning, and it involves the use of mathematical operations in order to reveal features of the data that are not observable in their original form [57]. The usual process of DT involves converting data structures, file or database contents, and documents. Knowledge transformation (KT) is often considered either a part of this broader concept of data transformation, or a special case of it, for the reason that ultimately semantic knowledge representation also boils down to managing syntactic data [58]. The focus of KT is mainly on methods and techniques allowing the extraction of knowledge from data. Oftentimes, data conversion also involves software conversion from one computer language to another, in order to make the running


This is often referred to as data- or software-migration. However, as reflected in the literature, data transformation means data processing for some authors, and is the synonym of data translation or data integration for others. We adopted the comprehensive interpretation, according to which data transformation blends data pre-processing, processing (transformation) and data post-processing (presentation). DT involves multiple steps, going from the aggregation of data, through cleaning, classification and interpretation of data, to the extraction of patterns to be evaluated and the representation of data patterns. These steps are needed to support and interpret changes in the structure, representation and content of data [56]. Since the beginning of the big data era, many authors have addressed the challenges that they were facing. The primary issue is not the huge amount and diversity of data, but the way of transforming them and extracting valuable insights from dumped and dynamically changing data [41]. The aim is to discover previously unknown interrelations among unrelated attributes of datasets [60], [61]. Many books have been published on the essence and challenges of data transformation processes [15]. The complexity and other challenges of data mining were addressed from many aspects by data scientists and analysts, as well as the steps needed to find solutions [62]–[64]. The difficulty and specific challenges of performing correlation analysis (i.e. finding the measures, the degree of association of the data, or the strength of the relationship among them) using mathematical operations are also addressed in the literature [65]. These challenges include: (i) spatial and temporal variation of data, (ii) missing values, and (iii) the lack of balance in sampling [66]. Finding answers to these challenges is a concern not only for researchers, but also for business/market managers and decision makers, who should base their decisions and actions on the insights gained from big data [67]. Toward this end, they have to understand the aggregation of big data, the approaches of extracting patterns, and the use of these patterns to predict future situations and/or behavior.

3.2 Major findings
The ultimate objective of big data management is to generate new business opportunities for the service industries. However, since more than 80% of the world's data is unstructured, most businesses do not even try to use it for their benefit [68]. Big data itself has no real meaning unless it is exploited by extracting information and knowledge from it. Various authors argue that in fact not only the amount of data is important, but also its diversity, which may lend itself to various semantic patterns [69]. There is a wide variety of big data sources, ranging from natural processes and substances, through engineered physical processes and artifacts, to virtual environments and objects. A recent source of big data has been social media and websites [70]. Data collected from diverse sources and represented in various formats need to be transformed (prepared for semantic processing). This constitutes an essential step in the process of a meaningful analysis [71]. Social media and website data need pre-processing that involves common steps such as aggregation, cleaning, sorting, etc., because real-world data are typically impure [72]. Pre-processing is also expected (i) to determine the accuracy and completeness of data [73], (ii) to reduce the noise of data and correct the omissions, and (iii) to handle missing values [74]. In addition, pre-processing includes activities such as (i) classification, which increases the efficiency of the retrieval of data [75], (ii) clustering, which facilitates structured handling of data [76], [77], and (iii) visualization, which can be applied to knowledge discovery processes [78] as well as to the results of other transformative actions [79]. Data cleaning (~ data checking or ~ data validation) is regarded as an important process by which missing, erroneous, or invalid data are determined and cleaned, or removed, from a dataset; it follows the data preparation process. Though the concept of back transformation (that is, the process in which mathematical operations are applied to an already transformed dataset in order to revert the data to their original form) is known in the literature, the number of papers is much smaller than of those dealing with forward transformations [80]. An important action in the transformation of data to knowledge is data mining, which has been gaining more and more attention since the beginning of the big data era [14]. It is defined as "an algorithmic process that takes data as input and yields patterns such as classification rules, association rules, or summaries as output" [81]. It seems to be a simple action, but its implementation is challenging due to the associated computational complexity [82]. Data mining tasks may be used to discover the knowledge and rules of MoLD, such as the usage pattern of a product, maintenance history, customer support information, updated bill of material, and updated product demand information [83]. Many researchers argue that the challenge of data mining is partially caused by the need for effective algorithm designs, which are able to tackle the mining problem even in the case of a huge volume of complex and dynamic data [84]. This huge dimensionality of data, together with the explosion of features and variables, is what brings new challenges to data analytics [1]. Various methods have been developed to extract meaningful knowledge from complex and dynamically changing data [85], [86]. Regarding MoLD processing, the literature is divided on the method of data processing. For example, although failure modes and effects analysis (FMEA) has been used for identifying design problems during the MoL phase, the FMEA-based methods have a critical weak point in the sense that they do not consider product degradation in failure models quantitatively [87]. As discussed in [84], "while typical data mining algorithms require all data to be loaded into the main memory, this is becoming a clear technical barrier for big data because moving data across different locations is expensive, …, even if we do have a super large main memory to hold all data for computing". To solve the complexity problem of data mining, some authors proposed to apply parallel computing [88], [89]. Others prefer collective mining to sample and aggregate multi-source data, and then use parallel computing in the actual mining process [90]. Pre-processing plays an important role in the case of big data [91], but it is a time-consuming set of activities, and there are certain threats associated with it [92]. The main transforming activity is data analysis, which may have different objectives, such as obtaining useful values, extracting patterns, providing suggestions, and optimizing decision making [51], [52]. Real-time processing of big data remains a very challenging task [84]. According to McKinsey [93], smart data analytics will be the key to competition, productivity and innovation.
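The partition-and-aggregate idea behind parallel and collective mining [88]–[90] can be sketched as follows. This is a simplified illustration under our own assumptions: the pair-counting task stands in for a real mining algorithm, and `mine_partition`, `merge_counts` and the sample transactions are hypothetical names and data, not taken from the cited works.

```python
from collections import Counter
from itertools import combinations

def mine_partition(transactions):
    """Mine one data partition independently: count item pairs that
    co-occur in a transaction (a minimal stand-in for pattern mining)."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts

def merge_counts(partial_counts):
    """Aggregate the partial results produced for the partitions."""
    total = Counter()
    for c in partial_counts:
        total.update(c)
    return total

# Two partitions of a (hypothetical) MoL event log. In a real setting each
# mine_partition call could run in its own process, e.g. replacing map with
# multiprocessing.Pool().map(mine_partition, partitions).
partitions = [
    [["sensor_a", "fault_x"], ["sensor_a", "fault_x", "fault_y"]],
    [["sensor_a", "fault_x"], ["sensor_b", "fault_y"]],
]
patterns = merge_counts(map(mine_partition, partitions))
```

Because each partition is mined independently and only the compact partial counts are merged, no single node has to hold all data in main memory, which is exactly the barrier discussed in [84].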


3.3 Lessons learnt
Based on the development of automatic identification, data capture and storage technologies, people generate data much faster, and collect much bigger data, than ever before in business, science, engineering, education and other areas [94]. The current literature identifies (i) data preparation, (ii) data mining, (iii) pattern evaluation, and (iv) knowledge representation as the main activities of processing complex and non-structured data [95]. The change in the nature and the sources of data implies a change in the steps of data transformation [96]. Due to the large dynamics, data processing should consider the time-dependent validity of data [97]. The literature claims that there is a need for data transformation techniques able to manage changing data patterns or to deal with dynamic pattern recognition and evaluation [33]. It was explained in many papers that big data is mainly unstructured and heterogeneous, but its cleaning and pre-processing may lead to a relatively high loss of data [56]. It is also claimed that a significant amount of big data is discarded [97], which may severely influence the results of data transformation; still, no clear solution has been proposed concerning those issues, which raises complexity challenges of big data. The mining of big data primarily focuses on extracting patterns to be evaluated by both manual and automated approaches [98]. Currently, dealing with patterns requires multiple expert interventions (especially after mining) [99]. At the same time, the currently widespread methods of big data transformation do not consider human behavior, which adds uncertainty to the outcome of the process [100]. Extraction of patterns is typically done on historical data, rather than on real-time acquired data. Mining and transforming big data necessitates highly scalable strategies [101]. In order to achieve more effective processing, the literature suggests developing sophisticated data filtering and integration techniques, as well as using advanced parallel computing environments and more effective user involvement [14]. Researchers also observed that if the process of transforming data to knowledge is time consuming, then it may negatively influence the relevance of the extracted knowledge and its validity in the dynamic context, or it can even make the extracted knowledge invalid [102]. This issue has been addressed by many publications, but the contour of a general solution does not seem to be in formation. However, one issue that does not seem to be sufficiently addressed in the literature is providing meaning to data (automatically or semi-automatically). The importance of this issue originates in that it concerns, and may computationally influence, all data transformation steps. A hierarchy of concepts interlinked by the assumed relationships, and the axioms able to express the relationships of the concepts and to constrain their interpretation, are seen as ingredients of a possible solution [103]. In addition, only limited efforts have been made related to capturing the semantics of transformed data and to giving meaning to transformed data in context [104]. No sufficient attention was given to the relations between signifiers, such as words, phrases, signs and symbols, what they stand for, and what their denotations are [105]. It seems that there are multiple challenges related to the early preparatory activities of data analytics. One of them is data inundation, which may manifest as the major performance bottleneck for processing (cleaning, sorting, structuring, etc.) the output of increasingly complex sensor networks used to monitor product use and life cycle performance [106]. In addition to the amount of data generated by sensors, the signals and data generated by the end-user products themselves should also be accounted for. It was argued that data analytics will be significantly challenged by the necessity of combining sensor-generated (objective and aggregating) macro data with end-user-generated (subjective and finely granular) micro data to figure out their mutual meaning and impacts [107], [108]. Furthermore, associating quantitative (measured and factual) data with qualitative (social networking provided) data needs further attention. Data modeling approaches are often distinguished as exploratory and confirmatory approaches [109], [110]. Furthermore, a distinction is made between non-parametric statistical confirmatory data modeling and parametric statistical confirmatory data modeling [111].

4. Overview and analysis of data transformation tools and packages

4.1. Interpretation and mapping of data transformation means
The literature presents, discusses and compares many tools that have been developed to help understanding and processing data. The overwhelming majority of these tools are general purpose statistical tools [112]. Much fewer tools have been developed to assist the improvement of products and services [113], [114]. The general purpose software means are typically sorted into three categories: (i) single task orientated software tools [115], (ii) multi-task orientated integrated software packages (and toolboxes) [116], and (iii) multi-functional development environments [117]. The first category consists of software implementations of algorithms, procedures, or techniques to represent, enhance, analyze or transform input and output data. A second-level categorization of the single task orientated software tools is made based on the type of tasks they are intended to support. Typical representatives are business analysis tools, data visualization tools, trend analysis tools, etc., which are marketed by many vendors. The second category of systems includes (often modularized) software packages, which combine and interlink the functionalities of multiple software tools. The component tools typically share a common user interface and can exchange data among each other [118]. The statistical procedures offered by the integrated packages for exploring and predicting big data address the issues of big data processing such as source heterogeneity, noise accumulation, spurious correlations, and incidental endogeneity, in addition to balancing the statistical accuracy and computational efficiency [119]. The third category includes developer tools that are able to generate (partly or fully automated) algorithms, source codes, and executable programs, and to interlink these. The term 'environment' in the name of the category indicates that all developer tools and production servers are incorporated.

4.2. Major findings
There is a need for computational theories and tools to assist humans in servicing [120], and to extract useful information and knowledge from the rapidly growing volumes of digital data [121].
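The correlation analysis mentioned earlier (finding the degree of association, or the strength of the relationship, among data attributes) [65] reduces, in its simplest parametric form, to the Pearson coefficient. The sketch below is our own illustration; the usage-versus-faults figures are invented for demonstration and do not come from the reviewed studies.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: a standard measure of the
    strength (and direction) of the linear association between two
    equally long sequences of observations."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical MoL example: hours of product use vs. logged fault counts.
usage_hours = [100.0, 200.0, 300.0, 400.0, 500.0]
fault_count = [1.0, 2.0, 2.0, 4.0, 5.0]
r = pearson_r(usage_hours, fault_count)
```

A value of `r` close to 1 would suggest that fault occurrence rises almost linearly with use; note that the challenges listed above (missing values, temporal variation, unbalanced sampling [66]) all distort such a coefficient in practice.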


This is also confirmed by a study in 2017, explaining that the large volume of data makes it difficult for human beings to extract valuable knowledge from it without powerful tools [122]. Though the literature discusses many big data mining and analysis tools, the majority of them are still in infancy [123]. The traditional tools are able to capture, curate, analyze and visualize big data, but usually fail to fulfill the variety of needs, such as satisfying all experimental designs [124], finishing the processing in a reasonable time [125], functioning in parallel [126], and being scaled to large datasets [127]. The fact is that, as the number of existing commercial (C) and open source (OS) tools grows, the choice of the most appropriate one for a particular data analysis task is becoming increasingly difficult [128]. An apparent technological issue for traditional software tools is that the amount of data generated and stored at different sources grows rapidly, and their handling needs a sufficient level of automation [129]. In the lack of this, it is becoming hard to capture, store, manage, analyze, visualize and share mass data using typical tools [130]. Required are powerful and efficient tools that extract useful information from data, and that can cope with big data challenges [131]. It has been found in [54] that there are software tools that solely operate as information servers to data mining tools and do not have any analysis functionality. Others, as argued, "can be abused for data mining, but their intended use lies somewhere else" (like that of the software packages dedicated to pure statistical analysis, or MATLAB's Neural Network Toolbox) [132]. Some other software tools are advertised as data mining or knowledge discovery tools, but they only do reporting and visualization, such as Oracle Discoverer [133]. There are surveys and comparative studies in the literature that analyze the functional capabilities of data analytics tools and compare the performance of these tools [134], [135]. A survey done in 2016 reported that the top ten tools were: R, Python, Microsoft SQL Server, Microsoft Excel, RapidMiner, Hadoop, Spark, Tableau, KNIME, and Scikit-learn [136]. In Table 1 we sorted the tools, which we investigated in our study, according to the functionality they offer for data analytics. Obviously, in the case of any comparison of the tools, not only the functionalities and the data analytics tasks at hand, but also the user groups, the data structures, the processing methods, the import and export of data, the use of models, as well as the platforms and the licensing have to be taken into consideration. Other web resources showed that the choice of tools is also influenced by several pragmatic issues, such as the budget and the user experience. As an overall finding, we can argue that there is not a single tool, not even an integrated package, that could cover all needs and steps of data analytics, in particular not in the case of big data processing and applications [137]. It seems to be a generally accepted conclusion in the literature that no tool is better than the others [138] and that users can select the adequate data analytics software package only based on a critical analysis of the objectives and the application case [139]. It is worth noting that there are some quasi-data analytics tools which we have not considered to be relevant for our specialized study, for instance, in the area of engineering, tools like ThingWorx (PTC), Exalead (Dassault Systemes), and Omneo (Siemens).

Table 1: Investigated software tools

| Software tool       | License | Data transformation steps allowed by the software |
| ADaMSoft [146]      | OS | Data classification, data mining, data visualization |
| Analytica [147]     |    | Data visualization, simulation |
| BV4.1 [148]         | OS | Pattern detection, data visualization |
| CLUTO [149]         | OS | Data clustering |
| COMSOL [150]        | C  | Data modelling, simulation, data visualization |
| Dataiku [151]       | OS | Data visualization, data pre-processing, data modelling |
| DataMelt [152]      | OS | Data visualization, data pre-processing, data mining |
| FreeMat [153]       | OS | Data processing, data visualization |
| GNU Octave [154]    | OS | Data pre-processing, data visualization |
| JASP [155]          | OS | Data processing, pattern recognition, data visualization |
| KNIME [156]         | OS | Data cleaning, data classification, data visualization |
| MATLAB [157]        | C  | Data pre-processing, data mining, pattern evaluation, data visualization |
| MaxStat [158]       | C  | Statistical analysis of data |
| [159]               | C  | Data pre-processing, data mining, pattern evaluation, data visualization |
| OpenStat [160]      | OS | Simulation |
| Oracle [161]        | C  | Data preparation, data classification, data clustering, data mining |
| R [162]             | OS | Data pre-processing, data mining, pattern evaluation, data visualization |
| RapidMiner [163]    | C  | Data preparation, data mining, data visualization |
| SAS [164]           | C  | Statistical data analysis |
| [165]               | OS | Statistical data analysis |
| Shogun [166]        | OS | Data pre-processing, data mining, pattern evaluation, data visualization |
| SPSS Statistics [167] | C | Data pre-processing, data mining, pattern evaluation, data visualization (mainly statistical analysis) |
| [168]               | C  | Data visualization, pattern evaluation |
| WEKA [169]          | OS | Data classification, data clustering, data mining, attribute selection, data cleaning, data visualization |

4.3. Lessons learnt
A lot of work has been done to develop and enhance data analytics tools. It is emphasized that big data cannot be managed with traditional methodologies or data mining software tools [140], [141]. In general, they encounter great difficulties at handling heterogeneity, volume, and speed, as well as privacy and accuracy; they are inadequate for handling such characteristics [142]. Applying existing data mining algorithms and techniques to real-world problems raises many challenges due to the inadequate scalability and other limitations of these algorithms and techniques [143], [144]. Several authors confirm that there is a need for new computational theories and tools to assist humans in extracting useful information or knowledge from the rapidly growing volumes of digital data [145]. There are a multitude of tools which can help understanding and interpreting various application data, but which can also assist the improvement of products and services based on dedicated data transformation steps.


It is often mentioned that even the most sophisticated tools need human interaction [170]. Another open issue is that it is not clear how a data analytics system can deal with historic data as well as with real-time data at the same time, and what an optimal architecture of such a system is [171]. A less significant, but still important, issue is that it is difficult to find user-friendly visualizations in the case of large volume data [172].
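One commonly discussed way for a system to deal with historic and real-time data at the same time is to seed summary statistics from the stored batch and then fold the live stream into them incrementally. The sketch below is our own simplified illustration, not an architecture proposed in the cited works; `RunningStats` and the sample readings are hypothetical.

```python
class RunningStats:
    """Maintains a count and mean incrementally, so that real-time
    (streaming) observations can extend statistics computed from a
    historic batch without reprocessing the batch."""

    def __init__(self, count=0, mean=0.0):
        self.count = count
        self.mean = mean

    @classmethod
    def from_batch(cls, values):
        """Seed the statistics from an already stored (historic) dataset."""
        return cls(len(values), sum(values) / len(values))

    def update(self, value):
        """Fold one new real-time observation into the running mean."""
        self.count += 1
        self.mean += (value - self.mean) / self.count

# Historic MoL readings are summarized once...
stats = RunningStats.from_batch([10.0, 12.0, 14.0])
# ...and live readings are merged as they arrive, without re-reading history.
for reading in [16.0, 18.0]:
    stats.update(reading)
```

The same incremental-merge principle generalizes to variances, counts per category, and sketch-based summaries, which is why it recurs in discussions of combined batch/stream architectures.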

5. Overview and analysis of data analytics applications

Figure 5: The dominating trend of development of new generations of products

5.1. Interpretation and mapping of data analytics applications
Data driven applications have emerged in the last decades [173]. Due to its fast development, big data is rapidly expanding in all science and engineering domains, as well as in the physical, biological and biomedical sciences [84]. Research has provided important information for designing big data mining algorithms and systems [174]. For the reason that every discipline and application domain has a vested interest, big data became primordial for multidisciplinary problem solving [175]. In this case, the challenge is how it is possible to use data regardless of the application domain [176]. Despite the complexity of this issue, there is an optimistic atmosphere due to the serious efforts that scientists and developers are currently making related to what big data can offer. "Many big data applications will have unintended and unpredictable results as the data scientist seeks to reveal new trends and patterns that were previously hidden" [177]. In spite of the above efforts, the domains of engineering- and product-associated big data processing (BDP) are behind the overall progress because of the sheer fact of late recognition. Another issue is the rapid paradigmatic changes in the field due to the converging technologies and the embedding of software and cyber-ware in practically all products. These imply many changes that are already observable currently [178]. First of all, engineered products are becoming more and more multifunctional, technology intensive, network connected, data dependent, and customized/personalized [179]. Operation and maintenance process data can be tracked in real time over MoL by embedding an information device in the product itself [83]. Products with these characteristics are often referred to as advanced or sophisticated products. However, the largest paradigmatic change is that they are rapidly becoming knowledge-intensive and smartly operating, or even progressing towards some forms of intelligent operation (Figure 5) [180]. Therefore, in line with many other researchers, we considered advanced consumer durable products as a specific application domain of big data and data analytics [181]. In the next section, we will highlight the major findings related to the main application domains of big data management, including advantages, challenges and trends.

5.2. Major findings
Interestingly, there seems to exist a debate concerning the relevance of big data processing to all application domains or, in other words, whether BDP has equal importance in the various data-intensive application domains [176]. There are voices that big data processing exists only on paper, as a theoretical perception that cannot be put into practical applications. Others argue that BDP is still in a rather premature state; therefore, it still struggles with complex application challenges and cannot provide immediate benefits for practical applications [182]. The premature state is associated with both the lack of sophistication and dependability of the implementation technologies and the low level of elaboration of application methodologies relevant for the various application domains. On the other hand, there are already limitless examples demonstrating the benefits and advantageous changes brought into both professional application domains and our daily life. In the context of product development, BDP opened the opportunity of not only storing data concerning customers, but also analyzing large volumes of data about their behaviors and customs, which in turn allows the possibility of gaining competitive advantage [183]. Current typical big data applications are mainly related to processing sensitive information and data exchanges or transmissions [184], [185]. In electronic commerce, big data analysis was concluded to be primordial for the success of websites, since it facilitates building markets as well as increasing the customers' abilities to extract relevant information on the web [186]. In financial trading, BDP permits service companies (such as Google) to gain profit by making use of data [187]. BDP has also been proved beneficial in providing a multi-purpose data processing system to support financial transactions and services [188]. In society administration and government, extracting informative data and knowledge patterns provides the chance to improve productivity and to increase the level of effectiveness [93], as well as to forecast in advance and take actions in the case of natural damages [189]. In the field of health care, the online diagnosis repository is one of the successful applications of big data [190]. BDP also helps decreasing the variability of healthcare quality and augmenting its effectivity by data mining techniques (for instance, to determine the most effective treatments for different conditions) [191].

In the pharmaceutical industry, collecting big data can provide information, e.g. about the preferences for certain medicines and drugs [192]. In the case of telecommunication, big data mining has been applied to bring to light useful information about use trends and customs, and to detect telecommunication frauds [193]. Likewise, preprocessing big data has been confirmed to boost the plausibility and accuracy of forecasts [194].


In scientific research, many fields have become highly data-driven due to the development of computer science [195], for instance, astronomy [196], social computing [197], bioinformatics [198], and biology [199], which generate enormous data sets able to provide the basis for inquiry, or to drive the whole system design when analyzed [200]. In all of the mentioned applications, but also in others, significant challenges are to be considered related to system capabilities, algorithmic designs, and design models [201].

5.3. Lessons learnt
As argued in [84], "Driven by real-world applications … managing and mining big data have shown to be a challenging yet very complicated task". Practically independent of the field of application, one of the main challenges of BDP is exploring large volumes of data and extracting useful information or knowledge for future actions [202]. Notwithstanding, new applications are revealed and new approaches are proposed. One proliferating field of application is using BDP in strategic product development and life cycle engineering of products. In this particular field of application, rapid changes are predominant. We can witness the trend of intellectualization of products and services nowadays [203]. Intellectualization means equipping industrial and consumer products with digital connectivity and data communication, as well as with capabilities that mimic human-type intelligence. The first developments are supported by the Internet of Things technologies as an overall infrastructure. However, it has also been clarified that the Internet of Things enables connectivity and information exchange, rather than the implementation of the operational intelligence of products [204]. It is the role of cyber-physical systems and the various advanced forms of artificial (system) intelligence to provide sophisticated mechanisms for building situation and context awareness, reasoning and decision making, and adaptation to varying operational situations and objectives [205]. It has been observed that the term 'intelligence' has become popular both in the scientific literature and in the professional literature, though it begs for a more careful use in the context of artifacts and services. As for now, terms such as 'advanced', 'sophisticated', 'smart', 'autonomous', 'intelligent', etc. are used interchangeably, as well as distinguishingly, by various authors [206]. Until now, the concept of intelligent products has remained fuzzy and the use of the term is confusing [207]. Interestingly, even the scientific literature is divided in terms of using these terms to characterize the operation and/or behavior of artifacts and their interaction with humans and other artifacts (systems) [208]. It seems that there is a problem with the verbatim interpretation of the term 'intelligent', as well as with the relationships of intelligent products to knowledge acquisition and processing. This entails the need for further work considering the variety of application contexts. In addition, there is a need for a new classification of these products, simultaneously considering the achieved level of intelligence and the specific manifestations of these levels [209]. On the other hand, researchers active in various fields of intelligent products do agree that there is still a long way to go before different kinds of machines and systems will be able to intelligently communicate, reason and understand each other [210], [211]. Some of them believe that more is needed than that typically provided by ontologies and semantic web-related technologies to be able to produce truly 'intelligent' tools [212]–[214].

6. Discussion, conclusions and future work

6.1. Discussion
From a philosophical perspective, the whole of our inquiry was driven by pragmatism, the doctrine of which is featured by setting a concrete goal and acting purposefully towards achieving this goal. Pragmatism also meant that, rather than all pertinent publications, only those have been considered for reviewing which have a strong relevance and significance from the perspective of our ultimate research objective. Furthermore, the observed trends and the proposed theories and solutions were mainly evaluated in terms of the caused changes and their success in practical application. This lent itself to a reflexive review, which will be appended by a concise discussion of our prospective future research. Based on a statistical and relational study of the literature, we derived a reasoning model, which identified and brought four domains of knowledge (DoK) into an implicative interrelationship. The four domains are: (i) changes in the nature of data and their characteristics, (ii) approaches of data analytics-based transformations, (iii) data analytics algorithms, tools and packages, and (iv) representative application fields and practices of data analytics. A specific objective of our study was to synthesize knowledge for a fifth domain of interest, which is the contribution to the data analytics-based support of product enhancement and new product innovation. Actually, this objective created an application context for the whole of the explorative study. The findings (i.e. the specific pieces and chunks of knowledge obtained from various sources) were synthesized in this context. By bringing the mentioned five DoK into implicative relationships, the adopted reasoning model entailed a kind of natural streaming of knowledge, which in turn enabled building an intellectual platform for a new, sufficiently tailored supporting means. Current data analytics should deal with data that are largely different from those processed digitally some decades ago. The major difference is not only in the amount of data, but also in the complexification of data. On the one hand, this creates new challenges for data analytics but, on the other hand, it also creates opportunities for new value creation approaches. There seems to be a consensus in the literature concerning the fact that the complexity of big data cannot be properly addressed by the overwhelming majority of the existing (traditional) data processing methodologies and tools, and that exploiting the affordances of big data in various application contexts needs a stronger contextualization of the data transformation processes. The transformation techniques and tools are expected to support real-time processing of data, as well as the highest possible level of semantic interpretation of data. Time-dependent (and real-time) processing of complex data streams still raises many issues, in addition to the well-known issues of storing big data, fusion of heterogeneous multi-data sources, and visualization of big data. (It has to be mentioned that, due to the obvious space limitation in this paper, we could not discuss other key issues such as: (i) notional and epistemological articulation of the definitions of big data, (ii) development of big data management platforms, (iii) exploitation and service models of big data, (iv) distributed repositories for big data storage,


(v) big data virtualization platforms, (vi) deploying big data applications on cloud environments, and (vii) management of time-wise distributed big data and applications – just to mention the most significant ones.) The use of cloud computing methods and resources in capturing and processing big data is becoming a daily standard, and it is exploited in many areas of big data. This phenomenon is rapidly proliferating these days, since users are able to access data and data-processing tools from the cloud anywhere and anytime they are needed [215]. Actually, there are several existing data processing applications that need cloud environments, such as multi-media data management in a distributed manner. In turn, the need for efficient BDP also raises many new requirements for cloud computing, for instance: (i) resources for handling large-scale heterogeneity, (ii) effective and smart multimedia content retrieval, and (iii) transport and security protocols, and so on.

Large databases with large volumes of vaguely related data entities or complex data structures are also in the focus of research. The intention is to lessen data uncertainty, to increase understanding of meaning, and consequently to enhance the reliability of analysis and decision making. Since the amount of data grows irresistibly, data sub-sampling is becoming a means of resolving computational limitations. Though surrogating entire complex datasets helps overcome the real-time constraint, it introduces even greater uncertainty. It can be prognosticated that the efficiency and reliability of data mining and knowledge discovery will remain the major issues for advanced big data analytics. Processing algorithms and mechanisms should be based on new underpinning theories, which allow dealing with the volume, the distribution, the cognitive complicatedness, and the dynamically changing characteristics of big data. The timed characteristic of big data does not seem to be a significant obstacle, but the interplay among all aspects of big data does. Across all industries, big data is a new business asset, and advanced data analytics will help businesses become smarter, more productive, and better at making predictions. In the context of future product and service development, new sources of data such as social media will offer new opportunities for designers to get insights into consumers' purchasing preferences, decisions and behavior, and to uncover information in context that is not accessible through traditional product functionalities and lifecycle data management approaches. For wide, field-based data collection, designers may rely on Internet of Things principles and technologies, while for locally interpreting and reasoning with big data, they may rely on smart cyber-physical systems technologies. To put an end to the seemingly infinite problems of big data management and processing, some authors proposed specific solutions for a selection of tools as a way of coping with the complexity of big data in particular application domains. A purposeful regrouping of these solutions (touched upon earlier in this paper) reveals that the majority of authors are committed (if not attached) to real-time analysis of data and to developing powerful tools and better system architectures, so that companies making consumer durables can realize value by understanding their operations, customers, distributors, and the marketplace as a whole.

6.2. Conclusions
Our main conclusions have been formulated as follows:
• The completed literature study reinforced our observation that everything is changing and developing rapidly in the current daily practice of data analytics. The source phenomenon triggering the changes in the methods, tools and applications is the changing nature of digital data. This change manifests itself in the growing amount and increasing complexity of data, which challenges pattern-based information and knowledge mining.
• A significant diversification can be observed in the area of data transformation. A plethora of methods and techniques have been developed for systematic and controlled data aggregation, interactive visualization, structural and semantic interpretation, and trend analysis and prediction. However, many of these are general (mathematical and statistical) approaches that do not reflect the specific needs of particular applications.
• The diversification of methods and techniques is naturally followed by the diversification and articulation of data analytic tools and systems. Articulation is reflected by the fact that the commercialized enablers range from (i) specific-purpose (individual) tools, through (ii) multi-purpose (integrated) packages, to (iii) application-oriented toolboxes. Several interoperability and efficiency issues are reported in the literature.
• Design application of data analytics seems to be in a premature phase. As a combined effect of the proliferation of data analytics tools and Internet of Things connectivity, companies gradually recognize the opportunities and try to convert them into business benefits. However, neither comprehensive methodologies nor dedicated toolboxes seem to be available to facilitate their endeavor.
• There is a kind of paradoxical situation due to the large number of data analytics tools and their underexposed exploitation in product development and innovation. In other words, product managers, designers and developers need to be supported by proper data analytics enablers in order to extract new knowledge and achieve significant benefits by MoL product data processing [216]. It seems to be a pragmatic, but instrumental, strategy to combine the existing tools and packages into user-friendly and application-sensitive toolboxes.
• The title of the paper is: "What does data analytics offer for extracting knowledge from the middle-of-life product data?" This indicates the need to investigate data analytics in the context of processing MoLD, which are typically characterized by huge variety, dynamics, and multiplicity of relationships. Our study revealed that there are only a very limited number of papers that specifically address the issues of extracting knowledge from this kind of data, considering the above characteristics and the purpose of providing information about product usage and operations.
• The existing literature so far lacks a systematic and extensive analysis of the stakeholders within the MoL phase [217]. Analyzing MoLD is also a challenge and still in its infancy, since the collection of MoLD reveals several issues [10]. With regards to the tools available, we also claim that none has been developed specifically for processing MoLD.
• In order to be able to improve products and services, designers need to consider the application context and objective when transforming (raw) big data into creative knowledge. This transformation is also supposed to help them make proper decisions concerning the best enhancement opportunities.


However, the tools currently available on the market are not dedicated to the changing aspects of design tasks. In addition, the use of some of the tools is complicated because of the necessity of having a minimum of knowledge and skills to write and employ algorithms for data analysis. Furthermore, a critical issue from the perspective of designers is the lack of data integration and of abstraction towards semantic interpretation of the outputs, without which it is difficult to put the results into a specific design context.
• A smart data processing system that processes real-time data streams to support the operator is becoming a necessary tool to handle the vast amount of data generated by online instruments. It will provide the benefit of pre-filtering useful data to be transferred from the remote monitoring system and stored for reference purposes. Much of this data is stored but never accessed, and the potential capability of expensive instruments often goes unrealized due to a lack of understanding and the difficulty of extracting useful information from massive databases [218].

6.3. Future work
Our on-going and future work concentrates on the development of a novel 'designerly' data analytics toolbox for product designers of a family of data-intensive products. Based on our forerunning study, we hypothesized that the currently available data analytics tools miss semantic fusion with regards to their output data and suffer from a lack of interpretation of data constructs in various design contexts. We are conducting a two-stage study to understand what designers would expect from a 'designerly' data analytics toolbox, and what sort of smart functionalities they would assume to be provided by a sophisticated (smart) toolbox. On the basis of these studies, we are conceptualizing a functional, architectural and information processing framework for the target toolbox. We assumed that, instead of developing a set of new data analytics tools, it is preferable to regroup and compile existing ones into a toolbox proper for an application-dependent set of data transformation actions and able to store, access and analyze data in (near) real-time, to allow the extraction of knowledge beneficial for companies' product managers and developers. The functionality of the toolbox will go beyond the typical analytics functionality of commercialized integrated packages. It will offer purposefully correlated output data structures, which will allow interpretation of the meanings and relationships of the outcomes of multiple software tools not only on the syntactic level, but also on the pragmatic and semantic levels. In addition, the toolbox will feature an open architecture that makes it possible to include or exclude various tools, and will be equipped with a reasoning capability that will provide meaningful insights and knowledge for product developers. The general principle of information fusion by the toolbox is shown in Figure 6. The functional/architectural framework of the toolbox reflects a stratified structure, as shown in this figure. The endeavor is to achieve the highest possible level of fusion and semantic processing of output data, rather than just integrating analytics tools on the data exchange level. This is underpinned by the needs of product developers, who are not data analytics experts. They need not only convenient data transfer among the various software tools and databases, but also well-constructed and visually interpretable data structures, as well as explicit developer knowledge that can be used in multiple enhancement contexts. A major technical issue is the implementation of data fusion on multiple levels. The associated challenge is to elicit the output data structures and to give meaning to them according to the specific application domain, context and tasks. In our case it concretely means that the toolbox should provide insights for developers and make them able to transform use patterns into product enhancement information and knowledge based on the outcome of the conducted analyses.

[Figure 6: Novel system features of the proposed toolbox approach. The figure shows a stratified structure: data analytics tools (syntactic level) → outcomes of the tools (pragmatic level) → semantically correlated data structure (semantic level) → knowledge level.]

From an operational point of view, the toolbox will take care of three different types of functions, namely: (i) kernel functions (for data transformations), (ii) auxiliary functions (for data management), and (iii) interface functions (connecting end-users, developers, and systems). The development of the toolbox will happen in two phases. The reason is that, according to our experiences, an incremental (or step-by-step) innovation is more acceptable and straightforward for companies than a disruptive innovation. The first phase focuses on the development of an initial version of the toolbox for integral syntactic data analytics, which is referred to as the 'conventional data analytics toolbox' (C-DAT). Typical classes of syntactic data mining tasks are: (i) clustering analysis, (ii) anomaly/outlier detection, (iii) association rule learning, (iv) classification analysis, and (v) …. C-DAT includes an arsenal of existing tools, launched as open source or marketed as commercial tools. The software tools interoperating in the various modules share a common user interface, exchange data with each other, and can be adapted easily according to the user and application needs. They aggregate, represent, maintain, analyze or enhance MoLD entered as inputs, and provide insights on how products are used by different customers and in different cultures, inform on the availability of the product, cast light on emerged failures, etc. This will facilitate decision making concerning what and how to improve, and for which customers. In the second phase of development, C-DAT will be augmented with various smart reasoning components enabling semantic data analytics, reasoning, and advising. This advanced version is called the 'smart data analytics toolbox' (S-DAT). The smart kernel functionality extends the developed (second-generation) toolbox with: (vi) historic operation/use situation and change analysis, (vii) multi-source output data fusion, (viii) simulation-based prediction and forecasting, (ix) visually-based feature learning, (x) event occurrence monitoring ('watchdogging'), (xi) semantic association graph building, and (xii) context-based advising. The modules of the smart part are purposefully developed semantic data modeling and reasoning tools. This smart extension will be developed to keep up with next generations of products, which are equipped with the capability of: (i) aggregating and generating MoLD in real-time, (ii) communicating about their objectives and states, (iii) building awareness and reasoning about their operation states and objectives, and (iv) adapting themselves towards an optimum performance.
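The syntactic data mining task classes enumerated for C-DAT can be made concrete with a minimal sketch. The snippet below is only an illustration under assumed conditions, not part of C-DAT: the function names, the z-score threshold, the gap parameter, and the hard-coded temperature log are hypothetical stand-ins for real MoLD streams.

```python
# Illustrative sketch of two syntactic data mining tasks (anomaly/outlier
# detection and clustering analysis) on hypothetical MoLD sensor readings.
from statistics import mean, stdev

def detect_outliers(readings, z_threshold=2.0):
    """Anomaly detection: flag readings whose z-score exceeds the threshold."""
    mu, sigma = mean(readings), stdev(readings)
    return [x for x in readings if abs(x - mu) / sigma > z_threshold]

def cluster_1d(readings, gap=5.0):
    """Clustering analysis: group sorted readings into clusters separated
    by gaps larger than `gap` (a deliberately simple 1-D scheme)."""
    ordered = sorted(readings)
    clusters, current = [], [ordered[0]]
    for x in ordered[1:]:
        if x - current[-1] <= gap:
            current.append(x)
        else:
            clusters.append(current)
            current = [x]
    clusters.append(current)
    return clusters

# Hypothetical MoLD: operating temperatures reported by a product in the field.
logs = [21.0, 21.5, 22.0, 21.8, 45.0, 44.5, 21.2, 90.0]
print(detect_outliers(logs))   # only the 90.0 reading exceeds the threshold
print(len(cluster_1d(logs)))   # idle, heavy-use, and fault regimes emerge
```

In a real toolbox such primitives would of course be replaced by mature library implementations; the point is that their inputs and outputs remain purely syntactic, which is exactly the limitation the semantic extension is meant to overcome.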

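The stratified information fusion of Figure 6 can likewise be sketched in a few lines: the outcomes of individual tools (syntactic level) are first correlated per product into one structure (pragmatic level) and then interpreted by developer-supplied domain rules (semantic level). All identifiers, tool names, and rules below are hypothetical illustrations, not the S-DAT design.

```python
# Illustrative sketch of multi-level output data fusion: syntactic tool
# outcomes -> correlated per-product structure -> semantic interpretation.
def correlate(outcomes):
    """Pragmatic level: merge (tool, product, finding) outcomes per product."""
    fused = {}
    for tool, product, finding in outcomes:
        fused.setdefault(product, {})[tool] = finding
    return fused

def interpret(fused, rules):
    """Semantic level: apply (predicate, meaning) domain rules to the
    correlated structure to yield design-relevant statements."""
    return {product: [meaning for predicate, meaning in rules
                      if predicate(findings)]
            for product, findings in fused.items()}

# Syntactic level: raw outcomes of two hypothetical analytics tools.
outcomes = [
    ("outlier_detector", "kettle-042", "temperature spike"),
    ("usage_clusterer", "kettle-042", "heavy-use cluster"),
    ("usage_clusterer", "kettle-007", "normal-use cluster"),
]
# A hypothetical rule a product developer might register in the toolbox.
rules = [
    (lambda f: "outlier_detector" in f and "usage_clusterer" in f,
     "possible overload failure mode under heavy use"),
]
print(interpret(correlate(outcomes), rules))
# kettle-042 is flagged with a semantic finding; kettle-007 yields none
```

The design choice mirrors the stratification argued for above: the pragmatic structure is tool-agnostic, so new analytics tools can be plugged in without touching the semantic rules.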

Based on these capabilities, the products themselves can be considered as an external part of the smart toolbox. On the other hand, the considered elements of smartness are supposed to allow designers to make the products more autonomous and to carry out a part of the data analytics quasi-automatically. This gives the opportunity to use multiple products as interacting agents contributing to multi-aspect enhancements. This also means that designers will use the toolbox in the case of large-scale, multi-source dependent data processing. Our research will also consider sharing smart data analytics functions between the concerned products themselves and the product data analytics environment. The reason for this is that products are becoming equipped with more and more smart capabilities, which enable them to gather and process data by themselves at run time and to self-adapt according to the operational conditions and altering objectives. This means that products can take over a part of the functionality of a data analytics toolbox. This, however, is a new research phenomenon and challenge that needs further extensive studies. Putting everything together, research into and development of a smart toolbox able to assist the anticipation of real-life use patterns and decision-making about product enhancement is relevant not only for the scientific community, but also for several segments of the making industry.

References
[1] Z. Bi, and D. Cochran, "Big data analytics with applications," Journal of Management Analytics, 1(4), pp. 249-265, 2014.
[2] K. Bodenhoefer, A. Schneider, T. Cock, A. Brooks, G. Sands, L. Allman, and O. Delannoy, "Environmental life cycle information management and acquisition – First experiences and results from field trials," In Proceedings of Electronics Goes Green, pp. 5-8, 2004.
[3] H.-B. Jun, D. Kiritsis, and P. Xirouchakis, "Research issues on closed-loop PLM," Computers in Industry, 58(8), pp. 855-868, 2007.
[4] Falcon Project, "Feedback mechanisms Across the Lifecycle for Customer-driven Optimization of iNnovative product-services", available at: http://www.falcon-h2020.eu, 2015.
[5] M. Franke, P. Klein, L. Schröder, and K.-D. Thoben, "Ontological Semantics of Standards and PLM Repositories in the Product Development Phase," Global Product Development, pp. 473-482, 2011.
[6] A. Saaksvuori, and A. Immonen, Product Lifecycle Management (2nd edition), Springer, 2005.
[7] K. Bongard-Blanchy, and C. Bouchard, "Dimensions of User Experience from the Product Design Perspective," Journal d'Interaction Personne-Système, 3(1), pp. 1-15, 2014.
[8] A. Katal, M. Wazid, and R. Goudar, "Big Data: Issues, Challenges, Tools and Good Practices," In Proceedings of the 6th International Conference on Contemporary Computing (IC3'13), pp. 404-409, 2013.
[9] S. Terzi, A. Bouras, D. Dutta, M. Garetti, and D. Kiritsis, "Product Lifecycle Management from its History to its new Role," International Journal of Product Lifecycle Management, 4(4), pp. 360-389, 2010.
[10] W.F. van der Vegte, "Taking Advantage of Data Generated by Products: Trends, Opportunities and Challenges," In Proceedings of the ASME Design Engineering Technical Conferences & Computers and Information in Engineering Conference IDETC/CIE, pp. V01BT02A025, 2016.
[11] H. Shu, "Big Data Analytics: Six Techniques," Geo-spatial Information Science, pp. 1-10, 2016.
[12] T.H. Davenport, and L. Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business Press, Boston, 1998.
[13] A. Bufardi, D. Kiritsis, and P. Xirouchakis, "Generation of Design Knowledge from Product Life Cycle Data," Methods and Tools for Effective Knowledge Life-Cycle-Management, pp. 375-389, 2008.
[14] D. Che, M. Safran, and Z. Peng, "From Big Data to Big Data Mining: Challenges, Issues, and Opportunities," In Proceedings of the International Conference on Database Systems for Advanced Applications, pp. 1-15, 2013.
[15] V. Mayer-Schönberger, and K. Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Houghton Mifflin Harcourt, 2013.
[16] F. Xhafa, and L. Barolli, "Semantics, Intelligent Processing and Services for Big Data," Future Generation Computer Systems, 37, pp. 201-202, 2014.
[17] D. Kiritsis, "Ubiquitous Product Lifecycle Management Using Product Embedded Information Devices," In Proceedings of the International Conference in Intelligent Maintenance Systems (IMS'2004), x-x, 2004.
[18] J. Guttag, "Abstract Data Types and the Development of Data Structures," Communications of the ACM, 20(6), pp. 396-404, 1977.
[19] M.L. Brodie, "On the development of data models," On Conceptual Modelling, pp. 19-47, 1984.
[20] C. Ballard, D. Herreman, D. Schau, R. Bell, E. Kim, and A. Valencic, Data Modeling Techniques for Data Warehousing, IBM Corporation International Technical Support Organization, 1998.


[21] S.R. Gunn, M. Brown, and K.M. Bossley, "Network Performance Assessment for Neurofuzzy Data Modeling," Lecture Notes in Computer Science, 1280, pp. 313-323, 1997.
[22] N. Jia, M. Xie, and X. Chai, "Development and Implementation of a GIS-based Safety Monitoring System for Hydropower Station Construction," Journal of Computing in Civil Engineering, 26(1), pp. 44-53, 2011.
[23] N. Jukic, "Modeling strategies and alternatives for data warehousing projects," Communications of the ACM, 49(4), pp. 83-88, 2006.
[24] M.F. Worboys, H.M. Hearnshaw, and D.J. Maguire, "Object-oriented Data Modelling for Spatial Databases," International Journal of Geographical Information System, 4(4), pp. 369-383, 1990.
[25] M. Molenaar, "A Syntactic Approach for Handling the Semantics of Fuzzy Spatial Objects," Geographic Objects with Indeterminate Boundaries, 2, pp. 207-224, 1996.
[26] P. Shoval, and S. Shiran, "Entity-relationship and Object-oriented Data Modeling - An Experimental Comparison of Design Quality," Data & Knowledge Engineering, 21(3), pp. 297-315, 1997.
[27] C. Coronel, and S. Morris, Database Systems: Design, Implementation & Management, Cengage Learning, Boston, 2016.
[28] G.V. Post, Database Management Systems: Designing and Building Business Applications, McGraw-Hill, Boston, 1999.
[29] P. Andreeva, "Data Modelling and Specific Rule Generation Via Data Mining Techniques," In Proceedings of the International Conference on Computer Systems and Technologies (CompSysTech), pp. IIIA.17-1–IIIA.17-6, 2006.
[30] D.J. Henderson, R.J. Carroll, and Q. Li, "Nonparametric Estimation and Testing of Fixed Effects Panel Data Models," Journal of Econometrics, 144(1), pp. 257-275, 2008.
[31] S. Madden, "From Databases to Big Data," IEEE Internet Computing, 16(3), pp. 4-6, 2012.
[32] D. Dietrich, Data Science & Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, Wiley, 2015.
[33] H.-P. Kriegel, K.M. Borgwardt, P. Kröger, A. Pryakhin, M. Schubert, and A. Zimek, "Future trends in data mining," Data Mining and Knowledge Discovery, 15(1), pp. 87-97, 2007.
[34] D. Laney, 3D Data Management: Controlling Data Volume, Velocity, and Variety, Technical Report, http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-andVariety.pdf, 2015.
[35] P. Russom, "Big Data Analytics," TDWI Best Practices Report, fourth quarter, pp. 1-35, 2011.
[36] M.M. Fouad, N.E. Oweis, T. Gaber, M. Ahmed, and V. Snasel, "Data Mining and Fusion Techniques for WSNs as a Source of the Big Data," Procedia Computer Science, 65, pp. 778-786, 2015.
[37] H. Rahman, S. Begum, and M.U. Ahmed, "Ins and Outs of Big Data: A Review," In Proceedings of the International Conference on IoT Technologies for HealthCare, pp. 44-51, 2016.
[38] A.J. Jara, D. Genoud, and Y. Bocchi, "Big Data for Cyber Physical Systems: An Analysis of Challenges, Solutions and Opportunities," In Proceedings of the 8th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS), pp. 376-380, 2014.
[39] M.D. Assunção, R.N. Calheiros, S. Bianchi, M.A. Netto, and R. Buyya, "Big Data Computing and Clouds: Trends and Future Directions," Journal of Parallel and Distributed Computing, 79, pp. 3-15, 2015.
[40] P.S. Yu, "On mining big data," Web-Age Information Management, 7923, XIV, 2013.
[41] J. Cassina, Extended Product Lifecycle Management, PhD thesis, Politecnico di Milano, 2008.
[42] M. Madhikermi, A. Buda, B. Dave, and K. Främling, "Key Data Quality Pitfalls for Condition based Maintenance," In Proceedings of the 2nd International Conference on System Reliability and Safety (ICSRS), pp. 474-480, 2017.
[43] D. Vladimirova, S. Evans, V. Martinez, and J. Kingston, "Elements of Change in the Transformation towards Product Service Systems," Functional Thinking for Value Creation, pp. 21-26, 2011.
[44] A. Ericson, P. Müller, T. Larsson, and R. Stark, "Product-service Systems from Customer Needs to Requirements in Early Development Phases," In Proceedings of the 1st CIRP Industrial Product Service Systems Conference, pp. 62-67, 2009.
[45] J.H. Shin, H.B. Jun, C. Catteneo, D. Kiritsis, and P. Xirouchakis, "Degradation Mode and Criticality Analysis based on Product Usage Data," The International Journal of Advanced Manufacturing Technology, 78(9-12), pp. 1727-1742, 2015.
[46] J.S. Ward, and A. Barker, "Undefined by Data: A Survey of Big Data Definitions", available at https://arxiv.org/abs/1309.5821, 2013.


[47] D. Bollier, and C.M. Firestone, The Promise and Peril of Big Data, Aspen Institute, Communications and Society Program, Washington, DC, 2010.
[48] J. Gantz, and D. Reinsel, "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East," IDC iView: IDC Analyze the Future, vol. 2007, pp. 1-16, 2012.
[49] M. Shah, "Big Data and the Internet of Things," Big Data Analysis: New Algorithms for a New Society, Springer, pp. 207-237, 2016.
[50] E.A. Mohammed, B.H. Far, and C. Naugler, "Applications of the MapReduce Programming Framework to Clinical Big Data Analysis: Current Landscape and Future Trends," BioData Mining, 7(22), pp. 1-23, 2014.
[51] S.H. Nguyen, A. Skowron, and P. Synak, "Discovery of Data Patterns with Applications to Decomposition and Classification Problems," Rough Sets in Knowledge Discovery 2, Springer, pp. 55-97, 1998.
[52] L. Silverston, and P. Agnew, Universal Patterns for Data Modeling, The Data Model Resource Book, John Wiley & Sons, 2009.
[53] M. West, Developing High Quality Data Models, Elsevier, 2011.
[54] M. Goebel, and L. Gruenwald, "A Survey of Data Mining and Knowledge Discovery Software Tools," ACM SIGKDD Explorations Newsletter, 1(1), pp. 20-33, 1999.
[55] S.-J. Lee, and E.-N. Huh, "Shear-based Spatial Transformation to Protect Proximity Attack in Outsourced Database," In Proceedings of the 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 1633-1638, 2013.
[56] E. Rahm, and H.H. Do, "Data Cleaning: Problems and Current Approaches," IEEE Data Engineering Bulletin, 23(4), pp. 3-13, 2000.
[57] L. Berti-Equille, T. Dasu, and D. Srivastava, "Discovery of Complex Glitch Patterns: A Novel Approach to Quantitative Data Cleaning," In Proceedings of the 27th International Conference on Data Engineering, pp. 733-744, 2011.
[58] K. Czarnecki, and S. Helsen, "Classification of Model Transformation Approaches," In Proceedings of the 2nd OOPSLA Workshop on Generative Techniques in the Context of the Model Driven Architecture, Anaheim, 2003.
[59] S. Sharma, and S. Srinivasan, "Issues Related with Software Conversion Projects," International Journal of Electrical Electronics & Computer Science Engineering, 1(4), pp. 10-13, 2014.
[60] E.A. King, "How to Buy Data Mining: A Framework for Avoiding Costly Project Pitfalls in Predictive Analytics," Information Management, 15(10), pp. 38-44, 2005.
[61] I.H. Witten, E. Frank, M.A. Hall, and C.J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
[62] A. Berson, and S.J. Smith, Data Warehousing, Data Mining, and OLAP, McGraw-Hill, Inc., New York, 1997.
[63] P. Zikopoulos, and C. Eaton, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill Osborne Media, 2011.
[64] N. Marz, and J. Warren, Big Data: Principles and Best Practices of Scalable Realtime Data Systems, Manning Publications Co., 2015.
[65] J. Cohen, P. Cohen, S.G. West, and L.S. Aiken, Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, Psychology Press, Routledge, 2013.
[66] W. Lin, M.A. Orgun, and G.J. Williams, "An Overview of Temporal Data Mining," In Proceedings of the 1st Australian Data Mining Workshop, pp. 83-90, 2002.
[67] T.H. Davenport, J.G. Harris, and R. Morison, Analytics at Work: Smarter Decisions, Better Results, Harvard Business Press, Boston, MA, 2010.
[68] A. Trnka, "Big Data Analysis," European Journal of Science and Theology, 10(1), pp. 143-148, 2014.
[69] C. Barras, S. Meignier, and J.-L. Gauvain, "Unsupervised Online Adaptation for Speaker Verification over the Telephone," In Proceedings of the IEEE Odyssey, ISCA Speaker Recognition Workshop, pp. 157-160, 2004.
[70] A. Dowgiert, The Impact of Big Data on Traditional Health Information Management and EHR, Doctoral Dissertation, The College of St. Scholastica, 2014.
[71] E. Fotopoulou, A. Zafeiropoulos, D. Papaspyros, P. Hasapis, G. Tsiolis, T. Bouras, and N. Zanetti, "Linked Data Analytics in Interdisciplinary Studies: The Health Impact of Air Pollution in Urban Areas," IEEE Access, 4, pp. 149-164, 2016.
[72] S. Zhang, C. Zhang, and Q. Yang, "Data Preparation for Data Mining," Applied Artificial Intelligence, 17(5-6), pp. 375-381, 2003.
[73] A.D. Chapman, "Principles and Methods of Data Cleaning," Report for the Global Biodiversity Information Facility, Copenhagen, pp. 1-72, 2005.
[74] J. Han, J. Pei, and M. Kamber, Data Mining: Concepts and Techniques, Elsevier, 2011.


[75] K. Paunović, "Data Collecting," Encyclopedia of Public Health, Springer, pp. 196-199, 2008.
[76] R.T. Ng, and J. Han, "CLARANS: A Method for Clustering Objects for Spatial Data Mining," IEEE Transactions on Knowledge and Data Engineering, 14(5), pp. 1003-1016, 2002.
[77] A.K. Jain, M.N. Murty, and P.J. Flynn, "Data Clustering: A Review," ACM Computing Surveys, 31(3), pp. 264-323, 1999.
[78] J. Han, and N. Cercone, "RuleViz: A Model for Visualizing Knowledge Discovery Process," In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 244-253, 2000.
[79] A. Wnuk, D. Gozdowski, A. Górny, Z. Wyszyński, and M. Kozak, "Data Visualization in Yield Component Analysis: An Expert Study," Scientia Agricola, 74(2), pp. 118-126, 2017.
[80] T. Imamura, S. Yamada, and M. Machida, "Development of a High Performance Eigensolver on the Petascale Next Generation Supercomputer System," In Proceedings of the Joint International Conference on Supercomputing in Nuclear Applications and Monte Carlo 2010 (SNA+MC2010), pp. 643-650, 2011.
[81] L. Geng, and H.J. Hamilton, "Interestingness Measures for Data Mining: A Survey," ACM Computing Surveys (CSUR), https://www.researchgate.net/profile/Howard_Hamilton/publication/220566216_Interestingness_Measures_for_Data_Mining_A_Survey/links/00463517ef3e1d2ce8000000/Interestingness-Measures-for-Data-Mining-A-Survey.pdf, 2006.
[82] J. Chattratichat, J. Darlington, M. Ghanem, Y. Guo, H. Hüning, M. Köhler, and D. Yang, "Large Scale Data Mining: Challenges and Responses," In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, pp. 61-64, 1997.
[83] S. Ren, and X. Zhao, "A Predictive Maintenance Method for Products based on Big Data Analysis," In Proceedings of the International Conference on Materials Engineering and Information Technology Applications (MEITA 2015), pp. 385-390, 2015.
[84] X. Wu, X. Zhu, G.-Q. Wu, and W. Ding, "Data Mining with Big Data," IEEE Transactions on Knowledge and Data Engineering, 26(1), pp. 97-107, 2014.
[85] W. Klösgen, "Knowledge Discovery in Databases and Data Mining," Foundations of Intelligent Systems, pp. 623-632, 1996.
[86] H. Frigui, "Membershipmap: Data Transformation based on Granulation and Fuzzy Membership Aggregation," IEEE Transactions on Fuzzy Systems, 14(6), pp. 885-896, 2006.
[87] Z. Blivband, P. Grabov, and O. Nakar, "Expanded FMEA (eFMEA)," In Proceedings of the Annual Symposium Reliability and Maintainability, pp. 31-36, 2004.
[88] J. Shafer, R. Agrawal, and M. Mehta, "SPRINT: A Scalable Parallel Classifier for Data Mining," In Proceedings of the 22nd International Conference on Very Large Databases, pp. 544-555, 1996.
[89] D. Luo, C. Ding, and H. Huang, "Parallelization with Multiplicative Algorithms for Big Data Mining," In Proceedings of the IEEE 12th International Conference on Data Mining, pp. 489-498, 2012.
[90] R. Chen, K. Sivakumar, and H. Kargupta, "Collective Mining of Bayesian Networks from Distributed Heterogeneous Data," Knowledge and Information Systems, 6(2), pp. 164-187, 2004.
[91] P. Williams, C.R. Margules, and D.W. Hilbert, "Data Requirements and Data Sources for Biodiversity Priority Area Selection," Journal of Biosciences, 27(4), pp. 327-338, 2002.
[92] O. Maimon, and L. Rokach, "Decomposition Methodology for Knowledge Discovery and Data Mining," Data Mining and Knowledge Discovery Handbook, 2, pp. 981-1003, 2005.
[93] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, and A.H. Byers, Big Data: The Next Frontier for Innovation, Competition, and Productivity, McKinsey Global Institute, 2011.
[94] L. Duan, and Y. Xiong, "Big Data Analytics and Business Analytics," Journal of Management Analytics, 2(1), pp. 1-21, 2015.
[95] S. Lohr, "The Age of Big Data", New York Times, http://www.nytimes.com/2012/02/12/sunday-review/bigdatas-impact-in-the-world.html, 2012.
[96] E. Begoli, and J. Horey, "Design Principles for Effective Knowledge Discovery from Big Data," In Proceedings of the 2012 Joint Working IEEE/IFIP Conference on Software Architecture and European Conference on Software Architecture, pp. 215-218, 2012.
[97] M. Chen, S. Mao, Y. Zhang, and V.C. Leung, "Big data analysis," Big Data, pp. 51-58, 2014.
[98] L.A. Kurgan, and P. Musilek, "A Survey of Knowledge Discovery and Data Mining Process Models," The Knowledge Engineering Review, 21(1), pp. 1-24, 2006.
[99] B. Padmanabhan, and A. Tuzhilin, "A Belief-driven Method for Discovering Unexpected Patterns," In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, pp. 94-100, 1998.


[100] S. Munir, J.A. Stankovic, C.-J. M. Liang, and S. Lin, "Reducing Energy Waste for Computers by Human-in-the-loop Control," IEEE Transactions on Emerging Topics in Computing, 2(4), pp. 448-460, 2014.
[101] M.J. Kane, J. Emerson, and S. Weston, "Scalable Strategies for Computing with Massive Data," Journal of Statistical Software, 55(14), pp. 1-19, 2013.
[102] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, "The KDD Process for Extracting Useful Knowledge from Volumes of Data," Communications of the ACM, 39(11), pp. 27-34, 1996.
[103] N. Guarino, Formal Ontology in Information Systems: Proceedings of the 1st International Conference (FOIS'98), IOS Press, Trento, 1998.
[104] D.T. Tempelaar, B. Rienties, and B. Giesbers, "In Search for the Most Informative Data for Feedback Generation: Learning Analytics in a Data-rich Context," Computers in Human Behavior, 47, pp. 157-167, 2015.
[105] Semantics, available at: http://www.historygraphicdesign.com/the-age-of-information/the-international-typographic-style/810-semantics, 2016.
[106] S. Ramsay, "Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism," Literary and Linguistic Computing, 18(2), pp. 167-174, 2003.
[107] H.-J. Lenz, "A Rigorous Treatment of Microdata, Macrodata, and Metadata," In Proceedings of the Computational Statistics Conference, pp. 357-362, 1994.
[108] S. Berry, J. Levinsohn, and A. Pakes, "Differentiated Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market," Journal of Political Economy, 112(1), pp. 68-105, 2004.
[109] I. Icke, and A. Rosenberg, "Multi-objective Genetic Programming Projection Pursuit for Exploratory Data Modeling," In Proceedings of the 5th Annual Machine Learning Symposium, available at: https://arxiv.org/pdf/1010.1888.pdf, 2010.
[110] M. Shehada, and F. Alkhaldi, "Measuring the Efficiency and Effectiveness of the Human Resources Training Function at Orange Jordan," International Journal of Quantitative and Qualitative Research Method, 3(2), pp. 1-12, 2015.
[111] E.
[112] …, Trends in Ecology & Evolution, 27(2), pp. 85-93, 2012.
[113] D. Gorissen, K. Crombecq, W. Hendrickx, and T. Dhaene, "Grid Enabled Metamodeling," In Proceedings of the 7th International Meeting on High Performance Computing for Computational Science, available at http://sumo.intec.ugent.be/sites/sumo/files/sumo/2006_07__VECPAR.pdf, 2006.
[114] D. Gorissen, K. Crombecq, I. Couckuyt, and T. Dhaene, "Automatic Approximation of Expensive Functions with Active Learning," Foundations of Computational Intelligence, Learning and Approximation, 1, pp. 35-62, 2009.
[115] Y. Nemchinova, and H. Sayani, "Using Tools in Teaching University Courses in Information Technology," In Proceedings of the 7th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD'06), pp. 361-367, 2006.
[116] S.E. Lenzi, C.E. Standoli, G. Andreoni, P. Perego, and N.F. Lopomo, "A Software Toolbox to Improve Time-Efficiency and Reliability of an Observational Risk Assessment Method," In Proceedings of the Congress of the International Ergonomics Association, pp. 689-708, 2018.
[117] H. Murakoshi, M. Kishi, and K. Ochimizu, "Developing Web-based On-demand Learning System," In Proceedings of the International Conference on Computers in Education, pp. 1223-1227, 2002.
[118] J.A. Sain, "ESD/MITRE Software Acquisition Symposium Proceedings; an ESD/Industry Dialogue held in Bedford," available at http://www.dtic.mil/docs/citations/ADA178785, 1986.
[119] J. Fan, F. Han, and H. Liu, "Challenges of Big Data Analysis," National Science Review, 1(2), pp. 293-314, 2014.
[120] R. Roy, E. Shehab, A. Tiwari, T. Baines, H. Lightfoot, O. Benedettini, and J. Kay, "The Servitization of Manufacturing: A Review of Literature and Reflection on Future Challenges," Journal of Manufacturing Technology Management, 20(5), pp. 547-567, 2009.
[121] A. Oussous, F.Z. Benjelloun, A.A. Lahcen, and S. Belfkih, "Big Data Technologies: A Survey," Journal of King Saud University-Computer and Information Sciences, 30(4), pp. 431-448, 2018.

[106] S. Ramsay, “Special Section: Reconceiving Text [117] H. Murakoshi, M. Kishi, and K. Ochimizu, Analysis: Toward an Algorithmic Criticism,” Literary “Developing Web-based On-demand Learning and Linguistic Computing, 18(2), pp. 167-174, 2003. System,” In Proceedings of the International Conference on Computers in Education, pp. 1223- [107] H.-J. Lenz, “A Rigorous Treatment of Microdata, 1227, 2002. Macrodata, and Metadata,” In Proceedings of the Computational Statistics Conference, pp. 357-362, [118] J.A. Sain, “ESD/MITRE Software Acquisition 1994. Symposium Proceedings; an ESD/Industry Dialogue held in Bedford,” available at [108] S. Berry, J. Levinsohn, and A. Pakes, “Differentiated http://www.dtic.mil/docs/citations/ADA178785,1986 Products Demand Systems from a Combination of Micro and Macro Data: The New Car Market,” [119] J. Fan, F. Han, and H. Liu, “Challenges of Big Data Journal of Political Economy, 112(1), 68-105, 2004. Analysis,” National Science Review, 1(2), pp. 293- 314, 2014. [109] I. Icke, and A. Rosenberg, “Multi-objective Genetic Programming Projection Pursuit for Exploratory Data [120] R. Roy, E. Shehab, A. Tiwari, T. Baines, H. Modeling,” In Proceedings of the 5th Annual Machine Lightfoot, O. Benedettini, and J. Kay, “The Learning Symposium, available at: Servitization of Manufacturing: A Review of https://arxiv.org/pdf/1010.1888.pdf, 2010. Literature and Reflection on Future Challenges. Journal of Manufacturing Technology Management, [110] M. Shehada, and F. Alkhaldi, “Measuring the 20(5), pp. 547-567, 2009. Efficiency and Effectiveness of the Human Resources Training Function at Orange Jordan,” International [121] A. Oussous, F.Z. Benjelloun, A.A. Lahcen, and S. Journal of Quantitative and Qualitative Research Belfkih, “Big Data Technologies: A Survey,” Journal Method, 3(2), pp. 1-12, 2015. of King Saud University-Computer and Information Sciences, 30(4), pp. 431-448, 2018. [111] E. 
Parzen, “Nonparametric Statistical Data Modeling,” Journal of the American statistical [122] A. Taneja, “Enhancing Web Data Mining: The Study association, 74(365), pp. 105-121, 1979. of Factor Analysis,” Web Usage Mining Techniques and Applications Across Industries, IGI Global, pp. [112] W.K. Michener, and M.B. Jones, “Ecoinformatics: 116-136, 2017. Supporting Ecology as a Data-intensive Science,”

Volume 4 Issue 1, January 2020 17 www.ijarp.org

International Journal of Advanced Research and Publications ISSN: 2456-9992 [123] Y. Wang, L. Kung, and T.A. Byrd, “Big Data An Analytical Study,” In Proceedings of the Analytics: Understanding its Capabilities and International Conference on Computing, Potential Benefits for Healthcare Organizations,” Communication and Automation (ICCCA), pp. 37-41, Technological Forecasting and Social Change, 126, 2017. pp. 3-13, 2018. [136] G. Piatetsky, “R, Python Duel as Top Analytics, Data [124] B.T. Mayne, S.Y. Leemaqz, S. Buckberry, and Science Software”, Kdnuggets 2016 Software Poll C.M.R. Lopez, C.T. Roberts, T. Bianco-Miotto, and J. results, http://www. kdnuggets. com/2016/06/r- Breen, “msgbsR: An R Package for Analysing python-topanalytics-data-mining-data-science- Methylation-sensitive Restriction Enzyme software. html. 2016. Sequencing Data,” Scientific reports, 8(1), pp. 2190- 2198, 2018. [137] F. Ye, Y. Chen, Q. Huang, and L. Li, “Developing Cloud-Based Tools for Water Resources Data [125] A.S. Tremsin, S. Ganguly, S.M. Meco, G.R. Pardal, Analysis Using R and Shiny,” In Proceedings of the T. Shinohara, and W.B. Feller, “Investigation of International Conference on Emerging Dissimilar Metal Welds by Energy-resolved Neutron Internetworking, Data & Web Technologies, pp. 289- Imaging,” Journal of applied crystallography, 49(4), 297, 2017. pp. 1130-1140, 2016. [138] A. Goyal, I. Khandelwal, R. Anand, A. Srivastava, [126] D.R. Roe, and T.E. Cheatham III, Parallelization of and P. Swarnalatha, “A Comparative Analysis of the CPPTRAJ Enables Large Scale Analysis of Molecular Different Data Mining Tools by Using Supervised Dynamics Trajectory Data,” Journal of Computational Learning Algorithms,” In Proceeding of the Chemistry, 39(25), pp. 2110-2117, 2018. International Conference on Soft Computing and Pattern Recognition, pp. 105-112, 2016. [127] A. Boettcher, W. Brendel, and M. Bethge, “Large Scale Blind Source Separation,” In Proceedings of [139] C. Giraud-Carrier, and O. 
Povel, “Characterising Data Bernstein Conference, pp. 118-119, 2016. Mining Software,” Intelligent Data Analysis, 7(3), pp. 181-192, 2003. [128] T. Thilagaraj, and N. Sengottaiyan, “A Review of Educational Data Mining in Higher Education [140] S. Peng, G. Wang, and D. Xie, “Social Influence System,” In Proceedings of the 2nd International Analysis in Social Networking Big Data: Conference on Research in Intelligent and Computing Opportunities and Challenges,” IEEE Network, 31(1), in Engineering, pp. 349–358, 2017. pp. 11-17, 2017.

[129] D.H. Spatti, and L.H.B. Liboni, “Computational [141] I. Taleb, M.A. Serhani, and R. Dssouli, “Big Data Tools for Data Processing in Smart Cities,” Smart Quality: A Survey,” In Proceedings of the 2018 IEEE Cities Technologies, InTech, pp. 41-54, 2016. International Congress on Big Data (BigData Congress), pp. 166-173, 2018. [130] A. Rizwan, A. Zoha, R. Zhang, W. Ahmad, K. Arshad, N.A. Ali, ..., and Q.H. Abbasi, “A Review on [142] A. Gepp, M.K. Linnenluecke, T.J. O‟Neill, and T. the Role of Nano-communication in Future Smith, “Big Data Techniques in Auditing Research Healthcare Systems: A Big Data Analytics and Practice: Current Trends and Future Perspective,” IEEE Access, 6, pp. 41903-41920, Opportunities,” Journal of Accounting Literature, 40, 2018. pp. 102-115, 2018.

[131] Y. Tian, “Accelerating Data Preparation for Big Data [143] B. Prakash, and D.M. Hanumanthappa, “Issues and Analytics,” Doctoral dissertation, Télécom ParisTech, Challenges in the Era of Big Data Mining,” 2017. International Journal of Emerging Trends and Technology in Computer Science (IJETICS), 3, pp. [132] H. Demuth, M. Beale, and M. Hagan, “Neural 321-325, 2014. Network Toolbox 6,” User‟s guide, pp. 37-55, 2008. [144] B. Vijayalakshmi, and M. Srinath, “State-of-the-art [133] E.C. Foster, and S. Godbole, “Overview of Oracle,” Frameworks and Platforms for Processing and Database Systems, Apress, Berkeley, CA, pp. 435- Managing Big Data as well as the Efforts Expected on 442, 2016. Big Data Mining,” Elysium Journal of Engineering Research & Management, 1(2), pp. 1-9, 2014. [134] J. Vijayaraj, R. Saravanan, P.V. Paul, and R. Raju, “Hadoop Security Models-a Study,” In Proceedings [145] S. Joshi, and M.K. Nair, “Survey of Classification of the Online International Conference on Green Based Prediction Techniques in Healthcare,” Indian Engineering and Technologies (IC-GET), pp. 1-5, Journal of Science and Technology, 2016. http://www.indjst.org/index.php/indjst/article/viewFil e/121111/84191. 2018. [135] S.K. Sahu, M.M. Jacintha, and A.P. Singh, “Comparative Study of Tools for Big Data Analytics:

Volume 4 Issue 1, January 2020 18 www.ijarp.org

International Journal of Advanced Research and Publications ISSN: 2456-9992 [146] M. Scarnò, “ ADaMSoft: Un Software Open Source e [161] P. Tamayo, C. Berger, M. Campos, J. Yarmus, B. un‟Esperienza nel Calcolo Scientifico,” CASPUR Milenova, A. Mozes, S. Thomas, “Oracle Data Annual Report, pp. 46-47, 2008. Mining,” Data Mining and Knowledge Discovery Handbook, pp. 1315-1329, 2005. [147] H.W. Deng, Y.N. Zhang, B. Ke, and M.T. Li, “Decision Making of Ore Mining Model based on [162] G. Shmueli, P.C. Bruce, I. Yahav, N.R. Patel, and Analytica Software,” The Chinese Journal of K.C. Lichtendahl Jr, Data Mining for Business Nonferrous Metals, Analytics: Concepts, Techniques, and Applications in http://en.cnki.com.cn/Article_en/CJFDTOTAL- R, John Wiley & Sons, 2017. ZYXZ201702014.htm. 2017. [163] M. Hofmann, and R. Klinkenberg, “ Getting Used to [148] U. Cieplik, BV4. 1 Methodology and User-friendly RapidMiner,” RapidMiner, Chapman and Hall/CRC, Software for Decomposing Economic Time Series, pp. 65-76, 2016. German Federal Statistical Office, 2006. [164] R. Cody, Learning SAS by Example: A 's [149] G. Karypis, “CLUTO- a Clustering Toolkit,” Guide, SAS Institute, 2018. Technical Report 02-017, Department of Computer Science, University of Minnesota, [165] S.L. Campbell, J.P. Chancelier, and R. Nikoukhah, http://www.cs.umn.edu/ cluto. 2002. Modeling and Simulation in SCILAB, Springer, New York, 2006. [150] A.B. Comsol, COMSOL Multiphysics® v. 5.2. Stockholm, Sweden, 2016. [166] S. Sonnenburg, S. Henschel, C. Widmer, J. Behr, A. Zien, F.D. Bona, V. Franc, “The SHOGUN Machine [151] I.D., Landau, and V. Landau, “Data Mining et Learning Toolbox,” Journal of Machine Learning Machine Learning dans les Big Data: Une Tentative Research, 11, pp. 1799-1802. 2010. de Démystification,” https://hal.archives- ouvertes.fr/hal-01393640/file/DMML_F.pdf. 2016. [167] D. George, and P. Mallery, IBM SPSS Statistics 23 Step by Step: A Simple Guide and Reference, [152] S.V. 
Chekanov, Numeric Computation and Statistical Routledge, 2016. Data Analysis on the Java Platform, Springer, Basel, 2016. [168] L. StataCorp, Stata Data Analysis and Statistical Software, Special Edition Release, 2007. [153] E. Coman, M.W. Brewster, S.K. Popuri, A.M. Raim, and M.K. Gobbert, “A Comparative Evaluation of [169] R.R. Bouckaert, E. Frank, M. Hall, R. Kirkby, P. Matlab, Octave, FreeMat, Scilab, R, and IDL on Tara. Reutemann, A. Seewald, and D. Scuse, WEKA http://www.webcitation.org/6BbWqerg3. 2012. Manual for Version 3-9-1, University of Waikato, Hamilton, New Zealand, 2016. [154] J.W. Eaton, GNU Octave 4.2 Reference Manual, Samurai Media Limited, 2017. [170] R. Brennan, “Challenges for Value-driven Semantic Data Quality Management,” In Proceedings of the [155] JASP Team, JASP (Version 0.7. 5.5) [Computer 19th International Conference on Enterprise software]. Google Scholar, pp. 765-766, 2016. Information Systems, 1, pp. 85-392, 2017.

[156] S. Dwivedi, P. Kasliwal, and S. Soni, [171] A.A. Rahimah, and T.R. Naik, “Data Mining for Big “Comprehensive Study of Data Analytics Tools Data,” International Journal of Advanced Technology (RapidMiner, Weka, R tool, Knime),” In Proceedings and Innovation research, 7(5), pp. 697-701, 2015. of the Symposium on Colossal Data Analysis and Networking (CDAN), pp. 1-8, 2016. [172] N. Bikakis, “Big Data Visualization Tools”, arXiv preprint arXiv:1801.08336. 2018. [157] D.J. Higham, and N.J. Higham, MATLAB Guide, Siam, 2016. [173] M. Chen, S. Mao, Y. Zhang, and V.C. Leung, “Big Data Applications,” In Big Data, Springer, pp. 59-79, [158] T. Hothorn, and M.T. Hothorn, “The Maxstat 2014. Package”, http://btr0x2.rz.uni- bayreuth.de/math/statlib/R/CRAN/doc/packages/maxs [174] I. Kopanas, N.M. Avouris, and S. Daskalaki, “The tat.pdf. 2007. Role of Domain Knowledge in a Large Scale Data Mining Project,” In Proceedings of the 2nd Hellenic [159] W. Winston, “Microsoft Excel Data Analysis and Conference on Artificial Intelligence: Methods and Business Modeling,” Microsoft Press, 2016. Applications of Artificial Intelligence, pp. 288-299, 2002. [160] W. Miller, OpenStat Reference Manual, Springer, Science & Business Media, 2012. [175] M. Brodie, M. Greaves, and J. Hendler, “Databases and AI: The Twain Just Met,” In Proceedings of the 2011 STI Semantic Summit, pp: 6-8, 2011.

Volume 4 Issue 1, January 2020 19 www.ijarp.org

International Journal of Advanced Research and Publications ISSN: 2456-9992 [176] C. Bizer, P. Boncz, M.L. Brodie, and O. Erling, “The [191] P. Groves, B. Kayyali, D. Knott, and S. Van Kuiken, Meaningful Use of Big Data: Four Perspectives - Four “The Big Data Revolution in Healthcare,” McKinsey Challenges,” ACM Sigmod Record, 40(4), pp. 56-60, Q, http://www.mckinsey.com/industries/healthcare- 2012. systems-and-services/our-insights/the-big-data- revolution-in-us-health-care, 2013. [177] K. Michael, and K.W. Miller, “Big Data: New Opportunities and New Challenges [Guest Editors' [192] W.H. Organization, Who Operational Package for Introduction],” Computer, 46(6), pp. 22-24., 2013. Assessing, Monitoring and Evaluating Country Pharmaceutical Situations, Guide for Coordinators [178] H.-P. Schöner, “Automotive Mechatronics,” Control and Data Collectors, 2007. Engineering Practice, 12(11), pp. 1343-1351, 2004. [193] G.M. Weiss, “Data Mining in Telecommunications,” [179] C. Zheng, M. Bricogne, J. Le Duigou, and B. Eynard, Data Mining and Knowledge Discovery Handbook, “Survey on Mechatronic Engineering: A Focus on pp. 1189-1201, 2005. Design Methods and Product Models,” Advanced Engineering Informatics, 28(3), pp. 241-257, 2014. [194] I. Kļevecka, and J. Lelis, “Pre-processing of Input Data of Neural Networks: The Case of Forecasting [180] X. Yang, P. Moore, and S.K. Chong, “Intelligent Telecommunication Network Traffic,” Telektronikk, Products: From Lifecycle Data Acquisition to 3, pp. 168-178, 2008. Enabling Product-related Services,” Computers in Industry, 60(3), pp. 184-194, 2009. [195] A. Szalay, “Extreme Data-intensive Scientific Computing,” Computing in Science & Engineering, [181] P. Breuer, J. Moulton, and R.M. Turtle, Applying 13(6), pp. 34-41, 2001. Advanced Analytics in Consumer Companies, McKinsey & Company, New York, 2013. [196] G. Brumfiel, “Down the Petabyte Highway,” Nature News, 469(20), pp. 282-283, 2011. [182] J. Li, F. Tao, Y. Cheng, and L. 
Zhao, “Big Data in Product Lifecycle Management,” The International [197] F.-Y Wang, K.M. Carley, D. Zeng, and W. Mao, Journal of Advanced Manufacturing Technology, “Social Computing: From Social Informatics to Social 81(1-4), pp. 667-684, 2015. Intelligence,” IEEE Intelligent Systems, 22(2), pp. 79- 83, 2007. [183] M.J. Shaw, C. Subramaniam, G.W. Tan, and M.E. Welge, “Knowledge Management and Data Mining [198] A. Lesk, Introduction to Bioinformatics, Oxford for Marketing,” Decision Support Systems, 31(1), pp. University Press, 3rd edition, 2013. 127-137, 2001. [199] J. McDermott, R. Samudrala, R. Bumgarner, K. [184] G. Duncan, “Privacy by Design,” Science, 317(5842), Montgomery, and R. Ireton, Computational Systems pp. 1178-1179, 2007. Biology, Springer, 2009.

[185] E.E. Schadt, “The Changing Privacy Landscape in the [200] C.P. Chen, and C.Y. Zhang, “Data-intensive Era of Big Data,” Molecular Systems Biology, 8(1), Applications, Challenges, Techniques and pp. 612-614, 2012. Technologies: A Survey on Big Data,” Information Sciences, 275, pp. 314-347, 2014. [186] C. Liu, and K.P. Arnett, “Exploring the Factors Associated with Web Site Success in the Context of [201] W. Fan, and A. Bifet, “Mining Big Data: Current Electronic Commerce,” Information & Management, Status, and Forecast to The Future,” ACM SigKDD 38(1), pp. 23-33, 2000. Explorations Newsletter, 14(2), pp. 1-5, 2013.

[187] T. Preis, H.S. Moat, and H.E. Stanley, “Quantifying [202] J. Leskovec, A. Rajaraman, and J.D. Ullman, Mining Trading Behavior in Financial Markets using Google of Massive Datasets, Cambridge University Press, Trends,” Scientific Reports, 3, pp. 1684-1689, 2013. Cambridge, 2014.

[188] E. Fuhrer, E. System for Enhanced Financial Trading [203] M. Zheng, X. Ming, G. Li, and Shi Y., “The Support, Google Patents, Washington, DC, 2000. Framework of Business Model Innovation for Smart Product-service Ecosystem,” In Proceedings of the [189] R. Bryant, R.H. Katz, and E.D. Lazowska, “Big Data NordDesign Conference, 2, pp. 400-409, 2016. Computing: Creating Revolutionary Breakthroughs in Commerce, Science and Society,” The Computing [204] L. Atzori, A. Iera, and G Morabito, “The Internet of Community Consortium Project, pp. 1-15. 2008 Things: A Survey,” Computer Networks, 54(15), pp. 2787-2805, 2010. [190] R. Steinbrook, “Personally Controlled Online Health Data – The Next Big Thing in Medical Care?,” The [205] T.S. Dillon, H. Zhuge, C. Wu, J. Singh, and E. Chang, New England Journal of Medicine, 358(16), pp. 1653- “Web‐of‐things Framework for Cyber-Physical 1656, 2008.

Volume 4 Issue 1, January 2020 20 www.ijarp.org

International Journal of Advanced Research and Publications ISSN: 2456-9992 Systems,” Concurrency and Computation: Practice [217] E.G. Nabati, K.D. Thoben, and M. Daudi, and Experience, 23(9), pp. 905-923, 2011. “Stakeholders in the Middle of Life of Complex Products: Understanding the Role and Information [206] J. Farringdon, A.J. Moore, N. Tilbury, J. Church, and Needs,” International Journal of Product Lifecycle P.D. Biemond, “Wearable Sensor Badge and Sensor Management, 10(3), pp. 231-257, 2017. Jacket for Context Awareness,” In Proceedings of the 3rd International Symposium on Wearable Computers, [218] C.W. Chow, J. Liu, J. Li, N. Swain, K. Reid, and C.P. San Francisco, pp. 107-113, 1999. Saint, “Development of Smart Data Analytics Tools to Support Wastewater Treatment Plant Operation,” [207] P. Valckenaers, B. Saint Germain, P. Verstraete, J. Chemometrics and Intelligent Laboratory Systems, Van Belle, K. Hadeli, H. Van Brussel, “Intelligent 177, pp. 140-150, 2018. Products: Agere versus Essere,” Computers in Industry, 60(3), pp. 217–228, 2009.

[208] I. Horváth, Z. Rusák, and Y. Li, “Order beyond Chaos: Introducing the Notion of Generation to Characterize the Continuously Evolving Implementations of Cyber-Physical Systems,” In Proceedings of the ASME 2017 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, pp. 1-14, 2017.

[209] G.G. Meyer, K. Främling, and J. Holmström, “Intelligent Products: A survey,” Computers in Industry, 60(3), pp. 137-148, 2009.

[210] K. Främling, J. HolmströM, J. Loukkola, J. Nyman, and A. Kaustell, “Sustainable PLM through Intelligent Products,” Engineering Applications of Artificial Intelligence, 26(2), pp. 789-799, 2013.

[211] S. Earley, “Analytics, Machine Learning, and the Internet of Things,” IT Professional, 17(1), pp. 10-13, 2015.

[212] S. Szykman, S.J. Fenves, W. Keirouz, and S.B. Shooter, “A Foundation for Interoperability in Next- generation Product Development Systems,” Computer-Aided Design, 33(7), pp. 545-559, 2001.

[213] L. Aroyo, and D. Dicheva, “The New Challenges for E-Learning: The Educational Semantic Web,” Educational Technology & Society, 7(4), pp. 59-69, 2004.

[214] J. Bézivin, H. Bruneliere, F. Jouault, and I. Kurtev, “Model Engineering Support for Tool Interoperability,” In Proceedings of the Workshop in Software Model Engineering - A MODELS 2005 Satellite Event, pp. 1-16, 2005.

[215] S. Pandey, and S. Nepal, “Cloud Computing and Scientific Applications – Big Data, Scalable Analytics, and Beyond,” Future Generation Computer Systems, 29, pp. 1774–1776, 2013.

[216] H. Haugen, and B. Ask, “Real Integration of ICT into Subject Content and Methodology Requires more than Technology, Infrastructure and Standard Software,” In Proceedings of the World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, pp. 107-117, 2010.

Volume 4 Issue 1, January 2020 21 www.ijarp.org