
Article

On the Performance of Cloud Services and Databases for Industrial IoT Scalable Applications

Paolo Ferrari *, Emiliano Sisinni *, Alessandro Depari, Alessandra Flammini, Stefano Rinaldi, Paolo Bellagente and Marco Pasetti
Department of Information Engineering, University of Brescia, 25123 Brescia, Italy; [email protected] (A.D.); [email protected] (A.F.); [email protected] (S.R.); [email protected] (P.B.); [email protected] (M.P.)
* Correspondence: [email protected] (P.F.); [email protected] (E.S.)

 Received: 28 July 2020; Accepted: 28 August 2020; Published: 3 September 2020 

Abstract: In Industry 4.0 the communication infrastructure is derived from the Internet of Things (IoT), and it is called Industrial IoT or IIoT. Smart objects deployed on the field collect a large amount of data which is stored and processed in the Cloud to create innovative services. However, differently from most consumer applications, the industrial scenario is generally constrained by time-related requirements and needs real-time behavior (i.e., bounded and possibly short delays). Unfortunately, timeliness is generally ignored by traditional service providers, and the Cloud is treated as a black box. For instance, Cloud databases (generally offered as "Database as a Service"—DBaaS) have an unknown or hard-to-compare impact on applications. The novelty of this work is to provide an experimental measurement methodology based on an abstract model of IIoT applications, in order to define some easy-to-evaluate metrics focused on DBaaS latency (regardless of the actual implementation details). In particular, the focus is on the impact of DBaaS on the overall communication delays in a typical IIoT scalable context (i.e., from the field to the Cloud and the way back). In order to show the effectiveness of the proposed approach, a real use case is discussed (a predictive maintenance application with a Siemens S7 industrial controller transmitting system health status information to a Cloudant DB inside the IBM Cloud platform). Experiments carried out on this use case provide useful insights about the DBaaS performance: evaluation of delays, effects of the number of involved devices (scalability and complexity), constraints of the architecture, and clear information for comparison with other implementations and for optimizing the configuration. In other words, the proposed evaluation strategy helps in finding out the peculiarities of Cloud Database service implementations.

Keywords: distributed measurement systems; automation networks; Industry 4.0;

1. Introduction

The fourth industrial revolution, commonly addressed as the Industry 4.0 paradigm, leverages the idea of a digital twin of real-world systems (e.g., machinery, tools, and so on) implemented in the Cloud domain and continuously interacting with the physical twin by means of an Internet-based communication infrastructure [1–3]. On the digital side, analytics are executed in order to evaluate the system history and forecast future behavior, thus improving availability, reliability and efficiency of production systems. On the physical side, low-cost smart field devices perform preliminary data processing and transmit refined information about the physical quantities of interest to the Internet. The previously described scenario leads to the generation and management of very large amounts of data, which pave the way to innovative services aiming at increasing overall performance in terms of cost, lifetime, and efficiency [4–8].


As a matter of fact, communications in the Industry 4.0 scenario rely on a subset of the Internet of Things (IoT) paradigm, which is commonly referred to as the Industrial IoT or IIoT [9]; in particular, the well-known pyramidal arrangement of legacy industrial communications, whose lower level was occupied by fieldbuses, is flattened thanks to the adoption of the Internet as a sort of universal backbone [10]. As a consequence, the time-related constraints of fieldbuses are somehow inherited by such Internet-based communications, which can spread on a geographical scale [11,12]. In turn, the latency in concluding service requests probably remains the most important key performance indicator (KPI). Unfortunately, there is no universal agreement on how such a performance evaluation must be carried out on the new architectures, and the development of univocally defined metrics is still an open research issue [13–15]. Moreover, the impact of remote database accesses on the overall delay is usually not detailed, since the whole Cloud is generally considered as a single black-box entity. This paper has the following novel contributions:

• It provides easy-to-compute metrics to objectively evaluate the time performance of Cloud databases (Database as a Service—DBaaS—paradigm), due to the very important role they assume in data analytics services [16].
• It introduces a reference model of an Industry 4.0 scenario (which includes a closed loop acquiring data from the production line and providing back information about the system KPIs of interest [17]).
• It provides an abstract view of the above reference model, so that the metrics are independent of the peculiar mechanism implemented to access the database (which only depends on the way the system has been realized, and not on the service it offers).
• It considers real-world use cases for the experimental evaluation of the proposed methodology and of the associated set of metrics. The testbeds confirm the general effectiveness of the proposed approach, especially when the scalability of the system under test must be evaluated and compared.

The paper is arranged as follows. The next Section provides a brief overview of the IIoT paradigm and possible Cloud database scenarios. In Section 3 the proposed approach is detailed, while Section 4 describes the considered real-world use cases. The experimental testbench setup to evaluate the effectiveness of the proposed methodology is in Section 5. Section 6 presents a brief overview of the demo systems used for trade fairs. Finally, the conclusions are drawn.

2. Materials and Methods

Any Industry 4.0 application exists within a Cyber-Physical System (CPS), consisting of both the real-world product and its digital counterpart [18]. In particular, applications rely on the capability to collect information about the status of a so-called smart thing (or object) during its whole lifetime, from the initial design to the aftermarket, including the disposal and recycling stage. Collected data are saved in distributed data repositories built up according to the Cloud computing paradigm; the very same data are used to update the status of the smart object "digital twin", which is nothing but a description (i.e., a model) of the device in the digital domain. The backbone connecting the physical and the cyber domains is based on the Internet: the IIoT is the IoT subset which is sometimes defined as the Internet for the machines.

The very large amount of data generated by means of the IIoT paradigm is the foundation of any Industry 4.0 service. The most common example is furnished by the nowadays widely adopted predictive maintenance approach. In the past, maintenance was carried out purely on a time basis, i.e., the older the equipment, the more frequently maintenance procedures were executed. However, faults very often occur on a random basis, thus undermining any time-based maintenance scheduling. Nowadays, more accurate failure models are implemented in the cyber space, based on artificial intelligence techniques that learn the actual system/product behavior from the IIoT-derived data, so that more effective maintenance can be scheduled and, based on the results of the prediction, control applications may be configured to provide real-time feedback via the equipment actuators.

As stated in the introduction, data are saved in databases hosted in the Cloud, generally exploiting the DBaaS approach, often referred to as "Managed Databases" since AWS introduced its Relational Database Service (RDS) back in 2009. In other words, the DBaaS acronym refers to the software infrastructure that permits setting up, using and scaling databases by means of a common set of abstraction primitives, thus not caring about the peculiar implementation details [19].

Currently, DBaaS is the fastest growing Cloud service, and the cost of offered services is mainly determined by the writing speed (in terms of transactions per second), the storage size and the available computation capability. On the contrary, the delay in completing writing and reading back operations is usually not taken into account, since it is generally considered a non-critical issue, and Cloud-based applications are generally "slow" [20]. However, such a hypothesis may be false in most industrial applications, where real-time behavior is often mandatory.

The Considered Scenario

The advent of Industry 4.0 and of IIoT communications allows programmable logic controllers (PLC), Supervisory Control And Data Acquisition (SCADA) systems, and sensors/actuators to be directly and freely interconnected. Moreover, the Internet backbone extends connectivity to the Cloud, possibly in real-time. As a matter of fact, the communication changes from the client-server approach, e.g., used by human operators for accessing the Web, to the publisher-subscriber approach, where the relationship between the data producer and the data consumer is loosely defined [10]. Consequently, innovative services operating on a geographical scale have been devised, offering many additional benefits with respect to a local scale parameter optimization (e.g., [21]). In such a context, advanced data mining techniques can be activated [22], leading to important savings during production [23].

In this paper, a reference Cloud-based application has been considered, as shown in Figure 1. In particular, without any generality loss, a data source provides information about a generic smart object (e.g., a smart sensor deployed in a production process). Data flow to the Internet by means of an IoT Gateway (which consists of a software interface or may be a hardware peripheral), and are stored and processed in the Cloud. Indeed, user applications reside in the Cloud as well, analyzing data from the field and providing feedback actions, sent to the final destination.

Figure 1. The reference Cloud-based application architecture for definition of delay and latency metrics including the Cloud Database. The black path shows the data transfer to the Cloud, while the blue path shows the data transfer from the Cloud.

The Cloud environment allows collecting information from very diverse data sources, storing them in the Cloud database (with the DBaaS paradigm) and feeding many different applications (e.g., implementing machine learning techniques for predictive maintenance, as previously stated).


Regarding the data dispatching, message-oriented middleware is preferred to client-server REST APIs (REpresentational State Transfer Application Programming Interfaces) based on the HTTP protocol. It is well known that messaging protocols are particularly well-suited for IIoT applications, since they natively adopt the publisher/subscriber paradigm and are event-driven. Many such messaging protocols exist today, each one with its own pros and cons, but all sharing a common architecture. In this work, without any loss of generality, the Message Queuing Telemetry Transport (MQTT) protocol has been adopted.
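As an illustration of the messaging path just described, the following Python sketch publishes a field sample and timestamps it at both ends of the publisher-to-Cloud hop using the Eclipse Paho MQTT client. The broker host, topic and payload fields are placeholders (not those of the actual testbench, which uses Node-RED flows), and the host clocks are assumed to be already synchronized; the two timestamps correspond to the instants later called T1 and T2 in Section 3.1.

import json
import time

import paho.mqtt.client as mqtt   # Eclipse Paho client, 1.x-style callbacks

BROKER = "broker.example.com"     # placeholder: a generic MQTT broker
TOPIC = "plant/line1/eb80/diag"   # placeholder topic

def on_message(client, userdata, msg):
    """Runs in the Cloud user application: stamps the reception time."""
    doc = json.loads(msg.payload)
    t_receive = time.time()                       # later called T2 in Section 3.1
    delay_ms = (t_receive - doc["t_publish"]) * 1000.0
    print(f"publisher-to-Cloud delay: {delay_ms:.1f} ms")

subscriber = mqtt.Client()
subscriber.on_message = on_message
subscriber.connect(BROKER, 1883)
subscriber.subscribe(TOPIC)
subscriber.loop_start()

publisher = mqtt.Client()                         # plays the role of the IoT gateway
publisher.connect(BROKER, 1883)
sample = {"valve_id": 3, "reaction_time_ms": 42,  # illustrative diagnostic payload
          "t_publish": time.time()}               # later called T1 in Section 3.1
publisher.publish(TOPIC, json.dumps(sample), qos=0)
time.sleep(2)                                     # give the subscriber time to receive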

Regarding the Cloud database access, two different modalities can be considered for the sake of completeness (see Figure 2). The first is the traditional polling strategy, in which a DB (Database) request is started by the user, asynchronously with respect to the DB. In such a case, time-related performance is also affected by the polling cycle and its uncertainty. Another approach relies on the use of so-called database triggers, that is, purposely defined procedures stored in the DB which are synchronously executed when a specific, predefined action occurs within the DB [24]. Generally, triggers are created to run when data are modified, e.g., after insert, update, or delete operations. The use of triggers thus allows certain actions to be accomplished regardless of the user activity. All the commercially available solutions usually support them.

Figure 2. Two methodologies for retrieving data from Databases with a short latency. (a) The write action triggers a user function that reads back the recently written data; (b) The Database is polled regularly to fetch recently added data.

The trigger approach to synchronously issue a DB request is particularly interesting for IIoT applications, in which, as previously stated, real-time behavior is generally required.
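The two access patterns of Figure 2 can be contrasted with the minimal, self-contained sketch below. The TriggerableStore class and its on_insert hook are hypothetical stand-ins for a real DBMS trigger mechanism (they are not the API of Cloudant or of any specific product); the poller thread models the periodic read-back of option (b), whose latency also depends on the polling period.

import threading
import time


class TriggerableStore:
    """Toy in-memory store with a trigger-like hook; purely illustrative."""

    def __init__(self):
        self._docs = []
        self._callbacks = []
        self._lock = threading.Lock()

    def on_insert(self, callback):
        """Register a trigger-like callback fired synchronously on every write."""
        self._callbacks.append(callback)

    def insert(self, doc):
        with self._lock:
            self._docs.append(doc)
        for cb in self._callbacks:      # (a) trigger path: immediate reaction
            cb(doc)

    def read_since(self, index):
        with self._lock:
            return self._docs[index:], len(self._docs)


def start_poller(store, poll_period_s=1.0):
    """(b) polling path: latency also depends on the polling cycle."""
    def loop():
        cursor = 0
        while True:
            new_docs, cursor = store.read_since(cursor)
            for doc in new_docs:
                print("poller got", doc)
            time.sleep(poll_period_s)
    threading.Thread(target=loop, daemon=True).start()


if __name__ == "__main__":
    store = TriggerableStore()
    store.on_insert(lambda doc: print("trigger got", doc))
    start_poller(store, poll_period_s=1.0)
    store.insert({"valve_id": 3, "reaction_time_ms": 42})
    time.sleep(2)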

3. How to Evaluate the Performance of the Architecture

The aim of this work is to estimate how Cloud database services can impact the performance of Industry 4.0 applications. In particular, the reference architecture is a Cloud platform, and the performance is evaluated using time-related indicators like latencies and delays.

The latency of an architecture based on data exchange depends on the data path. In all complex systems, like Cloud-based architectures, there are some parts that cannot be directly controlled by the user. Many times, the use of a black-box approach is preferable. In this paper, instead of using simulations for latency estimation (as in [25,26]), an experimental methodology has been developed. The authors think that, once the metrics are defined, the experimental approach can be easily reproduced on many architectures, as demonstrated in previous works and related literature [25–28].

3.1. Introduction of Time Related Metrics

The relevant metrics must be defined considering the reference architecture of a classic IoT application for industrial purposes. Figure 1 introduces the architecture, highlighting the services in the Cloud. Generalizing the Industry 4.0 process, data are generated in the field when the system is producing, they are collected by gateways, and they are then transported into the Cloud.

Data storage and elaboration take place in the Cloud. Mixing of data coming from various sources and multiple parallel processing by means of several applications is possible. Finally, decisions and parameters calculated online may be sent back to the machines in the field, closing the loop.

The idea of a generally applicable performance evaluation is tied to the use of time-related metrics that highlight the time needed by data to travel from the source to the intended destination. Please note that the focus is on the Cloud database, for evaluating its contribution to latency; therefore the Cloud application is a simple task where the data pass through without being manipulated. Please note also that a message-based protocol (e.g., the MQTT protocol) is used in this work. As required by messaging protocols, there is a "Broker" that handles and dispatches the messages (sent by the "publisher"), forwarding them to the destinations (called "subscribers").

According to the reference architecture shown in Figure 1, at time T0 the source of data provides new information. At time T1, the IoT gateway sends the collected data into the Cloud using a messaging protocol. In the Cloud, the user application X receives, at time T2, the incoming messages from the gateway. The application X stores the message data payload into the Cloud database, completing its task at time T3. The data are now uploaded and the black path of Figure 1 is complete. Whatever database access is implemented, the inverse path (blue color in Figure 1) starts exactly at T3, because at that time the data are available inside the database and further delays are due only to read and send operations (using the methods shown in Figure 2). The second user application, Y, is not related to application X; they operate separately. Application Y completes data retrieval from the database at time T4, and it propagates the message to the destination endpoint. Additionally, application Y uses a messaging protocol, and sends data through a Broker. The entire information loop ends at time T5.

The authors developed a methodology for collecting timestamps and combining them in previous works [28–30]. In order to be combined, the timestamps must be taken from the local system clock (i.e., inside the device, the machine, or the Cloud) after it is synchronized to UTC (Coordinated Universal Time). In this paper the synchronization is done with NTP (Network Time Protocol) as in [31–33]. The goal is to define metrics for the objective comparison of different systems. The timestamps are collected and the following primary metrics are computed:

• The delay, called D_PC, between the time the IoT Gateway publishes the data and the time they are received in the Cloud inside the user application, formally

D_PC = T2 − T1 (1)

• The delay, called D_CD, for carrying out a Cloud database write operation (i.e., for storing the data), formally

D_CD = T3 − T2 (2)

• The delay, called D_DC, for retrieving the data from the Cloud database, formally

D_DC = T4 − T3 (3)

The D_DC can be related to the call of a trigger function that reacts to events inside the database (fastest reaction, since it can send the recently written data to the user application), or it can be related to a polling cycle that periodically reads back data from the Cloud database.

• The delay, called D_CS, for transmitting the data to the subscriber, formally

D_CS = T5 − T4 (4)

Additionally, the following secondary metrics, used for highlighting particular behaviors, can be calculated as a combination of the previous ones:

• The latency, called L_CDC, for storing a datum in the database and then retrieving it back, formally

L_CDC = D_CD + D_DC = T4 − T2 (5)

The L_CDC can be used for characterizing the global behavior of the Cloud database, considered as a black box, when access parameters are changed (e.g., trigger functions or polling).

• The latency, called L_PDS, between publisher and subscriber in the case the data are stored first in the Cloud database and then sent to the subscriber, formally

L_PDS = D_PC + L_CDC + D_CS = T5 − T1 (6)

The L_PDS is the reference metric for characterizing the responsiveness of a Cloud service with the Cloud Database inside the feedback loop.

• The latency, called L_PS, between publisher and subscriber in the case the data are sent immediately to the subscriber before they are stored in the database, formally

L_PS = L_PDS − L_CDC = T5 − T1 − T4 + T2 (7)

The L_PS is the reference metric for characterizing the responsiveness of a Cloud service with the Cloud Database outside the feedback loop (i.e., the database is only used for storing data or logging).

Last, it is worth saying that there is another metric, related to scalability, which is introduced in Section 5.4 for convenience.
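Since all the metrics above reduce to simple differences of the collected timestamps, their computation can be sketched in a few lines of Python; timestamps are assumed to be UTC seconds taken from NTP-synchronized clocks, and the example values at the bottom are purely illustrative.

def compute_metrics(t):
    """t is a dict with keys 'T1'..'T5' (seconds); returns metrics in milliseconds."""
    ms = lambda x: 1000.0 * x
    d_pc = ms(t["T2"] - t["T1"])    # (1) publisher -> Cloud user application
    d_cd = ms(t["T3"] - t["T2"])    # (2) Cloud user application -> database write
    d_dc = ms(t["T4"] - t["T3"])    # (3) database -> Cloud user application (trigger/poll)
    d_cs = ms(t["T5"] - t["T4"])    # (4) Cloud user application -> subscriber
    l_cdc = d_cd + d_dc             # (5) = T4 - T2
    l_pds = d_pc + l_cdc + d_cs     # (6) = T5 - T1
    l_ps = l_pds - l_cdc            # (7) = T5 - T1 - T4 + T2
    return {"D_PC": d_pc, "D_CD": d_cd, "D_DC": d_dc, "D_CS": d_cs,
            "L_CDC": l_cdc, "L_PDS": l_pds, "L_PS": l_ps}

# Example with purely illustrative timestamps (seconds).
print(compute_metrics({"T1": 0.000, "T2": 0.045, "T3": 0.210, "T4": 0.260, "T5": 0.290}))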

4. The Considered Use Case

In this work, a typical predictive maintenance [34,35] application has been considered as the reference use case; in particular, the solution provided by Metalwork for pneumatic valve controllers, formally known as EB80, has been adopted. The EB80 solution aims at inferring the health status of electro-pneumatic actuators by collecting diagnostic information about the actuators, including reaction times, operating temperatures and so on. Each electro-pneumatic actuator forwards such diagnostic data towards a local PLC. The latter leverages an IoT gateway to further propagate data into the Cloud, where they are processed according to the Cloud computing paradigm, i.e., they are stored in a Cloud database (according to the aforementioned DBaaS approach) and are analyzed by algorithms remotely executed in the Cloud.

The capability of the proposed methodology to evaluate various solutions, no matter the actual implementation details, has been verified considering two different architectures of the predictive maintenance system. In the first one, the IoT gateway publishes data on a generic MQTT-Broker, which oversees the propagation of acquired information towards an IBM Cloud platform, including an IBM Cloudant Database [36]. Activities carried out in the IBM Cloud are managed via Node.js, configured by means of Node-RED dataflows. The very same generic MQTT-Broker is used by "actuators" and/or end users (e.g., technicians) to access the results furnished by the prediction system processing. In the second scenario, the generic MQTT-Broker is substituted by an IBM MQTT-Broker, which in this case too is adopted for both receiving data from the field and furnishing results to end users. Both the considered arrangements are depicted in Figure 3.

Figure 3. The experimental setup for the considered use case. The database access and the cloud application are the same for both the considered Brokers.

4.1. The Actual Testbench Implementation

The previously described reference scenarios have been implemented by means of industrial devices and software applications, as shown in the previously introduced Figure 3; in particular:

• Data from the field are retrieved by an EB80, the Metalwork electro-pneumatic system including, other than the solenoid valve assembly, digital and analog I/O and power supply modules;
• The local controller is implemented by an S7 PLC from Siemens (the S71215C, supporting Ethernet-based connectivity via both the regular TCP/IP stack and PROFINET (an Industrial Real-Time Ethernet protocol));
• The IoT gateway is the Hilscher netIOT device, running a Linux-derived operating system and supporting configuration and management by means of the Node-RED interface; the netIOT device embeds an Intel J1900 processor, 4 GB of RAM and a 64 GB solid state disk;
• Maintenance-related results are collected by an embedded system designed around an IOT2040 from Siemens and running a Yocto Linux operating system distribution; the processor is an Intel Quark X1020, complemented by 1 GB of RAM;
• The generic MQTT-Broker is an instance of the free, open source iot.eclipse.org broker; as a consequence, Quality of Service (QoS) is not guaranteed, due to the free subscription to the service;
• The IBM Cloud is furnished according to the free subscription reserved for testing proof-of-concept implementations; for this reason, the QoS is not guaranteed as well.

Field devices also include a Siemens industrial workstation, which acts as the local supervisor but has not been involved in the experimental tests discussed in the rest of the paper.

The EB80 system, the netIOT and the IOT2040 devices are all located at the engineering campus of the University of Brescia; local connectivity is provided via the University network, which ensures reliability and high bandwidth. Accordingly, delays in the University local network are ignored. It must be also highlighted that the actual placement of both the iot.eclipse.org and IBM MQTT-Brokers is unknown; indeed, due to the virtualization strategies adopted by the providers, the location can change from time to time.

4.2. Implementation of Timestamp Probes and Measurements Data Collection

All the metrics introduced in Section 3.1 are based on the availability of timestamps as accurate as possible, which allow tracking the messages in time.

The authors already demonstrated in previous works that a viable solution is provided by Node-RED; for instance, in [30] they demonstrated that a timestamping uncertainty in the order of a few milliseconds can be obtained in typical IoT-like applications. Node-RED is supported by a browser-based frontend which is used to graphically connect and configure the processing nodes implementing the desired application (in agreement with the dataflow programming paradigm). The actual software execution is event-driven and leverages a Node.js runtime, which is readily available on many different platforms, thus shortening development time and ensuring portability.

Once a new timestamp is collected, it is inserted into a purposely defined measurement data structure, which moves along with the predictive maintenance data and is finally populated with the five timestamps Ti, i = 1–5, required to evaluate the proposed metrics. Each new measurement data structure, finalized once it reaches the end user (in this case the IOT2040), is added to the "Meas" Cloudant database shown in Figure 3.
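For reference, the sketch below shows how a finalized measurement record could be stored into the "Meas" database through Cloudant's CouchDB-compatible REST API (a single POST to the database URL). The account URL, credentials and field values are placeholders; in the actual testbench this step is performed by a Node-RED flow rather than by a standalone script.

import requests

CLOUDANT_URL = "https://ACCOUNT.cloudant.com"   # placeholder account URL
AUTH = ("API_KEY", "API_SECRET")                # placeholder credentials

def store_measurement(record, db="meas"):
    """record carries the five timestamps (and optionally the derived metrics)."""
    resp = requests.post(f"{CLOUDANT_URL}/{db}", json=record, auth=AUTH, timeout=30)
    resp.raise_for_status()
    return resp.json()     # CouchDB-style answer: {"ok": true, "id": ..., "rev": ...}

if __name__ == "__main__":
    doc = {"T1": 0.000, "T2": 0.045, "T3": 0.210, "T4": 0.260, "T5": 0.290,
           "broker": "eclipse"}                 # illustrative values only
    print(store_measurement(doc))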

5. Experimental Results

The experiments are divided into three major campaigns: the first one is the longest and took ten days; the second and the third campaigns were shorter but more focused on specific aspects related to the database evaluation. Note that the scenarios with the two Brokers are simultaneously evaluated, meaning that the involved Cloud functional blocks are duplicated.

5.1. Long Term Reference Experiment

The first experiment is related to the reference situation: data are published every minute and each run ends with the subscriber receiving the data. Exactly 15,000 runs are carried out, simultaneously collecting the measurements from both possible paths shown in Figure 3. Since the connection is over the Internet, some very long delays occurred over the days. In order to keep valid measurement data and discard outliers, the outlier detection threshold has been set to exclude 1% of the measures (i.e., 150 samples are labeled as outliers).

The results in Table 1 show the D_PC, which is the delay between the publisher (i.e., the IoT gateway) and the user application in the Cloud. The advantage of using the native direct interface to the IBM Cloud offered by the Watson Broker is clear; it is faster (more than 100 ms less on average) than the external Eclipse MQTT Broker. Furthermore, the Watson Broker is less variable than the Eclipse Broker and its maximum delay is one third.

Table 1. Statistics of the DPC (Delay from Publisher to Cloud user application) collected in the experiments. All values are in milliseconds.

Table 1. Statistics of the D_PC (Delay from Publisher to Cloud user application) collected in the experiments. All values are in milliseconds.

Broker Type    Min    Average    P95%    P99%    Max    Std. Dev.
Eclipse        86     145        235     286     338    43
IBM Watson     4      32         55      88      100    15
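For reproducibility, the statistics reported in Tables 1–5 can be obtained from the raw delay samples as sketched below, discarding the largest 1% of the measures as outliers (as done for this long-term campaign). The NumPy-based helper and the synthetic input are illustrative only.

import numpy as np

def delay_stats(samples_ms, outlier_fraction=0.01):
    x = np.sort(np.asarray(samples_ms, dtype=float))
    keep = int(round(len(x) * (1.0 - outlier_fraction)))   # e.g., 15,000 -> 14,850 kept
    x = x[:keep]                                           # drop the largest 1%
    return {"min": x.min(),
            "average": x.mean(),
            "P95%": np.percentile(x, 95),
            "P99%": np.percentile(x, 99),
            "max": x.max(),
            "std": x.std(ddof=1)}

# Synthetic, delay-like samples (roughly 150 ms scale) just to exercise the helper.
print(delay_stats(np.random.lognormal(mean=5.0, sigma=0.3, size=15000)))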

The results in Table 2 regard the D_CD metric, i.e., the delay for writing the data in the Cloud database. The first observation is that the results are independent of the Broker type, since they are only related to the Cloud database that is used. The second observation is that the distribution of the measurements is bimodal (i.e., with measures divided in two groups), as shown in Figure 4. For this specific database (IBM Cloudant), the delay (when a write rate of 1 document per minute is used) can be centered on an average of about 160 ms or centered on an average of 30,160 ms. Figure 4 shows that the probability of experiencing the longer delay is low (5%) but not negligible. This experimental finding is of major importance, and it is the reason why other, deeper experimental campaigns have been carried out later on (see the following subsections).

Table 2. Statistics of the D_CD (Delay from Cloud user application to Database) collected in the experiments. All values are in milliseconds.

Broker Type    Lower Group of Measures       Higher Group of Measures
               Average      Std. Dev.        Average      Std. Dev.
Eclipse        150          42               30,152       36
IBM Watson     168          45               30,175       62


Figure 4. Distribution of the D_CD metric during the experiments. There are two groups; the distribution is bimodal.

The results about the D_DC metric are in Table 3. In these experiments, the database is accessed using the trigger functions. Hence, as expected, the delay for retrieving recently written data using the callback function is independent of the Broker type. The experiment results are very similar in the two situations. On average, only about 50 ms are needed to get the new data stored in the Cloud database back (i.e., into the user application).

Table 3. Statistics of the D_DC (Delay from Database to Cloud user application) collected in the experiments. All values are in milliseconds.

Broker Type    Min    Average    P95%    P99%    Max    Std. Dev.
Eclipse        2      50         73      94      119    13
IBM Watson     2      53         78      99      120    13

The results of the D_CS are in Table 4. As in the case of the D_PC, the delay for sending data to the subscriber is shorter with the Watson Broker than with the Eclipse Broker (which takes about 130 ms more on average). Once again, the IBM Watson Broker has a lower variability than the Eclipse Broker.

Table 4. Statistics of the D_CS (Delay from Cloud user application to Subscriber) collected in the experiments. All values are in milliseconds.

Broker Type    Min    Average    P95%    P99%    Max    Std. Dev.
Eclipse        102    183        331     364     406    73
IBM Watson     11     27         32      44      56     4

Finally, the results about the L_PS and L_PDS metrics are reported, as they globally summarize the goal of this first experiment. The latency between publisher and subscriber describes the entire reaction time of the closed-loop architecture. If the database is excluded from the path, the L_PS is reported in Table 5. It is worth recalling that this metric gives the best possible performance, related only to data transfer. As a matter of fact, since cloud elaboration and database operations are excluded, only Internet delays and the internal delay of the IBM Cloud are considered. The scenario entirely based on the IBM Cloud has a great advantage with respect to the mixed scenario where the Broker is external (Eclipse Broker): the average L_PS is 69 ms in the first case and 330 ms (i.e., five times more) in the second one.

Table 5. Statistics of the L_PS (Latency between Publisher and Subscriber without passing through the database) collected in the experiments. All values are in milliseconds.

Broker Type    Min    Average    P95%    P99%    Max    Std. Dev.
Eclipse        215    331        487     551     612    89
IBM Watson     30     59         87      124     137    15

On the other hand, the discussion of the L_PDS metric is more complicated. The latency between publisher and subscriber when the Database is included is clearly tied to the behavior of the database (and IBM Cloudant seems to introduce a variable delay, i.e., a variable D_CD). For this reason, the L_PDS varies from 230 ms to 30,230 ms, following the bimodal distribution of D_CD.

5.2. Database Access Experiment

The behavior of the database has been studied with another experiment, aimed at determining whether the type of software access chosen in the user application could lead to different results. The experiment consists in changing the Node-RED flow in order to use either the native IBM Cloudant node for Node-RED or the Cloudant REST API accessed through the basic HTTP node for Node-RED. Only the IBM Watson Broker is used. The experiment took 5 h, with new data published every minute. Table 6 shows the percentage of values of the L_CDC metric that are greater than 5 s. Comparing these results with Figure 4, it can be seen that the solution that minimizes the occurrence of the very long delays is the one that uses the REST API for both writing and reading data in the database.

Table 6. Percentage of database accesses with L_CDC greater than 5 s.

                         Read Type
Write Type               Node-RED Node     API REST
Node-RED node            13.3%             12.3%
API REST                 6.3%              5.33%

Referring to Figure 5, and examining in more detail the distribution of L_CDC (in the four configurations of Table 6 and excluding L_CDC > 5 s), it is evident that there are no differences in terms of distribution shape or center. Hence, the different software access methods only impact the occurrence rate of the long delays, leading to the general suggestion to use the REST API for both write and read accesses.


Figure 5. Distribution of the L_CDC metric (with L_CDC < 5 s) in the four configurations of the software access method; (a) write type is Node-RED node and (b) write type is REST API.
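A minimal sketch of the REST-based write/read-back round trip compared in this subsection is given below: the document is written with a POST and immediately read back with a GET, and the elapsed time approximates L_CDC. The endpoint and credentials are placeholders, and the Node-RED-node variants of Table 6 replace these two HTTP calls with the corresponding Cloudant nodes in the flow.

import time
import requests

CLOUDANT_URL = "https://ACCOUNT.cloudant.com"   # placeholder
AUTH = ("API_KEY", "API_SECRET")                # placeholder

def measure_l_cdc_rest(doc, db="testdb"):
    t2 = time.time()                                        # just before the write
    w = requests.post(f"{CLOUDANT_URL}/{db}", json=doc, auth=AUTH, timeout=60)
    w.raise_for_status()
    doc_id = w.json()["id"]
    r = requests.get(f"{CLOUDANT_URL}/{db}/{doc_id}", auth=AUTH, timeout=60)
    r.raise_for_status()
    t4 = time.time()                                        # data available again
    return (t4 - t2) * 1000.0                               # approximates L_CDC, in ms

print("L_CDC = %.1f ms" % measure_l_cdc_rest({"valve_id": 3, "reaction_time_ms": 42}))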


5.3. Database Scalability Experiment

The third part of the experiments is related to the study of the scalability of IoT applications based on Cloud databases. The goal is to determine whether scaling can produce changes with respect to the reference scenario evaluated in the first (long term) experiment. In detail, in the reference experiment only one device uses the Cloud application and only one database access per minute is considered. When the scenario scales, it is expected to have more and more devices publishing their data to the Cloud. In this last experiment, the number of devices that publish data has been increased to 10, 100 and 500. The devices are not synchronized, and they publish data randomly with the same rate, that is, one new data message every minute. In other words, the database must now be accessed 10, 100, and 500 times per minute. Since the deployment of hundreds of devices is not feasible with the budget of this project, the publishing of data is simulated with a software that also takes care of randomizing the transmission instants.

The details of this particular experiment are shown in Figure 6. The path between the user application and the database is interrupted, and the Device Simulator software now generates the messages as if they came from many devices. Another important block is introduced in these experiments, the Rate Limiter. It is always needed to respect the maximum allowed write rate, rw, of the selected database. This block is not used in the long-term experiments, since there was no possibility that a single device sending data every minute could violate database constraints. Conversely, as the number of devices increases, the use of the rate limiter becomes mandatory, since databases usually raise errors and drop data if the maximum access rate is exceeded. In other words, scalable systems must use such blocks, so it is inserted in this experiment. The behavior of the Rate Limiter is simple: it works by dividing the time into slots of one second; it postpones the sending of a write request to the database if the maximum number of writes per second has been reached in the current slot.

Figure 6. Adding the Device Simulator software for testing the scalability of the system under test. The Rate Limiter block is required in order not to violate the Database write rate limit.

Even using a Rate Limiter, there is an upper limit to the scalability of the system, which is due to the database write rate limit and to the average generation period, tg, of the devices

Nmax = rw · tg (8)

Referring to Figure 6, an additional metric can be defined:

• The delay, called D_RL, that the Rate Limiter introduces in order to enforce the write rate limit of the database (this delay varies for each message, since it depends on the previous messages), formally

D_RL = T3 − T3* (9)

where T3* is the instant at which the write request reaches the Rate Limiter (i.e., the time at which the write would have been issued without it).
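To make the scalability setup of Figure 6 concrete, the sketch below emulates the Device Simulator (N virtual devices, each generating one message per minute at a random phase) feeding a one-second-slot Rate Limiter that enforces rw writes per second and computes D_RL for every message. Class and function names are invented for illustration; the real blocks are implemented as Node-RED flows, and the actual database write is replaced here by a comment.

import random
import threading
import time
from queue import Queue

R_W = 10          # max writes per second (IBM Cloudant Lite in this use case)
T_G = 60.0        # average generation period of each device, in seconds
N_DEVICES = 100   # 10, 100 or 500 in the experiments

requests_q = Queue()   # messages flowing from the Device Simulator to the Rate Limiter

def device(dev_id):
    """One simulated device: publishes every T_G seconds with a random phase."""
    time.sleep(random.uniform(0.0, T_G))
    while True:
        requests_q.put({"device": dev_id, "t3_star": time.time()})  # T3*: arrival at limiter
        time.sleep(T_G)

def rate_limiter():
    """Forwards at most R_W writes per one-second slot; later writes are postponed."""
    slot_start = time.time()
    sent_in_slot = 0
    while True:
        msg = requests_q.get()
        now = time.time()
        if now - slot_start >= 1.0:                   # a new one-second slot has started
            slot_start, sent_in_slot = now, 0
        if sent_in_slot >= R_W:                       # slot budget exhausted: postpone
            time.sleep(max(0.0, slot_start + 1.0 - now))
            slot_start, sent_in_slot = time.time(), 0
        sent_in_slot += 1
        t3 = time.time()                              # here the real block issues the DB write
        d_rl = (t3 - msg["t3_star"]) * 1000.0         # D_RL, Equation (9), in milliseconds
        print(f"device {msg['device']}: D_RL = {d_rl:.1f} ms")

if __name__ == "__main__":
    threading.Thread(target=rate_limiter, daemon=True).start()
    for i in range(N_DEVICES):
        threading.Thread(target=device, args=(i,), daemon=True).start()
    time.sleep(5 * 60)    # run for a few minutes (30 min in the actual experiment)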

Electronics 2020, 9, 1435 12 of 17

For the application of the previous discussion to the current use case, the following parameters have been used: rw = 10 writes/s (standard rate for IBM Cloudant Lite), tg = 60 s, N = [10, 100, 500]. Note that, with the given values, Nmax = 600 for the considered use case. The experiment simulates an operation interval of 30 min during two moments (day and night) in order to highlight possible congestion problems of the cloud infrastructure.

The results regarding the D_RL are shown in Table 7 and Figure 7. It is clear that the impact of the Rate Limiter is negligible if the number of devices is far from Nmax, but it becomes dominant when the devices are 500. In that case, delays up to 3 s are possible. As expected, there are no noticeable differences between experiments carried out during the day or during the night, since the Rate Limiter requires a very limited computation.

Table 7. Statistics of the D_RL (delay introduced by the rate limiter block) collected in the scalability experiments. All values are in milliseconds.

Nodes    Conditions    Min    Average    P95%    P99%    Max     Std. Dev.
10       Day           0      5          12      54      68      126
10       Night         0      4          7       60      113     61
100      Day           0      16         97      143     281     67
100      Night         1      17         100     169     301     57
500      Day           2      731        1970    2434    3063    72
500      Night         3      778        1989    2457    3273    76


Figure 7. Distribution of the D_RL metric varying the number of devices and the moment of the day: (a) during the day; (b) during the night.

The results regarding the L_CDC metric in the scalability experiment are shown in Table 8 and Figure 8. The database itself introduces an overall latency that seems to depend on the effective write rate. In detail, the higher the number of devices, the higher the write rate and the lower the database latency. From the L_CDC distribution it appears that the support of the distribution shrinks as the number of devices grows. Although this behavior seems non-intuitive, it may be in accordance with the previous results about the sporadic long delay introduced by IBM Cloudant. Further discussion is carried out in the next section. Last, the moment of the day when the experiment is done has a very limited effect; it appears to have some influence only when the number of devices is low.

Table 8. Statistics of the L_CDC (latency introduced by the database) collected in the scalability experiments. All values are in milliseconds.

Nodes    Conditions    Min    Average    P95%    P99%    Max     Std. Dev.
10       Day           172    308        487     716     1622    126
10       Night         164    252        360     456     682     61
100      Day           107    218        313     409     1432    67
100      Night         109    219        309     399     1238    57
500      Day           101    186        259     327     2033    72
500      Night         104    196        273     377     2657    76


Figure 8. Distribution of the LCDC (latency introduced by the database) collected in the scalability experiments varying the number of devices and the moment of the day: (a) during the day; (b) during the night.


5.4. Final Discussion about Results and Effectiveness of the Proposed Methodology

The proposed experimental methodology for the characterization of scalable industrial applications based on Cloud services and databases has been applied to an example architecture, producing significant results and important insights. It is worth noting that no optimization or configuration tuning has been performed, so the results are just a snapshot of the current situation and could likely be improved. However, the important point is that the proposed procedure is able to highlight the inner behavior of architecture blocks that are usually treated as black boxes.


The primary result is a full characterization of all the components of the architecture under test, which allows the end user to determine its weak and strong points. Modifications or improvements can then be quantitatively evaluated using the proposed procedures.
The first secondary result is the highlighting of the database behavior. More specifically, the IBM Cloudant database has a variable delay that depends on the write rate. Contrary to usual expectations, IBM Cloudant reduces its latency as the write rate increases. With one write per minute, the delay can be very long, up to 30 s in 5% of the cases. When the number of writes increases, the delay decreases, reaching a minimum when the write rate approaches the maximum allowed one. There could be several explanations for this behavior; the most probable is the presence, inside the database interface, of a timeout task that waits some time (for additional incoming requests) before executing the write. When the access is sporadic, the entire waiting period must expire (i.e., this is the long delay), while if more requests are received, the write operations are performed as a batch as soon as the number of pending requests reaches a threshold.
Another secondary result is the evidence of the indirect effect of the database write limitation on scalability. While it is clear that the write rate limit is the upper bound of the scaling range of the considered application, the effect that such a limit has on delays is clearly visible only with the proposed methodology. In particular, every single device experiences a variable delay that is a function of the total number of devices: a higher number of devices means higher delays.
The last secondary result concerns the evaluation of the real-time capability of the architecture under test. The extensive statistics obtainable with the proposed method can quantify the reaction time of the entire loop (i.e., LPDS, from machine to machine passing through the Cloud and the database). For instance, the use case architecture can be adopted for slowly changing industrial applications only, given that the longest reaction time can be greater than 30 s.
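The batching conjecture formulated above can be sketched as follows. The snippet is purely an illustration of the hypothesized front-end behavior, not IBM Cloudant's documented internals; the 30 s timeout and the batch threshold of 50 requests are assumed figures.

```python
import threading
import time

class BatchingWriter:
    """Toy model of the conjectured database front-end: incoming write requests
    are buffered and flushed either when `batch_size` requests have accumulated
    or when `timeout_s` has elapsed since the first buffered request.
    Both parameters are assumed values, not documented Cloudant settings."""

    def __init__(self, flush, batch_size=50, timeout_s=30.0):
        self.flush = flush
        self.batch_size = batch_size
        self.timeout_s = timeout_s
        self.buf = []
        self.lock = threading.Lock()
        self.timer = None

    def write(self, doc):
        with self.lock:
            self.buf.append(doc)
            if len(self.buf) >= self.batch_size:
                self._flush_locked()          # busy period: almost no extra delay
            elif self.timer is None:
                # Sporadic traffic: the whole timeout must expire before the
                # flush, which would explain the long delays at low write rates.
                self.timer = threading.Timer(self.timeout_s, self._on_timeout)
                self.timer.daemon = True
                self.timer.start()

    def _on_timeout(self):
        with self.lock:
            self._flush_locked()

    def _flush_locked(self):
        if self.timer is not None:
            self.timer.cancel()
            self.timer = None
        if self.buf:
            self.flush(self.buf)
            self.buf = []

# Demo with a shortened timeout: a single isolated write waits the full timeout.
w = BatchingWriter(flush=lambda b: print(f"flushed {len(b)} document(s)"), timeout_s=0.5)
w.write({"ts": time.time()})
time.sleep(1.0)   # give the timer a chance to fire
```

Under this model an isolated write always pays the full timeout, whereas a sustained stream of requests is flushed as soon as the threshold is reached, which is consistent with the observed decrease of LCDC as the write rate grows.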

6. Demo of the Use Case and Last Remarks

The experimental setup used for the experiments has been transformed into a demo bench that Metalwork uses at trade fairs. In particular, the demo system (shown in Figure 9) was operated during 2019 at the Hannover Fair and the SPS Fair in Europe. Visitors can interact with the EB80 pneumatic valve system (e.g., causing malfunctions like an "air leakage") and see the feedback of the Cloud-based predictive maintenance in the dashboard.

Figure 9. Real demo of the predictive maintenance systems based on Cloud services following the architecture of the use case in Figure 3.


During the show, it was possible to carry out some additional experiments about the quantity of data exchanged between the field devices and the Cloud. Based on a 10-day-long experiment, the EB80 publisher in "demo mode" (which, in order to ease the visual interaction with visitors, means a faster publishing rate of tg = 10 s and a full parameter upload) and the subscriber (i.e., the dashboard panel) consume about 16 Mbytes of aggregate data per day. This last result is reported just to underline that the amount of published and subscribed data should also be taken into account in the scalability analysis.
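A rough back-of-the-envelope estimate, based only on the figures reported above (16 Mbytes per day of aggregate traffic at tg = 10 s) and on the assumption of a slower regular publishing period of tg = 60 s, gives the average data volume per publish/subscribe cycle and its naive linear extrapolation to larger fleets:

```python
# Figures taken from the demo: 16 Mbytes/day of aggregate publish + subscribe
# traffic generated by one EB80 publishing every tg = 10 s.
MB = 1024 * 1024
daily_bytes = 16 * MB
cycles_per_day = 24 * 3600 // 10              # 8640 publish/subscribe cycles
bytes_per_cycle = daily_bytes / cycles_per_day
print(f"~{bytes_per_cycle:.0f} bytes (~{bytes_per_cycle / 1024:.1f} KiB) per cycle")

# Naive linear extrapolation to N devices at an assumed regular period of 60 s.
for n in (10, 100, 500):
    daily_mb = bytes_per_cycle * (24 * 3600 / 60) * n / MB
    print(f"{n:4d} devices: ~{daily_mb:.0f} Mbytes/day of aggregate traffic")
```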

7. Conclusions

The Industrial IoT, or IIoT, is one of the most important parts of Industry 4.0. Thanks to it, the so-called cyber-physical space can be realized, and the overall efficiency and quality of industrial processes can be improved by means of new services. The large amount of data generated by these industrial applications is usually stored in Cloud databases, leveraging the DBaaS approach. Therefore, the timeliness of Cloud databases is a main concern in any IIoT solution. Unfortunately, real-time behavior is not easily ensured on large distributed systems, nor have accurate and effective methodologies been established to evaluate time-related performance. In this work, the authors propose a measurement methodology based on an abstract view of IIoT control applications and a set of metrics able to objectively evaluate the time performance of such systems. As a result, the Cloud is treated as a gray box and details about the Cloud database can be estimated. The suggested approach can easily be used for "stress tests" in order to verify the scalability of the system under evaluation. The on-the-field validation of the proposed method has been carried out by means of an extensive measurement campaign based on a real system dealing with predictive maintenance. More in detail, this real-world use case includes a Siemens S7 PLC, an EB80 electro-pneumatic actuator from Metalwork, and the IBM Bluemix Cloud platform, offering the IBM Cloudant database; data are transferred using the MQTT message-oriented protocol. Thanks to the proposed measurement strategy, some interesting peculiarities of the real system under evaluation are highlighted; for instance, the way the Cloud database stores and retrieves records has a large impact on the overall time performance, and, since different optimization techniques are probably applied depending on the estimated activity, this delay may vary hugely when the number of nodes changes.

Author Contributions: Conceptualization, P.F.; Data curation, S.R. and E.S.; Formal analysis, E.S.; Funding acquisition, A.F.; Investigation, M.P. and A.D.; Methodology, P.F.; Project administration, P.F. and A.F.; Software, S.R., and P.B.; Supervision, A.F.; Validation, P.F.; Writing—original draft, P.F., S.R. and E.S. All authors have read and agreed to the published version of the manuscript. Funding: The research has been partially funded by University of Brescia and MoSoRe Project. Acknowledgments: The authors would like to thank eLux laboratory and the University of Brescia for hosting the experimental setup. Conflicts of Interest: The authors declare no conflict of interest.


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).