Ref. Ares(2019)6876701 - 06/11/2019

A cybersecurity framework to GUArantee Reliability and trust for Digital service chains

www.guard-project.eu

D2.1 Vision, State of the Art and Requirements Analysis

Editor: Domenico Striccoli (CNIT)
Version: 1.0
Status: Final
Delivery date: 06/11/2019
Dissemination level: PU (Public)

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No 833456

D2.1 Vision, State of the Art and Requirements Analysis V1.0

Deliverable Factsheet

Grant Agreement No.: 833456
Project Acronym: GUARD
Project Title: A cybersecurity framework to GUArantee Reliability and trust for Digital service chains
Call: H2020-SU-ICT-2018-2020 “Cybersecurity”
Topic: SU-ICT-01-2018 “Dynamic countering of cyber-attacks”
Start date: 01/05/2019
Duration: 36 months

Deliverable Name: D2.1 Vision, State of the Art and Requirements Analysis V1.0
Related WP: WP2 System Architecture and Continuous Integration
Due Date: 31/10/2019

Editor: Domenico Striccoli (CNIT)
Contributor(s): Matteo Repetto (CNIT), Alessandro Carrega (CNIT), Giuseppe Piro (CNIT), Gennaro Boggia (CNIT), Armend Duzha (MAGG), Mauro Coletta (MAGG), Andrea Romanelli (MAGG), Antonino Albanese (ITL), Maurizio Barbaro (ITL), Jose Ignacio Carretero (FIWARE), Giovanni Coppa (WOB), Cornelio Hopmann (WOB), Peter Laitner (M&S), Damir Haskovic (M&S), Markus Wurzenberger (AIT), Florian Skopik (AIT), Manos Papoutsakis (FORTH), Nicolas Kylilis (8BELLS), Joanna Kolodziej (NASK), Denitsa Kozhuharova (LIF), Kalina Ruseva (LIF)
Reviewer(s): Joanna Kolodziej (NASK)
Approved by: All partners

Disclaimer

This document reflects the opinion of the authors only. While the information contained herein is believed to be accurate, neither the GUARD consortium as a whole, nor any of its members, their officers, employees or agents, makes any warranty that this material is capable of use, or that use of the information is free from risk, and none of them accepts any liability for loss or damage suffered by any person in respect of any inaccuracy or omission. This document contains information which is the copyright of the GUARD consortium, and it may not be copied, reproduced, stored in a retrieval system or transmitted, in any form or by any means, in whole or in part, without written permission. The commercial use of any information contained in this document may require a license from the proprietor of that information. The document must be referenced if used in a publication.

© GUARD 2019 Page 2 of 128


Executive Summary

The main objectives of this deliverable are (i) to describe the project vision, the applicability scenarios, and the demonstration use cases; (ii) to analyse the State of the Art and current market solutions; and (iii) to identify and describe the technical and architectural requirements of the GUARD framework.

The description of the project concept, applicability scenarios and demonstration use cases elaborates on how the GUARD concept is going to change the current practice in operating large distributed environments from the cybersecurity perspective, with a specific focus on virtualized and cyber-physical systems. It reviews existing commercial products and solutions, and points out how the GUARD concept will provide advanced assurance and protection, allowing the involved actors to detect and to respond quickly and effectively to sophisticated cyber-attacks. Use cases are introduced to validate and demonstrate the level of maturity of the project, as well as to directly involve relevant stakeholders for concrete business planning.

The analysis of the scientific literature, research initiatives and current market solutions is indispensable to select candidate technologies and approaches, and to identify gaps and technical priorities to be tackled in the GUARD context. Outcomes from this task will also be used to outline the exploitation strategy (e.g., standardization initiatives, market differentiation, competitive and SWOT analysis, main targets and stakeholders for tailored communication and dissemination activities, etc.). Topics of interest include (but are not limited to) fast paths for packet processing, security information and event management, identity and access management, data models and APIs for distributed systems, techniques for threat detection, user-machine interfaces, and user awareness and involvement in cybersecurity practice.
The requirements analysis aims to identify, collect and analyse the set of specific requirements, and the associated engineering tools, that will guide the definition and development of the GUARD platform. The analysis is based on the relevant application scenarios and on the identified gaps and competitive advantages. System requirements take into particular consideration: a) the technical and administrative heterogeneity of distributed infrastructures; b) the need for human interfaces and tools for immediate response; and c) the integration with emerging architectures and existing middleware for distributed systems. This analysis is carried out by taking into account the interests of all the involved stakeholders/actors and their interactions.


Document History

Version | Date | Author(s) | Comments
0.1 | 18/09/2019 | Domenico Striccoli (CNIT) | ToC
0.2 | 23/09/2019 | Domenico Striccoli (CNIT), Matteo Repetto (CNIT) | First draft
0.3 | 24/09/2019 | ALL partners | Contribution to Section 4
0.4 | 03/10/2019 | ALL partners | Contribution to Sections 3 and 4
0.5 | 14/10/2019 | ALL partners | Contribution to Section 2
0.6 | 23/10/2019 | Kalina Ruseva (LIF) | Contribution to Section 4 and document ready for internal review
0.7 | 25/10/2019 | Joanna Kolodziej (NASK) | Internal review and comments for improvements
0.8 | 30/10/2019 | Domenico Striccoli (CNIT), Nicolas Kylilis (8BELLS) | Addressing the comments from the internal review
1.0 | 06/11/2019 | Domenico Striccoli (CNIT) | Final version ready for submission


Contributors

Organization | Author(s) | E-Mail
CNIT | Domenico Striccoli | [email protected]
CNIT | Matteo Repetto | [email protected]
CNIT | Alessandro Carrega | [email protected]
CNIT | Giuseppe Piro | [email protected]
CNIT | Gennaro Boggia | [email protected]
MAGG | Armend Duzha | [email protected]
MAGG | Andrea Romanelli | [email protected]
MAGG | Mauro Coletta | [email protected]
ITL | Antonino Albanese | [email protected]
ITL | Maurizio Barbaro | [email protected]
FIWARE | Jose Ignacio Carretero | [email protected]
WOB | Giovanni Coppa | [email protected]
WOB | Cornelio Hopmann | [email protected]
M&S | Peter Leitner | [email protected]
M&S | Damir Haskovic | [email protected]
AIT | Markus Wurzenberger | [email protected]
AIT | Florian Skopik | [email protected]
FORTH | Manos Papoutsakis | [email protected]
8BELLS | Nicolas Kylilis | [email protected]
NASK | Joanna Kolodziej | [email protected]
LIF | Denitsa Kozhuharova | [email protected]
LIF | Kalina Ruseva | [email protected]


Table of Contents

EXECUTIVE SUMMARY ...... 3

DOCUMENT HISTORY ...... 4

CONTRIBUTORS ...... 5

1 INTRODUCTION ...... 15

1.1 Purpose and Scope ...... 15
1.2 Contribution to other Deliverables ...... 15
1.3 Structure of the Document ...... 16

2 PROJECT CONCEPT ...... 17

2.1 The digital economy and the arising of digital services ...... 17
2.2 Outdated security paradigms for digital services ...... 18
2.3 Evolving threats landscape ...... 20
2.4 Challenges and emerging trends ...... 22
2.5 Two-fold cyber-security dimension for digital business chains ...... 25
2.6 Vision: beyond infrastructure-centric paradigms ...... 26
2.7 Approach ...... 28
2.8 Key values ...... 31
2.8.1 Scope ...... 31
2.8.2 Efficiency ...... 32
2.8.3 Detection of complex multi-vector attacks and identification of new threats ...... 33
2.8.4 Awareness ...... 33
2.9 Application scenarios ...... 34
2.9.1 The automotive scenario ...... 34
2.9.2 The FIWARE Lab scenario ...... 37
2.9.3 The Healthcare Marketplace scenario ...... 39
2.9.4 The metropolitan city scenario ...... 43
2.9.5 The smart grid scenario ...... 45
2.9.6 The smart city scenario ...... 48
2.9.7 The wearable devices scenario ...... 51

3 STATE OF THE ART ANALYSIS ...... 54

3.1 Introduction ...... 54
3.2 Fast paths for packet processing ...... 54
3.2.1 Programmable data planes for packet processing ...... 54
3.2.2 Technologies for building fast data planes ...... 56
3.3 Security Information ...... 60
3.3.1 Scientific literature ...... 60


3.3.2 Research Initiatives ...... 61
3.3.3 Current market solutions ...... 64
3.4 APIs and Data models ...... 65
3.4.1 APIs ...... 65
3.4.2 Common API types ...... 65
3.5 Data protection and Identity and Access Management ...... 71
3.5.1 Inputs from the scientific literature ...... 71
3.5.2 Research initiatives ...... 74
3.5.3 Reference Standard and relevant Market Solutions ...... 74
3.5.4 Candidate technologies and missing gaps ...... 75
3.6 Machine learning and other techniques for threat detection ...... 75
3.6.1 Scientific literature ...... 76
3.6.2 Current market solutions and research initiatives ...... 77
3.6.3 Gaps and GUARD Solutions ...... 78
3.7 Data Inspection Tools ...... 79
3.7.1 nDPI ...... 79
3.7.2 Open Virtual Switch (OVS) ...... 80
3.7.3 AlienVault - AlienVault OSSIM ...... 81
3.8 Human machine interfaces and cybersecurity practices ...... 81
3.8.1 Behavioural, social and human aspects ...... 84
3.8.2 User awareness and involvements in cybersecurity ...... 85
3.9 Conclusions ...... 86

4 REQUIREMENTS ANALYSIS ...... 88

4.1 Methodology ...... 88
4.2 Taxonomy ...... 90
4.3 Requirements list and description ...... 91

REFERENCES ...... 115


List of Figures

Figure 1. Business chains entail multiple digital processes over the whole globe. ...... 17
Figure 2. Digital services are composed of complex chains of software, processes, and devices. ...... 18
Figure 3. Common security models are based on the concept of "security perimeter," which may include remote resources and sites. ...... 19
Figure 4. Comparison between typical times to compromise a system and to detect an attack. The bars report the relative frequency, while the lines represent cumulative distribution functions. The shaded areas indicate the probability that a system is compromised in less than a day and the probability that the time to detect an attack is longer than a week. ...... 21
Figure 5. The on-going evolution from infrastructure-centric to service-centric cyber-security architectures. ...... 26
Figure 6. A forward-looking perspective for the evolution of cyber-security architectures for digital services. ...... 27
Figure 7. The GUARD concept revolves around the idea of improving awareness for improving response. ...... 27
Figure 8. Wrapping software programs with suitable abstractions and models is necessary to automate deployment and management. ...... 28
Figure 9. Main APIs envisioned in the realization of the GUARD concept. ...... 29
Figure 10. GUARD addresses the complexity and multi-vector nature of recent cyber-security threats by moving from current narrow-scope silos to a more integrated multi-vendor layered and open framework. ...... 31
Figure 11. Composition of digital services and infrastructures for the automotive sector. ...... 35
Figure 12. FIWARE Lab can be used to deploy private instances of GEs or to connect to public services. ...... 37
Figure 13. Selection of digital services from resource catalogues. ...... 39
Figure 14. Information Exchange System for the Metropolitan City. ...... 43
Figure 15. Operation of the Smart Grid. ...... 46
Figure 16. Smart City solution in Wolfsburg. ...... 48
Figure 17. Remote AR services based on edge computing. ...... 52
Figure 18. CAESAIR workflow. ...... 63
Figure 19. 8Bells vDPI Dashboard. ...... 79
Figure 20. Common Firewall Setup. ...... 80
Figure 21. Stakeholders of the GUARD framework. They are targets of one or more requirements. ...... 89

List of Tables

Table 1. Main technologies for kernel hooks. ...... 56
Table 2. Main technologies for data planes in user space. ...... 57
Table 3. Main technologies for data plane in kernel space. ...... 58
Table 4. Main technologies for fully programmable data planes. ...... 58
Table 5. Comparison of control and configuration protocols. ...... 59


Acronyms and Abbreviations

5G 5th Generation

AA Active Ageing

AAL Ambient Assisted Living

ABAC Attribute-Based Access Control

ABSC Attribute-Based SignCryption

ABE Attribute Based Encryption

ACM Association for Computing Machinery

AD Anomaly-based Detection

ALG Application Level Gateway

AMQP Advanced Message Queuing Protocol

ANN Artificial Neural Network

API Application Programming Interface

APT Advanced Persistent Threat

AR Augmented Reality

BPF Berkeley Packet Filter

CAPEC Common Attack Pattern Enumeration and Classification

CERT Computer Emergency Readiness Team

CFG Control Flow Graph

CBAC Claims Based Access Control

CBT Computer-Based Training

CoA Courses of Action

CP-ABE Ciphertext-Policy Attribute-Based Encryption

CPS Cyber-Physical System

CPU Central Processing Unit

CS Cyber-Security


CSIRT Computer Security Incident Response Team

CTI Cyber Threat Intelligence

CVE Common Vulnerabilities and Exposures

CVSS Common Vulnerability Scoring System

CybOX Cyber Observable Expression

DDoS Distributed Denial of Service

DDR Double Data Rate

DLPI Data Link Provider Interface

DMA Direct Memory Access

DNS Domain Name System

DoS Denial of Service

DPI Deep Packet Inspection

DSO Distribution System Operator

eBPF extended Berkeley Packet Filter

ECC Elliptic Curve Cryptography

ECDSA Elliptic Curve Digital Signature Algorithm

FIDC Future Internet and Distributed Cloud

FPGA Field Programmable Gate Array

GUI Graphical User Interface

HCI Human Computer Interaction

HIDS Host-based Intrusion Detection System

HMAC Hashed Message Authentication Code

HMI Human Machine Interface

HMM Hidden Markov Model

HTTP HyperText Transfer Protocol

I/O Input/Output

IAM Identity and Access Management


IaaS Infrastructure as a Service

IBAC Identity Based Access Control

IBE Identity Based Encryption

ICT Information Communication Technology

IDS Intrusion Detection System

IED Intelligent Electronic Devices

IT Information Technology

IoT Internet of Things

IODEF Incident Object Description Exchange Format

IPS Intrusion Prevention System

ISO/OSI International Standardization Organization/Open System Interconnection

JWT JSON Web Token

KP-ABE Key Policy-Attribute-Based Encryption

LDAP Lightweight Directory Access Protocol

LoRaWAN Long Range Wide Area Network

LTE Long Term Evolution

MAEC Malware Attribute Enumeration and Characterization

MAS MultiAgent System

MCI Mild Cognitive Impairment

MEC Mobile Edge Computing

MILE Managed Incident Lightweight Exchange

ML Machine Learning

MQTT MQ Telemetry Transport

NAC Network Access Control

NBAC Authentication Based Access Control

NFV Network Function Virtualization

NIDS Network-based Intrusion Detection System


NIST National Institute of Standards and Technology

NS Network Slice

NP Network Provider

NVD National Vulnerability Database

OC Operational Centre

OEM Original Equipment Manufacturer

OM Owner of the Mall

OPP Open Packet Processor

OS Operating System

OSINT Open Source Intelligence

OTP One Time Password

OWL Web Ontology Language

PBAC Policy Based Access Control

PCI Peripheral Component Interconnect

POS Point of Sale

PUF Physically Unclonable Function

RAM Random Access Memory

RBAC Role Based Access Control

REST Representational State Transfer

RID Real-time Inter-network Defence

RPC Remote Procedure Call

RSU RoadSide Unit

SD Signature Detection

SDN Software Defined Network

SEIS Self-Efficacy in Information Security

SIEM Security Information and Event Management

SPID Sistema Pubblico di Identità Digitale


SOAP Simple Object Access Protocol

SOC Security Operation Centre

SPA Stateful Protocol Analysis

SR/IOV Single Root Input/Output Virtualization

SSO Single Sign On

STIX Structured Threat Information eXpression

SVM Support Vector Machine

TAXII Trusted Automated eXchange of Indicator Information

TCAM Ternary Content-Addressable Memory

TCP Transmission Control Protocol

TCP/IP Transmission Control Protocol/Internet Protocol

TLS Transport Layer Security

TSO Transmission System Operator

TSP Traffic Services Provider

TTP Tactics, Techniques, and Procedures

UDP User Datagram Protocol

UEM Unified Endpoint Management

UI User Interface

V2I Vehicle-to-Infrastructure

V2V Vehicle-to-Vehicle

VM Virtual Machine

VPN Virtual Private Network

WiFi Wireless Fidelity

WLAN Wireless Local Area Network

WMN Wireless Mesh Network

XACML eXtensible Access Control Markup Language

XML eXtensible Markup Language


ZBAC authoriZation Based Access Control

ZMTP ZeroMQ Message Transport Protocol


1 Introduction

1.1 Purpose and Scope

The purpose of deliverable D2.1 “Vision, State of the Art and Requirements Analysis” is to collect the outcomes of tasks T2.1-T2.4 of WP2. This document provides a description of the GUARD vision and approach, a State-of-the-Art analysis, and a description of the technical requirements that will drive the implementation of the GUARD platform. It is based on the analysis of business needs and market trends, as well as on the main technological gaps identified in the preliminary phase of the project.

The description of the project concept presents the most important changes that the GUARD project will bring to distributed environments from the cybersecurity perspective. It points out how the GUARD concept will provide advanced assurance and protection, allowing the involved actors to detect and to respond quickly and effectively to sophisticated cyber-attacks. Some relevant application scenarios are also presented, to highlight the GUARD features most suitable for each of them and the improvements expected from the application of the GUARD framework.

The analysis of the scientific literature, research initiatives and current market solutions is necessary to identify the most suitable candidate technologies and approaches, the gaps, and the technical priorities that have to be tackled in the GUARD context.
This task focuses on several topics of interest: fast paths for packet processing, for the development of efficient monitoring, inspection, and enforcement tools; Cyber Threat Intelligence (CTI) techniques, needed to correlate external threat information and feed detection algorithms; Machine Learning and other threat-detection techniques, for the automatic recognition of patterns of known attacks and of deviations from normal system behaviour in the case of unknown attacks; open APIs for retrieving the security context of running services and for detecting the presence of, and notifying access to, private and sensitive data; and end-user interfaces that allow appropriate and timely actions to be taken in response to the detection of attacks.

The identification, collection and analysis of the GUARD technical requirements guide the definition and development of the GUARD platform. The requirements analysis is based on the relevant application scenarios and on the gaps and competitive advantages discussed in this document. The collected requirements take into account different aspects: the technical and administrative heterogeneity of distributed infrastructures, the need for human interfaces and tools for immediate response, and the integration with emerging architectures and existing middleware for distributed systems. The requirements and their descriptions are the result of continuous interaction with all consortium partners, aimed at identifying all the involved stakeholders/actors, their interests in the GUARD functionalities, and the related priorities.

1.2 Contribution to other Deliverables

The topics developed in this deliverable are used as input for tasks in WP3, WP4 and WP5. The description of the main features of the GUARD project and of the application scenarios in task T2.1 will contribute to deliverables D3.2, D3.3 and D3.4. Indeed, the description of the GUARD environment and application scenarios is an indispensable prerequisite for the design and development of the web-based tools for human interaction with the GUARD framework that will be developed in D3.2, of the tools for creating secure business chains by applying user policies to traffic flows and data distribution in D3.3, and of the modules to share information and consume CTI in D3.4. Task T2.1 also contributes to WP4, since it charts the path for the monitoring, inspection, and enforcement mechanisms of WP4, in terms of fast and lightweight processing of network


packets, log data and system calls, advanced configuration and programmability, the collection and creation of the security context, and techniques for identity management and access control. The State-of-the-Art analysis in T2.2 forms the basis of D3.2 and D3.4, because it allows the selection of the most suitable candidate technologies and approaches to develop the GUARD modules for human-machine interfaces in D3.2 and for CTI in D3.4. It is also an input for deliverables D4.1 and D4.3, contributing to the development of packet processing, identity management and access control techniques in D4.1, and to the definition of interfaces for programmability in D4.3. It further contributes to deliverables D5.1 and D5.2, to identify and implement attack detection and threat identification strategies. The analysis and identification of the GUARD requirements in T2.3 is an important starting point for defining the reference architecture in deliverable D2.2, which will be used as a blueprint by the other WPs to define and develop the GUARD components. It provides important guidelines for developing the features, modules and tools of the GUARD framework that will be part of deliverables D3.2, D3.3 and D3.4, and of work packages WP4 and WP5.

1.3 Structure of the Document

This document is structured as follows. Section 2 describes the project concept: it explains the structure of digital services, with their new challenges and emerging trends and paradigms, how they relate to digital business chains, and the approach followed by the GUARD project to realize this new vision of service-centric architectures, with the goal of guaranteeing a high security level for whole service chains. In addition, it includes some relevant application scenarios that highlight the main GUARD features, how they can be applied to each scenario, and the improvements brought by the application of the GUARD framework.

Section 3 provides an overview of the current state of the art in the scientific literature, research initiatives and market solutions, aiming to identify and select the most suitable technologies and approaches, the gaps, and the technical priorities for the design and implementation of the GUARD framework. The main topics tackled in this section comprise packet processing, data correlation, detection algorithms, data models and interfaces, threat detection, and identity management and access control strategies. For the sake of brevity, it does not go into deep detail; a much more detailed analysis is included in Annex A, which can be found at: https://guard-project.eu/wpcontent/uploads/2019/11/Annex-A-Detailed-SoA-Analysis.pdf

Section 4 describes the technical requirements that contribute to the definition and development of the GUARD framework. The requirements take into account: (i) the structure, heterogeneity and complexity of the service chains; (ii) the most important architectural aspects needed to tackle new and advanced threats and vulnerabilities; (iii) the integration with existing architectures and distributed systems; and (iv) the diffusion and management of sensitive information.


2 Project concept

2.1 The digital economy and the arising of digital services

Data will be the key driver of the digital economy. Novel digital products and services are expected to create, process, share, and consume data and content in a digital continuum, blurring the frontiers between application domains and breaking the current closed silos of information. Examples of such ecosystems have already been identified for material sciences, energy, manufacturing and logistics, health care, and smart cities. Today most business processes follow a fully-digital workflow, including design, implementation, creation, purchase, production, trading, delivery, and after-sales services. Novel digital products and services emerge in business ecosystems at an unprecedented rate, making monolithic ICT systems unsuitable to follow the market dynamics. The digital economy is therefore re-writing the rules of business, creating new models where services from multiple vendors are connected together to create, process, share, and distribute data and content. The ICT industry is already tackling these new challenges by introducing new architectures and patterns that bring more agility to the creation and management of new services and products, based on the provisioning of basic digital services that can be easily combined to create more complex business chains.

Figure 1. Business chains entail multiple digital processes over the whole globe.

From a technical perspective, the creation of a business chain consists of the logical interconnection (not necessarily with a strictly linear pattern) of several processes, software components, and devices, and of feeding them with relevant user data and context across multiple technological and administrative domains. Consider, for example, an industrial supply chain that encompasses manufacturing, transportation, and assembly. Many business models are today shaped around just-in-time models and the tight scheduling of all


industrial and handling processes, so as to ensure regularity of supply and to avoid the need for large storage of components and products. This requires capillary control over all stages and the sharing of information, events and data among the different actors involved in the chain. Moreover, one of the most valuable keys to boosting innovation and new revenue streams is unprecedented service agility, which means that digital services and business chains are expected to emerge and dissolve much faster than traditional value-creation networks.

Figure 2. Digital services are composed of complex chains of software, processes, and devices.
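The logical interconnection of processes, software, and devices described above can be sketched as a small directed graph in which edges crossing administrative domains mark the trust boundaries discussed later in this section. All class, service, and domain names below are purely illustrative and are not part of any GUARD interface:

```python
# Hypothetical sketch: a digital business chain modelled as a directed graph
# of services, each annotated with its administrative domain. Names and
# fields are invented for illustration.
from collections import defaultdict

class ServiceChain:
    def __init__(self):
        self.domain = {}                # service name -> administrative domain
        self.edges = defaultdict(set)   # service name -> downstream services

    def add_service(self, name, domain):
        self.domain[name] = domain

    def link(self, upstream, downstream):
        self.edges[upstream].add(downstream)

    def cross_domain_links(self):
        """Links that cross administrative domains: the points where a
        single 'security perimeter' no longer applies."""
        return [(u, d) for u, targets in self.edges.items()
                for d in targets if self.domain[u] != self.domain[d]]

# A toy chain: factory sensor gateway -> cloud broker -> analytics -> ERP.
chain = ServiceChain()
chain.add_service("sensor-gw", "factory")
chain.add_service("broker", "cloud-provider")
chain.add_service("analytics", "cloud-provider")
chain.add_service("erp", "enterprise")
chain.link("sensor-gw", "broker")
chain.link("broker", "analytics")
chain.link("analytics", "erp")

# Two of the three links cross a domain boundary.
assert chain.cross_domain_links() == [("sensor-gw", "broker"), ("analytics", "erp")]
```

Even in this four-node toy chain, most links cross administrative boundaries, which is exactly why perimeter-based protection becomes hard to apply.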

Future services will be created with unprecedented degrees of freedom in their dimensions: digital infrastructures, software technologies, and data. They will be implemented using distributed frameworks and patterns designed to facilitate integration, deployment and management. Convergence among existing software paradigms, such as cloud computing, software-defined networking, and the Internet of Things (IoT), is expected for this purpose, leveraging automation and dynamic composition through service-oriented and everything-as-a-service models applied to cyber-physical systems. As a matter of fact, micro-services, web services, service-oriented architectures, programming models, distributed middleware, and software orchestration are already present in the latest frameworks and market technologies for the software, CPS, and telecommunication industries. In Europe, the FIWARE initiative has led to the definition of open and royalty-free APIs for development and deployment, while the International Data Spaces association is now working on a more complete architecture for the Industrial Data Space [1]. ETSI has defined a complete architecture for network function virtualization, which is expected to be largely adopted by upcoming 5G infrastructures [2]. TOSCA is currently the de-facto standard for describing the structure of cloud applications [3].
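To make the descriptive role of TOSCA concrete, the sketch below mirrors, in simplified Python form rather than actual TOSCA YAML, how a service template lists node templates and their relationships. The node type names follow the TOSCA Simple Profile, but the structure is heavily abridged for illustration:

```python
# Abridged, illustrative rendering of a TOSCA-style service template as a
# Python dict (real templates are YAML; see the OASIS TOSCA Simple Profile).
template = {
    "tosca_definitions_version": "tosca_simple_yaml_1_3",
    "topology_template": {
        "node_templates": {
            "web": {
                "type": "tosca.nodes.WebServer",
                "requirements": [{"host": "vm"}],   # web runs on vm
            },
            "vm": {
                "type": "tosca.nodes.Compute",
                "capabilities": {"host": {"properties": {"num_cpus": 2}}},
            },
        }
    },
}

# An orchestrator can walk the template to recover the dependency order:
nodes = template["topology_template"]["node_templates"]
deps = {name: [list(r.values())[0] for r in spec.get("requirements", [])]
        for name, spec in nodes.items()}
assert deps == {"web": ["vm"], "vm": []}   # vm must be deployed before web
```

The point of such templates is precisely the one made in the text: the structure of the application becomes machine-readable, so deployment and management can be automated across providers.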

2.2 Outdated security paradigms for digital services

The need for ever-more agility and adaptability of software to the evolving context, at both the infrastructure and the service layer, is pushing towards self-adaptability in execution environments and self-awareness in programs. Fully-automated software and environments will evolve and morph at run-time, without the explicit control of software engineers. The dark side of this evolution is the risk of large unpredictability,


due to non-deterministic, opaque and partially inscrutable service topologies. This raises questions about the overall behaviour of the system, the location of personal and sensitive data, the sanity of the software, the availability of the whole service, and, most of all, the ability to perform quick remediation and mitigation actions in case something goes wrong. Unfortunately, cyber-security paradigms have not advanced at the same pace. The growing complexity, the multi-domain nature, outdated security paradigms, and the scarce automation of related processes make digital services ever more vulnerable to security breaches and ever less trustworthy. New architectures and usage models for building digital services have already revealed the substantial inadequacy of legacy security appliances to effectively protect distributed and heterogeneous systems against cyber-threats [4]. Common security paradigms for the enterprise are based on the deployment of discrete security appliances that build a security perimeter enclosing all valuable ICT assets. Firewalls, IPS/IDS, and antivirus are commonly used today, often integrated into enterprise-wide tools that give visibility over the network and devices. The presence of portable devices already represents a problem, since such devices might get compromised outside the organization, but are typically considered trusted once brought back inside. When enterprise processes include ICT resources from the outside, breaches must be created in the perimeter. VPNs and TLS are commonly used to extend the security perimeter to remote resources, but this may weaken the defence if such resources are not fully trusted. This is the typical case for IoT devices, which are often prone to tampering, but also for cloud services deployed in third-party infrastructures.

Cloud providers typically offer only simple firewalling services, tailored to software-defined environments (e.g., with features such as “security groups” to bundle all resources belonging to the same tenant), but these cannot be managed in a consistent way across multiple providers. Riding the virtualization wave, several vendors of security appliances are now delivering software instances of their applications, suitable for deployment in cloud environments. This helps to build a uniform and integrated solution across multiple sites, but comes at the cost of additional resource consumption (i.e., VMs or containers) and overhead. For this reason, this approach typically cannot be adopted for IoT devices and other resource-constrained devices.
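The "security group" feature mentioned above can be illustrated with a minimal, provider-agnostic sketch: a group is just a set of ingress rules shared by all instances assigned to it. The `Rule` fields and the `allowed()` helper are invented for illustration and do not correspond to any cloud provider's actual API:

```python
# Minimal sketch of the "security group" idea: instances in the same group
# share a set of ingress rules. Field names are illustrative only.
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    proto: str     # e.g. "tcp" or "udp"
    port: int      # destination port
    source: str    # allowed source range, in CIDR notation

def allowed(rules, proto, port, src_ip):
    """True if any rule in the group admits the incoming connection."""
    ip = ipaddress.ip_address(src_ip)
    return any(r.proto == proto and r.port == port and
               ip in ipaddress.ip_network(r.source)
               for r in rules)

# One group shared by all of a tenant's web instances:
web_group = [
    Rule("tcp", 443, "0.0.0.0/0"),    # HTTPS from anywhere
    Rule("tcp", 22, "10.0.0.0/24"),   # SSH only from the tenant subnet
]

assert allowed(web_group, "tcp", 443, "198.51.100.7")      # public HTTPS: ok
assert not allowed(web_group, "tcp", 22, "198.51.100.7")   # external SSH: blocked
assert allowed(web_group, "tcp", 22, "10.0.0.55")          # internal SSH: ok
```

The limitation the text points out is orthogonal to the mechanism itself: such rule sets live inside a single provider, and there is no standard way to enforce the same group consistently across providers or on resource-constrained devices.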

Figure 3. Common security models are based on the concept of "security perimeter," which may include remote resources and sites.

Novel computing paradigms bring more security concerns than legacy ICT installations, but things are even worse when the broadest range of digital services is considered. Indeed, the concept of a security perimeter is no longer applicable when external resources belong to other entities (e.g., authentication services, public/private IoT,

© GUARD 2019 Page 19 of 128

context brokers, etc.). It should also be considered that, for various reasons (cost, usability, performance), IoT devices commonly have a weak security posture. The creation of value chains across multiple infrastructures blurs the boundary between public zones and private domains, hence making it hard to apply the security perimeter model in a trustworthy and effective way. In addition, relying on human ability for hardening, verification of security properties, attack detection, and threat identification is no longer practical, and it is clearly unacceptable when critical infrastructures and large chains are involved [5]. Since valuable ICT assets can no longer be easily enclosed within a trusted physical sandbox, there is an increasing need for a new generation of pervasive and capillary cyber-security paradigms over distributed, multi-domain, and geographically-scattered systems.

2.3 Evolving threat landscape

While trying to circumvent the growing effectiveness of cyber-defence technologies, the latest generation of attacks has evolved along multiple directions:

• they use multiple vectors and stealth techniques to keep anomalies related to each single attack below the typical detection thresholds (e.g., in case of volumetric attacks) [6];
• they are highly customized and change over time, so as to elude signature-based detection strategies [7];
• they mostly leverage stateful protocols and layer-7 applications, so as to exploit software vulnerabilities and misconfigurations due to infrequently updated systems, lack of technical skills, and very short software release cycles [8];
• they exploit the weak security posture and proliferation of resource-constrained IoT devices, which lie outside the enterprise's perimeter and cannot be effectively protected for performance and cost reasons [9];
• encrypted channels are often used between malware and their remote-control servers, to bypass inspection by intermediate security appliances [9];
• they make large use of automation and artificial intelligence to create attack tools for the black market [9];
• they target APIs exposed by web services and serverless architectures, based on malicious injections, insecure object references, access violations, and abuse;
• they often target new domains where the application of ICT has not yet been consolidated into best practices: recent reports highlight a 600% increase in attacks against IoT devices and a 29% increase in vulnerabilities of industrial control systems (ICS) [10].

The latest generation of attacks is becoming increasingly sophisticated, targeted, stealthy, and persistent along both the spatial and temporal dimensions, i.e., Advanced Persistent Threats (APT). Spatial dimension: a typical APT begins by scanning the target system to make an inventory of public resources and identify possible attack vectors (web sites, emails, DNS, etc.). It usually targets the weakest link in the chain and tries to propagate afterwards to more valuable resources.
A number of complementary actions are then initiated to gain access, including social engineering and phishing, internal port scanning, malware injection, and so on. All these actions target different subsystems (e.g., public web servers, DNS, users, internal networks, hosts); when taken alone, they might be confused with legitimate operations by standalone security appliances. Temporal dimension: the execution of the different stages may take days, weeks, or even months. For example, fraudulent emails might be sent for days or weeks before a rash or careless user opens the embedded links. Again, an installed malware might be left snooping for days before starting to collect data. It is therefore challenging for any standalone security


appliance to store long historical traces and to correlate events that may have happened weeks or months earlier.

Figure 4. Comparison between typical times to compromise a system and to detect an attack. The bars report the relative frequency, while the lines represent cumulative distribution functions. The shaded areas indicate the probability that a system is compromised in less than a day and the probability that the time to detect an attack is longer than a week [11].

The different stages of an APT attack may easily be mistaken for independent events, and this makes APTs challenging to detect, defend against, and mitigate [12]. As a matter of fact, estimations from the cyber-security industry indicate that on average the time to compromise a system is less than a day in 84% of attacks, while the time to discover on-going attacks is longer than a week in 78% of cases (see the shaded areas in Figure 4). For this reason, legacy detection techniques (such as blacklisting and malware signature recognition) are likely to fail in these contexts, and novel approaches for detection and mitigation are required [13],[14]. Currently, most threats come from cyber-physical systems and virtualization environments, where cyber-security paradigms have not been able to keep pace with the ground-breaking evolution of computing and networking models. The increasing dynamicity and heterogeneity of ICT installations due to bring-your-own-device policies, remote connectivity, externalization, and integration with cyber-physical systems are progressively eroding any security boundary, hence blurring the previous distinction between internal and external threats. Among externalization paradigms, cloud computing raises most concerns due to the hypervisor layer and multi-tenancy [1]. As a matter of fact, the attack surface is increased by the larger number of components: guest environments (virtual machines), host operating systems (servers), hypervisors, management interfaces, shared storage and networks [15]. Tenant isolation should provide independent and secure execution sandboxes, leveraging technologies such as hypervisors, network virtualization, and virtual storage. However, the shared infrastructure widens the class of local adversaries to include other tenants and the infrastructure provider, raising new attack models (grey boxes, involving tenants and cloud providers) in addition to the mainstream white (employees) and black boxes (external attackers).



The proliferation of smart (yet not properly hardened) devices connected to the Internet has introduced an attractive and easy attack vector, which has hit the headlines several times for very large DDoS attacks. Recent botnets like Mirai, Brickerbot, and Hajime have not only demonstrated the vulnerability of IoT installations, but also the possibility to exploit compromised devices to amplify the size of the attack and carry out large DDoS attacks at the terabit scale. Though large deviations from historical traffic patterns are easy to detect, a huge number of distinct rivulets of traffic is far more complex to detect, correlate, and consequently mitigate. The predominant interspersion of isolated valuable resources with unsafe computing and communication infrastructures makes the application of the security perimeter at each site ineffective, because of the overhead of running complex agents in end devices, especially in the case of resource-constrained "things".

2.4 Challenges and emerging trends

To effectively address the new generation of cyber-threats that affect digital infrastructures and services, a more integrated and coordinated approach is necessary, which overcomes the intrinsic limitations of existing systems based on discrete security appliances (e.g., firewalls, IPS/IDS, deep packet inspection, SIEM). Specific challenges to be tackled in this respect include:

• slow and ineffective detection of attacks, due to partial and incoherent security information and non-interoperable algorithms targeting specific cyber-attacks only (e.g., network or server DDoS, intrusion detection/prevention, malware identification);
• difficulty in identifying new threats and vulnerabilities, because of the limited information and data available, ineffective correlation of data from multiple sources, and lack of big data and machine-learning techniques;
• outdated and inefficient architectures for sharing security context, which should combine the need for fine-grained knowledge with efficient resource usage. In fact, effective detection of attacks and identification of new threats may need detailed logs, events, and measurements for processing and correlation, but collecting these would result in excessive overhead for both network and storage systems;
• technology and business lock-in through the adoption of vertical, tightly integrated network/cyber-security solutions, often from a single vendor;
• intrinsic rigidity, due to the difficulty of changing architecture and system configuration: network partitioning, deployment of hardware or software security appliances (including the necessary agents), routing and switching policies in case of traffic diversion towards scrubbing centres;
• slowness and inertia in sharing the knowledge of new threats and attacks, due to the heavy involvement of humans in the loop, the need for manual operations, and the lack of common representation formats;
• ineffective interaction with users: though security dashboards are continuously evolving and enhancing user-friendliness, they are usually reserved to technical staff and do not integrate seamlessly with other administrative and management processes (e.g., risk assessment, legal and normative requirements).

Secure composition and operation of dynamic business chains require some steps forward with respect to existing leading-edge technologies. Current challenges and emerging trends all suggest that such paradigms should evolve from discrete appliances into integrated, pervasive and capillary systems, which are able to correlate events in both the time and space dimensions, and could provide timely operational information to feed novel disruptive approaches capable of estimating risk in real time. Next-generation frameworks for detection and reaction are expected to combine fine-grained and precise information with efficient processing, elasticity with robustness, and autonomy with interactivity [1]. In addition, digital business chains raise unique trust


concerns, especially related to partially unknown and fast-changing topologies. Specific requirements to address these challenges include: the shift from centralized to distributed architectures, the programmability of the infrastructure, the need for robustness and data protection, and dynamic adaptation to changing environments and conditions through orchestration, automation, correlation and sharing.

Distributed architectures. With the increasing integration of cloud, edge, and fog resources in critical business and industrial processes, a clear separation of internal and external resources is no longer possible; hence even elastic perimeters built with VPNs become ineffective. The need to evolve towards distributed and capillary architectures has resulted in commercial micro-firewalls and distributed firewalls, which build on the concept of micro-segmentation [16]. Distributed firewalls have recently been proposed for virtualization environments and cloud computing: they deploy packet inspection and filtering rules in hypervisors while keeping centralized control, witnessing the importance of pervasive and capillary control. They enable very fine-grained control over security policies, beyond mere IP-based rules [17]. This approach is currently used for enforcing filtering rules, but does not have the flexibility to provide deep inspection capabilities tailored to the specific needs of detecting threats and on-going attacks. More flexibility is required, leveraging infrastructure programmability to perform inspection and monitoring actions well beyond the static packet matching available in current commercial products. A shift to more distributed architectures is therefore on-going to tackle the elastic nature of the enterprise's perimeter, but coordination is still mostly missing.
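As a purely illustrative sketch (not part of any GUARD specification or commercial product), micro-segmentation can be thought of as per-workload allow-lists evaluated at the hypervisor, expressed on workload labels rather than on IP ranges at a perimeter firewall:

```python
# Illustrative sketch of micro-segmentation: policies are expressed on
# workload labels (role, tier) rather than on network locations, and
# evaluated for every flow at the hypervisor/vSwitch level.
# All names and addresses below are hypothetical.

# Each workload is identified by labels, not by where it sits in the network.
WORKLOADS = {
    "10.0.1.5": {"role": "web", "tier": "frontend"},
    "10.0.2.7": {"role": "app", "tier": "backend"},
    "10.0.3.9": {"role": "db",  "tier": "data"},
}

# Allow rules: (source role, destination role, destination port).
# Anything not explicitly allowed is dropped (default deny).
ALLOW = {
    ("web", "app", 8080),
    ("app", "db", 5432),
}

def permit(src_ip: str, dst_ip: str, dst_port: int) -> bool:
    """Return True if the flow matches an allow rule on workload labels."""
    src = WORKLOADS.get(src_ip)
    dst = WORKLOADS.get(dst_ip)
    if src is None or dst is None:
        return False  # unknown workloads are denied
    return (src["role"], dst["role"], dst_port) in ALLOW

# The web tier may reach the app tier, but never the database directly:
assert permit("10.0.1.5", "10.0.2.7", 8080)
assert not permit("10.0.1.5", "10.0.3.9", 5432)
```

Because rules follow the workload labels rather than the topology, the same policy remains valid when instances migrate across sites, which is precisely what a static perimeter cannot provide.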

Improving programmability. Increased security has led to the elaboration of more complex attacks, which in turn are more difficult to detect. In addition, the ever-growing number and complexity of protocols and applications make their traffic and behaviour increasingly difficult to understand and model, which complicates the detection of anomalies [18]. Accordingly, inspection is evolving from simple memory-less string matching to stateful rules (such as regular expressions). As an immediate consequence, more processing power (hence CPU cycles) is required to check packets and instructions against more elaborate rules. Therefore, in-line detection is likely to overwhelm distributed software-based implementations of security appliances, especially in case of large volumetric attacks. The size and distributed nature of the system to protect suggest the need to combine local processing with centralized analysis, so as to achieve the best balance between efficiency and effectiveness. In recent years, a great effort has been undertaken to make infrastructures programmable, in both the networking and computing domains. Virtualization technologies, hypervisors, and software-defined networks have laid the foundation for the transition from mere configurability (i.e., choosing among pre-defined behaviours) to real programmability (i.e., defining the behaviour). Improved programmability at the infrastructure level brings more dynamism to running detection and monitoring tasks. This means that lightweight processing could be used for normal operation, while reverting to deeper inspection at the early stage of any suspicious anomaly (or upon signalling from some knowledge-sharing framework), with clear benefits for the overall processing load. Further, a distributed and capillary architecture, with inspection capability in each network device and hypervisor, automatically addresses scalability, since processing resources grow with the system size.
This increases efficiency and boosts performance, especially when attacks are complex to detect.
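The two-stage inspection model described above can be sketched as follows; the escalation threshold and the regular-expression rule are purely illustrative assumptions:

```python
import re
from collections import Counter

# Illustrative sketch of programmable, two-stage inspection: cheap
# per-source counters run always; expensive stateful rules (regular
# expressions) run only once an anomaly is suspected. The threshold
# and the signature below are arbitrary assumptions.

PKTS_PER_SRC_THRESHOLD = 100          # escalate above this packet count
SQLI_SIGNATURE = re.compile(rb"union\s+select", re.IGNORECASE)

counters = Counter()                   # lightweight stage: per-source counts
deep_inspection = set()                # sources currently deeply inspected

def inspect(src: str, payload: bytes) -> str:
    """Return 'ok', 'escalated', or 'alert' for one packet."""
    counters[src] += 1
    if src not in deep_inspection and counters[src] > PKTS_PER_SRC_THRESHOLD:
        deep_inspection.add(src)       # switch this source to deep inspection
        return "escalated"
    if src in deep_inspection and SQLI_SIGNATURE.search(payload):
        return "alert"                 # stateful rule matched
    return "ok"
```

In a real deployment the two stages would be offloaded to programmable data-plane elements (e.g., eBPF programs in the kernel), keeping the fast path cheap and reserving CPU-hungry matching for suspicious flows only.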

Reliability and data protection. Trustworthiness of the processed information, events, and knowledge is of paramount importance, since an inappropriate response may be more damaging than the original attack. Any loss of integrity in the infrastructural components (e.g., hardware tampering) or control protocols may result in inaccurate, inappropriate, manipulated, or poisoned context information, which gives a forged situational awareness and eventually leads to ineffective, late, or even counterproductive reactions. Encryption and


integrity services are almost always available in control channels, as well as user authentication and access control, but no certification of the origin and time of information is usually available. However, the trustworthiness and integrity of the collected information are also fundamental requirements for maintaining historical evidence with legal validity, to be used for example in forensic investigations. At the same time, privacy issues must be tackled so as not to disclose any personal or sensitive information without explicit consent, even to technical and management staff, except in case of criminal investigation. Specific challenges include collecting and conserving events and traffic patterns in a confidential way, anonymising data before analysis, and making them accessible in clear form only in case of a legally authorized investigation.

Orchestration and management. The progressive introduction of more programmable devices brings more flexibility and dynamicity in processing, but also requires a control plane that exposes device capabilities, and an orchestration plane that automates the process of building and deploying configuration/code on the fly. Orchestration is largely used in service-oriented and microservice-based architectures, web services, cloud and NFV management. It automates life-cycle operations, such as discovery, deployment, scaling, and resilience, according to the specific intentions and needs of users. Policy-based security management is an administrative approach to simplifying access control and security management of networks, services, etc. Policies are sets of operating rules, usually in the form 'on event, if condition, then action,' that reflect the resource owner's intention of adequately protecting valuable resources. Policies are typically specified separately from the system implementation, to allow them to be easily modified without altering the system. In the transition from centralized to distributed architectures for cyber-security systems, it is indisputable that orchestration will play a crucial role in shaping the behaviour of the capillary programmable infrastructure, i.e., in delegating filtering and pre-processing tasks to programmable resources, including network switches, hypervisors, and smart things, in tight coordination with the deployment and life-time management of software. Through orchestration, the granularity, detail, and periodicity of collected information can be tuned dynamically according to specific policies (e.g., increasing granularity in a specific area where anomalies have been detected).
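A minimal sketch of the 'on event, if condition, then action' policy form is given below; the rule names and event fields are illustrative assumptions, not taken from any GUARD specification:

```python
# Minimal event-condition-action (ECA) policy engine sketch.
# Each policy fires its action when the event type matches and the
# condition holds. All field names below are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Policy:
    event_type: str                                  # "on event"
    condition: Callable[[dict], bool]                # "if condition"
    action: Callable[[dict], str]                    # "then action"

@dataclass
class PolicyEngine:
    policies: list = field(default_factory=list)

    def handle(self, event: dict) -> list:
        """Evaluate every matching policy and return the actions taken."""
        return [
            p.action(event)
            for p in self.policies
            if p.event_type == event["type"] and p.condition(event)
        ]

# Example: raise monitoring granularity where an anomaly was detected.
engine = PolicyEngine([
    Policy(
        event_type="anomaly",
        condition=lambda e: e["severity"] >= 7,
        action=lambda e: f"increase-log-granularity:{e['zone']}",
    ),
])

assert engine.handle({"type": "anomaly", "severity": 9, "zone": "edge-3"}) \
    == ["increase-log-granularity:edge-3"]
assert engine.handle({"type": "anomaly", "severity": 2, "zone": "edge-3"}) == []
```

Keeping policies as data separate from the engine mirrors the point made above: the owner's intentions can be changed without altering the system implementation.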

Data correlation. The dynamicity, sophistication, and speed of attacks require autonomous responses to provide timely and effective countermeasures. Sharing and correlating events and anomalies within the same domain and among different domains is also essential in order to anticipate, even proactively, any emerging or upcoming threat already (partially) detected somewhere. In-depth analytics are required to detect and identify threats from elementary and apparently uncorrelated events. Some tools are already available for this purpose (e.g., the ECOSSIAN platform and the Caesair model [19]). Pervasive and fine-grained monitoring of ICT installations will produce an impressive amount of data, even if programmability is used to tune the depth of inspection according to actual needs. The challenge is to add predictive and proactive capabilities to existing security tools and systems, in order to prevent attacks by analysing the environment, rather than merely reacting in case of compromise. Existing algorithms already make use of flow-level information for network volume anomaly detection, though this represents only the crumbs of what may be available tomorrow. New algorithms for vulnerability analysis and threat detection may be based on the ideas of Attack Graphs [20], Attack Surface analysis [21], Kill Chain definitions [22] and Attack Tree models [23], with the support of deep learning techniques, Petri nets [24], and game theory models [25]. Correlation should also include automatic selection of the algorithms for threat analysis based on the threat's potential negative impact, both environment-dependent and environment-independent.
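As a toy illustration of cross-domain correlation (the time window, the domain threshold, and the event schema are all assumptions), low-severity events from different domains can be grouped into a single suspected incident when they fall within a common time window:

```python
# Toy sketch of cross-domain event correlation: events that would be
# ignored individually are grouped by time proximity; a group spanning
# several distinct domains is promoted to a suspected multi-stage attack.
# The 300-second window and the 3-domain threshold are arbitrary choices.

WINDOW_S = 300        # events within 5 minutes are considered related
MIN_DOMAINS = 3       # distinct domains needed to raise an incident

def correlate(events: list) -> list:
    """Group time-ordered events into windows; return suspicious groups."""
    events = sorted(events, key=lambda e: e["ts"])
    groups, current = [], []
    for ev in events:
        if current and ev["ts"] - current[-1]["ts"] > WINDOW_S:
            groups.append(current)
            current = []
        current.append(ev)
    if current:
        groups.append(current)
    # An incident is a group touching many distinct domains.
    return [g for g in groups if len({e["domain"] for e in g}) >= MIN_DOMAINS]

incidents = correlate([
    {"ts": 100,  "domain": "dns",  "msg": "unusual zone transfer"},
    {"ts": 160,  "domain": "mail", "msg": "phishing link clicked"},
    {"ts": 250,  "domain": "host", "msg": "new outbound connection"},
    {"ts": 9000, "domain": "dns",  "msg": "single failed lookup"},
])
assert len(incidents) == 1 and len(incidents[0]) == 3
```

Real correlation engines use far richer features than timestamps, but the sketch conveys the key point: the signal emerges only when sources from multiple domains are examined together.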

Automation, reaction and sharing. Despite the availability of a broad range of security appliances and massive digitalization in all economic and business sectors, many security processes still rely on the presence of humans


in the loop. Highly-skilled personnel are required to select the right set of products, deploy them in the right place, properly configure them and the infrastructure, correlate warnings and indications, and share cyber-threats and security incidents with national CERTs/CSIRTs; this last process, in particular, is still largely based on paperwork and emails. National CSIRTs provide support at the procedural level and foster cooperation and the sharing of best practice, but do not cover technical issues. This will certainly delay the knowledge of new threats and attacks, as well as the elaboration of remediation actions and countermeasures for every different context. While intuition and interpretation beyond rigid schemes are unique human capabilities, they are not equally developed in every individual. This is why valuable security staff are a major asset for every organization, and more automation should be available to streamline their work and unburden them from manual and repetitive tasks [26]. In this respect, more integration of cyber-security management with all business processes would be required. Though security dashboards are continuously evolving and enhancing user-friendliness, they are usually reserved to technical staff and do not integrate seamlessly with other administrative and management processes (e.g., risk assessment, legal and normative requirements). The creation of tailored user interfaces and notifications will increase the degree of awareness, preparedness and control at all organizational levels (technical staff, management, legal staff), especially for non-technical aspects (e.g., privacy and trust).

2.5 Two-fold cyber-security dimension for digital business chains

The two main dimensions in a business chain (i.e., services and data) are directly reflected in two corresponding security aspects: service integrity and data sovereignty. As a matter of fact, when connecting to an external service (be it a public sensor, an industrial process, financial information, or a customer database), the question is not only whether we trust its provider, but also how to detect attacks on that component and avoid their propagation along the chain. Therefore, on the one hand, there is the need to know who will process private data and sensitive information, how, and where; on the other hand, there is the need to timely detect any attack or threat that may compromise the integrity, confidentiality, or availability of data and processes. To give a very simple yet illustrative example, let us consider a conceptual electricity supply chain made of a power plant and a distribution operator. To operate the system correctly, the power injected by the power plant into the grid must balance the current load. We can imagine the indication of the current load as a digital service, which feeds the power plant control system. If the former gets compromised, the latter may be driven away from the equilibrium point, with catastrophic consequences for the safety of the electrical grid and the attached users. A new breed of cyber-security paradigms is therefore needed to protect digital services along these different dimensions. The main challenges in this context can be briefly summarized as follows:

• to increase the information base for analysis and detection, while preserving privacy;
• to improve the detection capability by data correlation between domains and sources;
• to verify reliability and dependability by formal methods that take into account configuration and trust properties of the whole chain;
• to increase awareness by better propagation of knowledge to the humans in the loop.
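The electricity-chain example given in this section can be sketched as a trivial control loop; the numbers and the simple proportional controller are illustrative assumptions, not a model of a real grid:

```python
# Trivial sketch of the electricity-chain example: a proportional
# controller tracks the reported load. If the load-reporting service is
# compromised and feeds forged values, generation drifts away from the
# true demand. Gains and values are arbitrary illustrative numbers.

def run(true_load: float, reported_load_seq: list,
        gain: float = 0.5) -> float:
    """Return the final generation/demand imbalance after the sequence."""
    generation = true_load                      # start at equilibrium
    for reported in reported_load_seq:
        generation += gain * (reported - generation)
    return generation - true_load               # 0.0 means balanced

# Honest reports keep the system at equilibrium...
assert abs(run(100.0, [100.0] * 10)) < 1e-6

# ...while forged reports (attacker claims the load halved) drive the
# plant to shed generation far below real demand.
imbalance = run(100.0, [50.0] * 10)
assert imbalance < -45.0                        # large under-generation
```

The point of the sketch is that the control system itself is never touched: compromising a single upstream data service in the chain is enough to destabilize the whole process.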

In this respect, the main targets for the architecture proposed in this work are: i) situational awareness and its correct and tailored representation to humans; ii) the ability to detect threats and to avoid their propagation in complex and partially unknown business chains; iii) confidence in the trustworthiness and reliability of mutual relationships with other parties involved in the business chains; iv) the ability to share, collect and correlate


security-related data from heterogeneous multi-domain and multi-tenancy systems, without disclosing any confidential information; v) the application of analytics based on machine learning and other artificial intelligence paradigms in distributed systems.

2.6 Vision: beyond infrastructure-centric paradigms

From a purely architectural perspective, most cyber-security appliances have traditionally been designed to protect the physical infrastructure, not the services implemented on top of it. This is manifest when considering recent solutions for the cloud, where distributed firewalls, antivirus, intrusion detection systems, and identity/privacy management are often implemented in the hypervisor layer to provide security services to all tenants, as schematically shown in Figure 5.a. The progressive decoupling of software from the underlying hardware brought by the adoption of virtualization and cloud paradigms has boosted a transition from infrastructure-centric to service-centric architectures (see Figure 5.b). When the Infrastructure-as-a-Service paradigm is used, virtual instances of security appliances are "plugged" into service graphs, leveraging the close correspondence with physical infrastructures that is present in this model. Each tenant retains full control of and responsibility for security management of its own graphs, without the need to rely on and trust external services. However, application to other computing models is not straightforward. Typical examples include serverless architectures1, where no custom appliances can be installed, and cyber-physical systems, due to resource-constrained devices.

Figure 5. The on-going evolution from infrastructure-centric to service-centric cyber-security architectures.

Chasing more efficiency, the next evolutionary step is a service-centric architecture that removes the need for legacy security appliances, embeds security capabilities into each software element, and orchestrates them through a common security manager that (logically) centralizes the detection processes (see Figure 5.c). Just as an operating system exposes computational, data and communication resources to applications via Application Programming Interfaces (APIs), so future digital services should include APIs that expose security properties, including configurations, monitoring, inspection, and enforcement rules. The vision is a multi-layer framework (see Figure 6) that distils trust, privacy, and situational awareness from a large set of data (application logs, security events, packet inspection, network statistics, presence of private and sensitive data, security configurations, etc.), providing tailored messages to different actors (technical, legal and management staff, final users).

1 Serverless architectures are based on cloud functions that can be accessed and chained by REST APIs. They are currently provided by the biggest cloud players: AWS Lambda, Microsoft Azure Functions, Google Cloud Functions.



Figure 6. A forward-looking perspective for the evolution of cyber-security architectures for digital services.

The GUARD concept revolves around three main axes: visibility, detection, and traceability. First, visibility of the digital services involved in the chain, as well as of the chain topology itself; visibility includes the service properties (type, vendor, location, security features) and its execution (events, logs, performance). Second, detection concerns known and zero-day attacks, leveraging programmability to change the scope and depth of inspection, so as to effectively investigate suspicious conditions and anomalies. Third, traceability means the capability to follow the propagation of data within the chain, understanding its usage and potential loss of confidentiality. This should also include technical measures to limit such propagation in a supervised or automatic way. The final objective is to improve awareness at the different layers of the business processes: end users, IT staff, management, legal staff, etc. The overall GUARD concept can be briefly summarized as: "improved awareness for improved response and mitigation."

Figure 7. The GUARD concept revolves around the idea of improving awareness for improving response.



2.7 Approach

The implementation of the overall vision depicted in Figure 6 is based on the identification of three main logical blocks, as indicated in the bottom part of that figure.

Local security agents and APIs. Local security agents encompass a heterogeneous set of technologies for monitoring and enforcement. This includes logging and event reporting capabilities developed by programmers into their software, logging facilities built into the kernel (e.g., the syslog daemon [6]), as well as monitoring and enforcement frameworks built into the kernel and system libraries that inspect network traffic and system calls (e.g., the eBPF/IOVisor framework [27], Intel DPDK [28], FD.io [29]). Enforcement will also cover data protection, by ensuring data are accessed, shared, and exported according to the owner's policies in terms of data minimization, purpose limitation, integrity, and confidentiality. As already remarked, one unique selling point will be programmability. This means that monitoring operations, the types and frequency of event reporting, and the level of logging are selectively and locally adjusted to retrieve the exact amount of knowledge, without overwhelming the whole system with unnecessary information. The purpose is to get more details for critical or vulnerable components when anomalies are detected that may indicate an attack, or when a warning is issued by cyber-security teams about newly discovered threats and vulnerabilities. The usage of standard external interfaces is the common denominator for any orchestration technology. They give some form of abstraction (e.g., APIs) which wraps each software component and makes it "handleable" by automation tools. Such an abstraction may include descriptive elements (name, vendor, description), capabilities (provided functions and/or interfaces, required functions and/or interfaces), deployment constraints (CPU, RAM, disk, bandwidth), and so on.

Figure 8: Wrapping software programs with suitable abstractions and models is necessary to automate deployment and management.
GUARD will extend FIWARE APIs to describe the security capabilities a component provides, e.g., logging, event reporting, filtering, deep packet inspection, system call interception, data tracking, and packet filtering. In addition, existing management APIs may be used to perform life-cycle management actions on compromised components (e.g., removing or replacing a non-trustable component). Extensions will be based on existing languages and semantics such as Netconf/Yang [30], P4 [31], and RestConf [32].
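A possible shape for such a capability descriptor is sketched below; the field names are illustrative assumptions and are not part of the FIWARE or GUARD specifications:

```python
# Illustrative capability descriptor for a local security agent, loosely
# following the abstraction described above (descriptive elements,
# security capabilities, deployment constraints). Field names are
# hypothetical, not taken from FIWARE or GUARD specifications.

import json
from dataclasses import dataclass, asdict, field

@dataclass
class SecurityCapabilityDescriptor:
    name: str
    vendor: str
    description: str
    capabilities: list = field(default_factory=list)   # e.g. "logging"
    constraints: dict = field(default_factory=dict)    # e.g. {"cpu": 1}

    def supports(self, capability: str) -> bool:
        return capability in self.capabilities

descriptor = SecurityCapabilityDescriptor(
    name="edge-probe",
    vendor="example-vendor",
    description="Traffic probe running at the network edge",
    capabilities=["logging", "event-reporting", "deep-packet-inspection"],
    constraints={"cpu": 1, "ram_mb": 256},
)

# An automation tool could query and serialize the descriptor:
assert descriptor.supports("deep-packet-inspection")
assert not descriptor.supports("system-call-interception")
print(json.dumps(asdict(descriptor), indent=2))
```

Publishing such a descriptor through a standard API is what makes a component "handleable" by orchestration tools without knowledge of its internals.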

Security manager. The security manager is the smart engine that collects, stores, and processes security data. By selectively querying all software components involved in the business chain, this layer builds the logical topology of the overall service, including the security properties and capabilities of each node. The abstraction provides both real-time and historical information, hence allowing both on-line and off-line analyses. One of the main advantages of this collection model is the availability of non-redundant data from different subsystems (application, network, memory, I/O), instead of relying on a single source of information (network traffic), as is common practice nowadays. The same database is shared by a collection of processing algorithms, which include typical functions currently available as separate appliances: Intrusion Prevention/Detection Systems (IPS/IDS), Network Access Control (NAC), antivirus, Application Level Gateways (ALG), and more. Since centralization of processing may easily result in excessive network overhead for collecting data and measurements, it is important to shape the inspection, monitoring, and collection processes to the actual need, by re-configuring the reporting behaviour (logs, events, network traffic, system calls, etc.); programming also includes the

© GUARD 2019 Page 28 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0 capability to offload lightweight aggregation and processing tasks to each virtual environment, hence reducing bandwidth requirements and latency. Beyond the mere (re-)implementation of legacy appliances for performance and efficiency matters, the specific structure of the GUARD framework paves the way for a new generation of detection algorithms, arguably by combining rules-based and machine learning methodologies with big data techniques; the purpose is to locate vulnerabilities in the graph and its components, to identify possible threats, and to timely detect on-going attacks. Beyond this, trust and policy models must be in place to assess the reliability of the different actors involved in the chain, by considering certifications and sanity checks, and to protect data throughout their lifecycle spanning capture, transmission, storage (potentially redundant and remote), and destruction [33].

User interface. Though reporting on-going attacks to cybersecurity staff remains among the top priorities for the user interface, the delivery of secure services encompasses a broader sense of awareness. As final users (governments, organizations, and individuals) perceive the importance of data and the associated risks, they need assurance that adequate protection mechanisms are in place, through proof, auditability, certification and traceability. The technical gap should be filled by a proper representation of security and privacy issues, overcoming users' lack of awareness and education. In a similar way, tailored informative content should be delivered at the different levels of a company's structure. Assessing compliance with binding regulations is a typical task for the legal staff, who might not have the technical background to fully understand the implications of evolving services. Likewise, risk assessment at the management layer also requires automatically feeding existing tools, reducing the reliance on labour-intensive and potentially error-prone analysis by experts. Finally, it is of paramount importance to extend the scope of situational awareness beyond a single organisation, without either disclosing internal secrets and classified information or breaking privacy.

While recognizing the objective difficulty in applying legacy security paradigms to multi-domain business chains deployed over open and public execution environments, GUARD leverages standardized APIs and interfaces to build an open cyber-security framework. Just like an operating system exposes computational, data and communication resources to applications via an API, so future digital services will include APIs that expose security properties, including configurations, monitoring, inspection, and enforcement rules.

Figure 9. Main APIs envisioned in the realization of the GUARD concept.


Figure 9 shows a more concrete view of the overall GUARD vision. Local security agents for monitoring and enforcement are integrated into the same services that build the business chain (pictorially represented by the graph on the left side). They create a rich, capillary, and pervasive fabric for the security context. The security manager is here broken into three main logical components:

• the Security Context Broker, which collects and stores the security context from the local agents, and provides a high-level abstraction for accessing data in a graph-oriented way (so as to facilitate the identification of existing dependencies and correlations);
• the Detection and Analysis Toolkit, which is a collection of algorithms and procedures to analyse the data, recognize known attacks, identify anomalies and new attack patterns, track the propagation of data, and assess vulnerabilities and risk levels;
• the Security Controller, which implements the smart logic to orchestrate the overall system according to users' intents (in the form of policies); it is responsible for selecting the detection and analysis algorithms to run, configuring and programming the local agents, and triggering notifications to system users.

The processing flow of the security context is represented by the green arrows, from the service topology (left side), through data collection and abstraction (Context Broker), data analysis and attack detection, up to the smart control logic (Security Controller) and end users (right side), in general belonging to multiple domains. The red arrows instead show the control flow, from the Security Controller to the local agents, through the mediation of an abstraction layer implemented by the Context Broker. Three interfaces are envisioned to interconnect the logical blocks. From the left side, the first interface gives access to i) programmable inspection and monitoring capabilities in the infrastructure (smart things, execution containers, virtual functions, cloud applications); ii) security properties and configurations for data protection (encrypted channels, integrity and digital signing mechanisms, certificates, vendors/owners, sharing policies). It will be developed as an extension to existing APIs in relevant domains (FIWARE) and will be used to create a shared security context; it may be considered as part of the management interface in web services or service-oriented architectures. The second interface will abstract the overall service topology and will allow high-level queries for data retrieval and fusion by detection algorithms.
It will give visibility of all infrastructure components involved in the same service and will support the retrieval of real-time and historical security context (logs, packet inspection, position of user data, trustworthiness, etc.) for all of them. It will also allow shaping the granularity and depth of data based on the application's needs. The last interface will report threats, attacks, and unreliable or non-compliant configurations; it will also enable interaction with the underlying environment to feed the detection algorithms with knowledge of new threats and to investigate anomalies. The positioning of the three APIs also separates four kinds of exploitable assets that are expected from the Project.
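A minimal sketch of the second interface might look as follows (the broker class, method names, and record fields are hypothetical, for illustration only): the broker keeps a per-component security context and lets a detection algorithm query it along the service topology, shaping the amount of data returned per component.

```python
class ContextBroker:
    """Hypothetical Security Context Broker: stores per-component context
    and exposes graph-oriented queries over the service topology."""

    def __init__(self):
        self.topology = {}   # component -> set of downstream components
        self.context = {}    # component -> list of context records (dicts)

    def add_link(self, src, dst):
        self.topology.setdefault(src, set()).add(dst)
        self.topology.setdefault(dst, set())

    def report(self, component, record):
        """Called by a local security agent to push a context record."""
        self.context.setdefault(component, []).append(record)

    def query(self, start, kind, max_records=None):
        """Walk the service graph from 'start' and collect records of the
        given kind; 'max_records' shapes the depth of data per component."""
        seen, stack, out = set(), [start], {}
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            records = [r for r in self.context.get(node, []) if r["kind"] == kind]
            out[node] = records[-max_records:] if max_records else records
            stack.extend(self.topology.get(node, ()))
        return out

broker = ContextBroker()
broker.add_link("web-app", "db")
broker.report("web-app", {"kind": "log", "msg": "login failed"})
broker.report("db", {"kind": "log", "msg": "slow query"})
print(broker.query("web-app", "log", max_records=1))
```

The graph walk is what gives detection algorithms visibility of all components involved in the same service, rather than a single data source.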


2.8 Key values

GUARD addresses the growing need to evolve from discrete cyber-security appliances to integrated, pervasive and capillary systems, which decouple distributed context monitoring from (logically) centralized detection logic (see Figure 10). The key enabler for this evolution will be the architectural separation between analysis and data sources, mediated by proper abstraction; this paradigm will result in an open, modular, pluggable, extendable, and scalable security framework.

Figure 10. GUARD addresses the complexity and multi-vector nature of recent cyber-security threats by moving from current narrow-scope silos to a more integrated, multi-vendor, layered and open framework.

Key values delivered by the implementation of the GUARD concept will include scope, efficiency, detection, and awareness.

2.8.1 Scope

Today's approaches of checking systems at design time are inapplicable to dynamically composed services given that: a) a service behaves as a black box, thus a composite service's building blocks are not necessarily open to inspection; b) at runtime, service components may be replaced by new ones; and c) the interaction among different services leads to non-deterministic outcomes. Relying on security mechanisms at the infrastructure layer (e.g., distributed firewalls, antivirus, intrusion detection systems, and identity/privacy management, often implemented in the hypervisor layer) is no longer viable because of the limited inspection allowed by privacy rules, and the difficulty in implementing federation and collaboration mechanisms between providers at both the technical and procedural level.

GUARD moves the scope from the infrastructure level to the service level, which is not yet available today, by developing an integrated and holistic approach to tackle security and privacy risks in complex, extended, and dynamically evolving systems. This means security and privacy engineering become an integral part of the software engineering flow, because specific security capabilities must be integrated into each software bundle (monitoring, inspection, filtering, fusion, lightweight processing, security properties). This is not enough, because all these capabilities must be properly orchestrated for continuous security assessment of complex, dynamic, and evolving system topologies. The development and standardization of a common API goes in the direction of enabling plug-and-play insertion of every software component in the overall end-to-end security solution. Available discovery, composition, and orchestration functions will be used to track the service topology and detect any change. GUARD aims at overcoming the typical non-compositionality of security and privacy, thus adding a new dimension to the typical self-* properties (self-configuration, self-optimization, self-healing). The decomposition into two layers, one for monitoring and the other for detection, enables a cost-effective and scalable methodology to engineer security and privacy in new systems and services, since the same detection algorithms and reaction policies easily adapt to new components, without requiring a full re-design of the whole system. Part of this scalability largely relies on the possibility to evolve autonomous systems into programmable systems: autonomous systems work with pre-specified basic rules, which are used to configure, heal, optimize and protect automatically, thereby reducing human effort and involvement. Even though such automation implies some degree of adaptation to the evolving context, GUARD will improve on this concept by creating programmable systems, where the behaviour is not limited to a fixed set of actions/policies but can be fine-tuned in both inspection and detection capabilities. Beyond the mere configuration of which parameters have to be measured and against which thresholds, GUARD will also enable the definition of powerful programs for data fusion and aggregation, to be run locally inside specific software components. The large freedom in what, where, and how to collect will also enable unprecedented ways to investigate new threats, by shaping the reporting to what is really necessary.

2.8.2 Efficiency

The increase in the complexity of attacks makes modelling an attacker's profile increasingly difficult; as a result, rules have evolved from memoryless simple string matching to stateful automata (such as regular expressions), and more computing cycles are required. New threats appear with increasing speed; hence, methods of correlation and analysis must be developed to keep accuracy at an acceptable level and to provide meaningful and useful information. Existing tools for monitoring and inspection already rely on very efficient technologies, though the duplication of similar tasks undermines the efficiency of the overall system. In addition, there is a low level of programmability, which means a bad trade-off between the granularity of inspection and the actual level required by detection algorithms. In this respect, GUARD will boost efficient and effective collection and processing of security context from heterogeneous sources. Key aspects of the GUARD framework include the usage of safe and efficient monitoring and inspection hooks in the kernel, by leveraging the eBPF technology to create programs that can be dynamically injected into the kernel at run-time. eBPF is a generic in-kernel, event-based virtual CPU, which leverages an assembly-like syntax for very efficient and quick processing. While originally conceived to filter network packets only, it has now evolved to catch a broader set of kernel events; in general, any kernel event can potentially be intercepted (kprobes, uprobes, syscalls, tracepoints), making eBPF capable of analysing messages received at the socket layer, data written to disk, page faults in memory, or the /etc folder being modified. eBPF enables arbitrary code to be dynamically injected and executed in the Linux kernel while at the same time providing hard safety guarantees to preserve the integrity of the system, hence reducing the attack surface.
Since the amount and type of information to be collected may change over time, the ability to shape inspection and monitoring tasks to the actual needs is of paramount importance to balance effective and reliable detection with low overhead. GUARD will go beyond the basic monitoring capability envisioned today by flow-level or log reporting, which includes stateless and/or stateful inspection criteria on flows and/or packets, and aggregation and storing capabilities. The target is to push commands, instructions, and configurations locally, hence moving towards real (lightweight) processing tasks. The set of supported programming abstractions (e.g., extended Finite State Machines, P4, BPF programs) will be discoverable through the component APIs, which fully agrees with the self-composition property. Beyond data collection, scalable and powerful representation and querying of security properties and capabilities are important to properly feed the detection logic and analysis algorithms. The ambition is also to provide data fusion capabilities, so that pre-processing and aggregation of data may be accomplished by the same query, optimizing look-ups in the abstraction model.
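The benefit of offloading lightweight aggregation to the local agent can be sketched as follows (a purely illustrative Python model, not an actual eBPF or P4 program; the agent interface and record fields are assumptions): instead of shipping every raw measurement to the security manager, the agent is programmed with a small fusion function and reports only a compact summary.

```python
import statistics

class LocalAgent:
    """Hypothetical local security agent that buffers raw measurements
    and reports only the result of an injected aggregation program."""

    def __init__(self):
        self.samples = []
        self.program = None

    def load_program(self, fn):
        """The security controller pushes a lightweight fusion task."""
        self.program = fn

    def observe(self, value):
        self.samples.append(value)

    def report(self):
        """Ship the aggregate instead of the raw samples."""
        summary = self.program(self.samples)
        self.samples.clear()
        return summary

agent = LocalAgent()
# Fusion program: summarize latency samples as (mean, stdev, count).
agent.load_program(lambda xs: {
    "mean": statistics.mean(xs),
    "stdev": statistics.pstdev(xs),
    "count": len(xs),
})

for latency_ms in [10, 12, 11, 90, 10]:
    agent.observe(latency_ms)

print(agent.report())  # one small record instead of five raw samples
```

The same pattern, implemented in a kernel-level abstraction such as eBPF, is what reduces bandwidth requirements and latency in the framework.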

2.8.3 Detection of complex multi-vector attacks and identification of new threats

Existing detection algorithms already provide a good level of protection, but it is often limited to known threats and attack patterns. This level is expected to grow with the progressive introduction of machine learning and other artificial intelligence techniques. GUARD will investigate new automatic detection methods for known attacks and threats, through a comprehensive analysis of existing machine learning methods, including K-Nearest Neighbours, Naive Bayes, Graph Kernels and Support Vector Machines. GUARD will also verify the efficiency of Multi-Agent Systems (MASs) in the detection of anomalies in network behaviour, by developing an intelligent integration of MASs with deep learning and artificial neural networks to improve the detection of malicious datasets, especially in the cases of massive attacks and attack campaigns. Detection of unknown attacks is the result of monitoring the network behaviour and identifying anomalies, i.e., non-conforming patterns of interest compared to a well-defined notion of normal network behaviour. GUARD will develop a two-tier model with machine learning methods at the bottom tier, based on the idea of similar models. The machine learning methods will be selected from the comprehensive comparative analysis provided for known attacks, and combined with multilevel correlation analysis among the attributes of correct and malicious data.
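As an illustration of the anomaly-based tier, a distance-based k-nearest-neighbour score over feature vectors (a standard textbook technique, not the actual GUARD algorithm; the feature values below are made up) flags samples that lie far from the observed normal behaviour:

```python
import math

def knn_anomaly_score(sample, baseline, k=3):
    """Mean Euclidean distance to the k nearest baseline points:
    the larger the score, the more anomalous the sample."""
    dists = sorted(math.dist(sample, b) for b in baseline)
    return sum(dists[:k]) / k

# Baseline of "normal" per-flow features, e.g. (packets/s, mean packet size).
normal = [(100, 500), (110, 480), (95, 510), (105, 495), (98, 505)]

benign = knn_anomaly_score((102, 500), normal)
flood  = knn_anomaly_score((5000, 64), normal)  # volumetric-attack-like flow

print(benign < flood)  # the flood stands out clearly
```

In practice the baseline would be learned from the collected security context, and the score thresholded or fed into the correlation analysis described above.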

2.8.4 Awareness

Today, users have limited visibility into how and when their data are processed. Authorization for data usage is merely based on electronic or written consent, but there are no standard tools to track and limit the propagation of data. GUARD will design a set of APIs to query each service about the presence and usage of private and sensitive data, in order to trace the location and propagation of data among services. Any access to data will trigger a notification and the verification of the user's policies. This way, beyond the enforcement of data access, a record will be kept of the transfer of data to other services, enabling later verification of persistence and requests for removal. The definition of fine-grained access and usage rules will contribute to the creation of privacy-by-design systems, where only the minimal amount of information is shared through enforcement policies. Also in this case, GUARD will primarily look at the IDS initiative for relevant models and protocols. The composition of digital services may span a wide range of applications, infrastructures, and devices. Some services in the chain might not satisfy minimum security requirements imposed by the user. In this case, users should be aware of the weakness, and should be able to decide whether it is acceptable or not. Trusted computing has already been largely investigated (especially in virtualized environments like the cloud and NFV), but these kinds of solutions are still not available for composite services. GUARD will address the assessment of trustworthiness through formal methods to derive the trust level from the set of available security properties for each service involved in the chain. Trustworthiness will involve the two dimensions of identity (service owner/provider) and integrity (software).
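The access-triggers-notification-and-policy-check behaviour described above can be sketched like this (the policy fields and the notify callback are hypothetical; real enforcement would sit inside the service's data layer):

```python
class DataGuard:
    """Hypothetical wrapper enforcing a user's data-usage policy:
    every access is checked against allowed purposes and recorded."""

    def __init__(self, policy, notify):
        self.policy = policy        # e.g. {"allowed_purposes": {...}}
        self.notify = notify        # callback towards the user/broker
        self.audit_log = []         # record kept for later verification

    def access(self, service, purpose):
        allowed = purpose in self.policy["allowed_purposes"]
        event = {"service": service, "purpose": purpose, "allowed": allowed}
        self.audit_log.append(event)
        self.notify(event)
        if not allowed:
            raise PermissionError(f"{service}: purpose '{purpose}' denied")
        return "data"

notifications = []
guard = DataGuard({"allowed_purposes": {"billing"}}, notifications.append)

guard.access("invoice-service", "billing")      # permitted, but logged
try:
    guard.access("ad-service", "marketing")     # violates purpose limitation
except PermissionError as e:
    print("blocked:", e)

print(len(guard.audit_log))  # both attempts are on record
```

The audit log is what enables the later verification of persistence and requests for removal mentioned above.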


With respect to awareness, bare technical information (e.g., the available algorithms for encryption or integrity, the software version) will be largely useless for most users. The ambition here is to elaborate more user-friendly representations, based on graphical bars or warning/alerting signs. A further ambition is the identification of proper methods in the user interface to avoid warnings being systematically ignored by users (one possible solution is notification to higher levels in the organizational structure).

2.9 Application scenarios

This section describes some relevant application scenarios, highlighting the most suitable GUARD features that can be applied to each of them, and the related improvements expected from the application of the GUARD framework. Each scenario is described through the following template:

Name
Motivation
Description
Expected GUARD Features
Expected Improvements

where

• Name specifies the application scenario;
• Motivation describes the scope of the proposed scenario;
• Description describes in detail the scenario, its main components and functionalities, and how the GUARD framework can interact with them;
• Expected GUARD Features lists the GUARD features suitable to the application scenario under analysis;
• Expected Improvements describes the benefits and improvements in the security level that the scenario under analysis can reach thanks to the application of the GUARD framework.

2.9.1 The automotive scenario

Name: Automotive

Motivation: The automotive industry is integrating ever more ICT technologies in vehicles for sensing and diagnostics of mechanical parts, driving assistance and safety, entertainment, and infomobility. Some years ago the main focus was on continuous and seamless Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) connectivity, through specific architectures and protocols such as CALM [34] and 802.11p/WAVE [35]. Today, the scope is broader, and also includes integration with the cloud and other digital services. However, car manufacturers are reluctant to integrate external components (such as road sensors and cloud computing services) into their systems for autonomous driving, simply because they have no means to trust information and data.

Description: Figure 11 shows a possible scenario for implementing digital services in the automotive sector. The left side pictorially depicts the reference scenario, whereas the right side shows the expected workflow between the security architecture and remote services.


Figure 11. Composition of digital services and infrastructures for the automotive sector.

We suppose a Traffic Services Provider (TSP) deploys and manages a number of Roadside Units (RSUs) to provide infomobility and smart driving services. The TSP owns the RSUs, but it relies on third parties' infrastructures and services for composing its offering. It rents a network slice from a Network Provider (NP) to interconnect all its RSUs on a geographic scale; it also connects to an Autonomous Driving Application deployed in the cloud. The final users of the system are the drivers, through their connected vehicles, which only see the TSP interface and are not aware of the different infrastructures and domains involved in the creation of traffic services. That means the users have no way to know the processing pipeline of their data, including the owner, properties, and location of infrastructures and services. The very identity of the TSP must be trusted a priori by the car manufacturer or in real time by the users, without any technical means of verification. In a typical workflow, vehicles detect the RSU and present the user with the list of available services they discover. We suppose that the GUARD framework is integrated in the vehicle's driving assistance systems, maybe as a library or plug-in, so there is a common interface for the user. Before using the system, the user is expected to load his preferences and security policies (step 1), which are continuously evaluated while interacting with external services. Let's suppose the user loads a trust policy, which evaluates the level of trustworthiness of external services before authorizing their usage. We omit the description of identity management and access control for the sake of brevity, although these represent inescapable procedures in the system.
Upon connection, the GUARD system assesses the trustworthiness of the TSP, by checking its identity (for instance, by digital certificates) and the presence of known security services among its APIs (step 2). One of these APIs may enable notification in case external services or infrastructures are used. The list of mandatory and optional features that must be implemented to consider the peer trusted is defined by the user's policy. A simple trust policy may compare the TSP identity against an internal list of trusted service providers; more elaborate policies may build its reputation based on external chains of trust. Now imagine the user activates a remote Cooperative Driving service, which feeds the on-board driving assistance systems with information about the behaviour of the surrounding vehicles, the presence of obstacles and roadworks, weather conditions (strong wind, ice on the road, low visibility due to heavy rain or fog, etc.), and so on. Since this service has severe safety implications, the GUARD system uses the security API exposed by the TSP to receive notifications in case external services are invoked. The local security agent on the RSU detects the need to use the external network connection, a cloud service, and a third-party


application; therefore, it notifies the Context Broker about the identities and credentials of the involved third parties, as well as the access points to their security agents (step 3). Based on this information, the GUARD system updates the current service topology, with the identity of the new services and the supported security capabilities. The new chain is evaluated again according to the user's trust policy, to verify whether the level of trustworthiness of all the involved actors fulfils the preferences of the user. The lack of security agents in some domains will clearly be considered negatively when assessing trustworthiness. Even if all actors in the chain are trusted, their infrastructures could be attacked. Safe autonomous driving requires timely and accurate information, so the user also activates a DoS policy. This policy selects one algorithm for DoS detection and one or more inspection programs that can feed it with the necessary measurements; it pushes such programs to the security agents present in the Network Slice (NS) and the cloud IaaS (step 4). Once the programs have been successfully loaded and activated in the execution environments, they start reporting measurements (step 5), which the DoS algorithm starts processing. It is worth noting that, even with a partial view of the underlying infrastructures (network slice and cloud virtual network), a proper set of measurements (e.g., latency and jitter) can be enough to detect volumetric attacks, in addition to more targeted resource-exhaustion attempts. When the DoS detection algorithm triggers an alert, the GUARD system immediately warns the user and turns off autonomous driving (step 6); optionally, it could also notify the affected service, so that it can further investigate the issue and, if necessary, start mitigation. The service could be automatically reactivated later on, once the alert from the DoS algorithm ceases.
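Steps 4-6 can be sketched with a toy detector (the thresholds, metric names, and callback are illustrative assumptions, not GUARD components): latency and jitter measurements from the slice are checked against a policy, and an alert disables autonomous driving until measurements return to normal.

```python
class DosPolicy:
    """Toy DoS policy: alert while latency or jitter exceed thresholds."""

    def __init__(self, max_latency_ms, max_jitter_ms, on_change):
        self.max_latency_ms = max_latency_ms
        self.max_jitter_ms = max_jitter_ms
        self.on_change = on_change   # e.g. disable/re-enable autonomous driving
        self.alert = False

    def feed(self, latency_ms, jitter_ms):
        """Step 5: process a measurement reported by a security agent."""
        anomalous = latency_ms > self.max_latency_ms or jitter_ms > self.max_jitter_ms
        if anomalous != self.alert:
            self.alert = anomalous
            self.on_change(anomalous)   # step 6: warn user / restore service

events = []
policy = DosPolicy(max_latency_ms=50, max_jitter_ms=10, on_change=events.append)

policy.feed(20, 2)    # normal operation
policy.feed(400, 80)  # volumetric attack inflates latency and jitter
policy.feed(25, 3)    # attack mitigated: service can be reactivated

print(events)  # [True, False]: one alert raised, then cleared
```

Real detection would of course use statistical baselines rather than fixed thresholds; the point is that even coarse end-to-end measurements suffice to expose a volumetric attack.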
Expected GUARD Features:
• Automatic operation according to user's policies
• Trust services exposed by GUARD APIs
• Identification of service topologies
• Programmability of security features
• Detection of anomalies and attacks on the network
• Awareness to users

Expected Improvements: Current digital services largely hide the usage of external infrastructures and services, hindering the possibility to understand what happens to both data in transit and data at rest. The detection and mitigation of attacks is always possible locally, but this may be of little benefit in case of volumetric DoS, which saturates the Internet link. The underpinning concept behind the GUARD approach leverages the presence of security APIs to expose identity, properties, and local security services. Through such APIs, it should be possible to discover the service topology, including external services and infrastructures, as well as to retrieve monitoring information and set enforcement rules. In the previous example about volumetric DoS, the activation of proper filtering in the Internet Service Provider's infrastructure would avoid the saturation of the local link. (Semi-)autonomous operation according to user's policies is also important, to reduce the time for reaction and to effectively operate on behalf of less skilled users.
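The simple trust policy mentioned earlier (step 2), which compares the provider's identity and exposed security APIs against the user's requirements, can be sketched as follows (identities, API names, and the policy structure are all hypothetical):

```python
def evaluate_trust(provider, policy):
    """Toy trust policy: the peer is trusted only if its identity is known
    and it exposes all mandatory security APIs; optional APIs raise the score."""
    if provider["identity"] not in policy["trusted_identities"]:
        return False, 0.0
    exposed = set(provider["security_apis"])
    if not policy["mandatory_apis"] <= exposed:
        return False, 0.0
    optional = policy["optional_apis"]
    score = 1.0 + len(optional & exposed) / max(len(optional), 1)
    return True, score

user_policy = {
    "trusted_identities": {"tsp.example.org"},
    "mandatory_apis": {"notify_external_services"},
    "optional_apis": {"traffic_inspection", "data_tracking"},
}

tsp = {
    "identity": "tsp.example.org",
    "security_apis": ["notify_external_services", "traffic_inspection"],
}

trusted, score = evaluate_trust(tsp, user_policy)
print(trusted, score)
```

More elaborate policies would replace the internal identity list with external chains of trust, as noted in the scenario description.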


2.9.2 The FIWARE Lab scenario

Name: Commercial FIWARE Lab

Motivation: FIWARE Lab is deployed as a multi-tenant cloud solution designed to host FIWARE-based services. There are some public deployments, but commercial solutions could also be implemented. Commercial FIWARE Lab solutions would provide different kinds of resources to multiple tenants, so several tenants would be able to share the same cloud resources from the same provider. In addition to computing, networking, and storage resources, the Lab provides a number of components and datasets to compose applications. Components can be installed standalone or shared with other users. FIWARE Lab is a federation of cloud infrastructures provided by several institutions. Potential users may be puzzled because currently there are no mechanisms for protecting appliances against Internet threats and for certifying the large catalogue of software and data available.

Description: The left side of Figure 12 depicts a typical usage of FIWARE Lab. FIWARE users usually interact with a web dashboard. They select a cloud provider among those available and request IaaS resources, under the constraints of their subscription (1). The user can install his own applications in VMs (e.g., an Apache web server, MySQL), select software widgets or existing mash-ups from the Catalogue, and connect to existing datasets (2); he can also connect to external operators, cloud services, or IoT devices (3). The final composite service is then used by the customers of the FIWARE user.

Figure 12. FIWARE Lab can be used to deploy private instances of GEs or to connect to public services.

The FIWARE Lab dashboard interacts with the GUARD framework. Before starting operation, the candidate service topology is pushed to GUARD for trust assessment. GUARD queries all involved components through the security APIs, and collects information about vendor, location, certification, and security properties. Based on the availability of security APIs and the information provided, the trust level is compared with what is required by the user, and a positive/negative indication is returned. Based on the kind of service and the expectations of his customers, the FIWARE user activates security services provided by the GUARD framework, for example a DoS mitigation service. When the DoS mitigation feature is enabled, the GUARD framework checks the capabilities of the FIWARE components and activates traffic measurements on critical nodes that support network inspection. Traffic measurements are collected on external interfaces (i.e., accessed by customers or other external components) and internal interfaces (i.e., only used by the service components), so as to detect both external and internal attacks. "Internal" attacks may be carried out by other cloud users that share


the same infrastructure. "External" attacks include both direct attacks on the service and indirect attacks on public infrastructures, such as DNS and telcos' networks. Inspection, classification, and measurements on multiple traffic flows can easily overwhelm computing, networking, and storage resources. GUARD starts with a coarse-grained set of measurements on the aggregate traffic only; when deviations from historical/expected patterns or other anomalies are detected, the accuracy is improved by fine-grained statistics on smaller aggregates or single flows. Once the source of the attack is identified, selective packet filtering is activated in the most appropriate components.

Expected GUARD Features:
• Security APIs exposed by FIWARE software, widgets, and infrastructures
• Programmable deep packet inspection and enforcement of network traffic
• Analysis and correlation of network statistics and measurements from multiple sources
• Dynamic behaviour to balance depth of inspection with overhead
• Artificial intelligence and machine learning for detection and control

Expected Improvements: The current limitation of FIWARE Lab is the lack of security services at the infrastructure level; users could install software instances of common cyber-security appliances, but this consumes resources, has limited visibility, and needs additional software and integration effort for cross-correlation of events and data. The adoption of GUARD technology will extend the scope of FIWARE and other orchestration APIs to security. FIWARE Lab users will be able to select the proper set of components and infrastructures based on available security capabilities, according to the criticality of their services and the expectations of their customers. Once the service is operational, GUARD protects from different attack vectors, including external attacks from the Internet and internal threats from other FIWARE Lab users. Though the description has mostly focused on supporting FIWARE Lab users that implement a FIWARE service, the implementation of the GUARD concept will also be beneficial for infrastructure providers (i.e., cloud and device owners). The cloud is itself a kind of digital service, which needs to be monitored to ensure legal and correct operation by users. By checking the trustworthiness, certification, and operation of cloud resources, the cloud owner may be able to identify violations of the Usage or Service Level Agreements by customers (including, of course, FIWARE users), as well as to pinpoint malicious behaviours that jeopardize other users or the entire infrastructure. Beyond attack detection, threat investigation, and mitigation capabilities, GUARD will also allow the best usage of cloud and physical resources, in terms of overhead and costs, by leveraging the envisioned programmability of GUARD interfaces to adjust local operation so as to best fit the different needs at a given time of deployment.
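The coarse-to-fine escalation described in this scenario can be sketched as follows (thresholds and flow records are illustrative assumptions): monitoring starts on the traffic aggregate and switches to per-flow statistics only when the aggregate deviates from the expected baseline.

```python
def monitor(flows, baseline_pps, tolerance=0.5):
    """Toy coarse-to-fine monitor: 'flows' maps flow-id -> packets/s.
    Per-flow inspection is activated only when the aggregate deviates
    from the baseline by more than the given tolerance."""
    aggregate = sum(flows.values())
    if abs(aggregate - baseline_pps) <= tolerance * baseline_pps:
        return {"mode": "coarse", "aggregate_pps": aggregate}
    # Escalate: fine-grained statistics reveal the dominant flows.
    suspects = sorted(flows, key=flows.get, reverse=True)[:2]
    return {"mode": "fine", "aggregate_pps": aggregate, "suspects": suspects}

normal = {"flow-a": 400, "flow-b": 350, "flow-c": 300}
attack = {"flow-a": 400, "flow-b": 350, "flow-c": 300, "flow-x": 9000}

print(monitor(normal, baseline_pps=1000)["mode"])   # coarse
print(monitor(attack, baseline_pps=1000)["mode"])   # fine
```

The identified suspect flows would then be candidates for the selective packet filtering mentioned above.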

© GUARD 2019 Page 38 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0

2.9.3 The Healthcare Marketplace scenario

Name: Healthcare Marketplace

Motivation:
The composition of tailored digital services is often based on a selection from public or private repositories of components (aka catalogues), which include applications, infrastructures, and data. Such catalogues are usually conceived as marketplaces, where digital components are published by suppliers and bought by service providers or final users. Though the multiplicity of available components offers a rich set of alternative products, their integrity and the trustworthiness of their vendors are often far less trivial to evaluate than their technical features. Moreover, their technical heterogeneity does not facilitate the definition of uniform cyber-security frameworks. Indeed, services composed from public marketplaces are frequent targets of cyber-attacks and thus need a high level of cybersecurity, since they store supplier data, product data, personal payment data, and confidential user information.

Description:
This scenario considers the selection of products, data, and services with specific security features, so that they can be connected with monitoring and detection frameworks managed by web-based dashboards with intuitive interfaces, as shown in Figure 13.

Figure 13. Selection of digital services from resource catalogues.

Healthcare Catalogues & Marketplaces: Resources are organized in digital catalogues, which include information about supplier data, product information, usage scenarios, documents, etc. One implementation of such catalogues is the CATAALOG, a curated catalogue for AAL and smart health solutions and technologies. CATAALOG provides an overview of existing and upcoming smart solutions for active and healthy ageing. The term CATAALOG includes "AAL" for Ambient Assisted Living, as well as "AA" for Active Ageing – it is "The Catalog with the extra A for Ageing". The CATAALOG contains a comprehensive collection of AAL products and healthcare solutions, and of the relevant suppliers. The description of resources is enriched with security features; these include both information about suppliers and the chain of trust, as well as embedded monitoring and enforcement capabilities that can be dynamically composed to create integrated cyber-security frameworks. Additional information may include known vulnerabilities and open bugs, which can be effectively used to assess the risk of using a component. The description may be retrieved from metadata associated with software and datasets, or through specific APIs for online infrastructures and services.

Intelligent Healthcare Advisors: They help different target groups to gather the desired information about products, services, technologies, or any other entity they are looking for. The digital advisor asks the user around 7 to 10 questions, aiming to narrow down the number of products while still showing a relevant selection (step 1a). This allows the digital advisors to provide the target groups with specific information, leading to a selection of (compatible) products fitting the user's specific needs and requirements. Here, it is important to keep the questions short and simple, and to use plain language and simple icons. Though mainly addressed to the target groups in need of information about relevant solutions, i.e., clients and governments, digital advisors also provide information to businesses that want to enter or expand their existing AAL business to deliver a fully functional ICT environment with specific web and mobile services for older adults and their relatives, for businesses, as well as for governments and municipalities involved in AAL. An integral part of the selection and composition process is the verification of existing threats and vulnerabilities, by scanning security bulletins from internal/national/regional CERTs/CSIRTs (step 1b). The severity of threats, both on single components (operating systems, software, libraries, databases, functions) and on their combination and configuration, should be compared with the criticality of the whole application, through a risk assessment procedure carried out on behalf of the end users. The availability of a structured and common description of security properties and threats allows such a platform to present regional, national, and international AAL products and services in accordance with the needs of the end users, and to combine them in a consistent way to deliver integrated cyber-security solutions (step 2).
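The bulletin-scanning step (1b) can be sketched as a simple matching procedure between catalogue entries and published advisories. The bulletin fields and the severity labels below are simplified placeholders, not the schema of any real CERT feed:

```python
# Hedged sketch of step 1b: matching catalogue components against security
# bulletins retrieved from CERTs/CSIRTs. Components and advisories are plain
# dictionaries with illustrative field names.

def affected_components(components, bulletins):
    """Return (component, advisory id, severity) tuples for every catalogue
    entry that appears, with a matching version, in a published bulletin."""
    findings = []
    for comp in components:
        for adv in bulletins:
            if (comp["name"] == adv["product"]
                    and comp["version"] in adv["affected_versions"]):
                findings.append((comp["name"], adv["id"], adv["severity"]))
    return findings
```

The resulting findings would then feed the risk assessment procedure, where the severity of each match is weighed against the criticality of the whole application.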

Data Repositories: They contain marketplace systems, core product and catalogue supplier data, together with large amounts of publicly available data and research, patents, or procurement knowledge. The company's data storage system is connected with leading external data repositories, such as the Data Market Austria, as well as with clients.


Thus, there is a need to establish the trustworthiness of data and of their providers, as well as to limit the usage of data to what was settled in the contractual agreements.

Once the set of services and products has been selected from the marketplace, deployed, and chained together, the management of cyber-security may be delegated to another entity, which plays the role of Security Provider (step 3).

Security Providers: A Security Provider manages cyber-security services on behalf of other entities, extending the scope and capabilities of existing Security Operation Centres (SOCs). It operates complex processing infrastructures with intelligent assessment, detection, correlation, and analysis capabilities on large datasets. It also suggests mitigation and remediation actions, even translating them into suitable configurations and settings whenever the proper enforcement capabilities are available. To provide its services, the Security Provider must connect to the control and management hooks of the deployed services, likely through REST APIs or other RPC methods (step 4a); this entails that an authentication and authorization framework must be in place to manage identity and access control between the Security Provider and the set of controlled services. Security Providers are in charge of attack detection, threat investigation, integrity verification, and any other real-time security service. They interactively carry out these tasks through programmatic control of the security hooks available in each digital component: for example, they change the verbosity of logs (error, warning, info, debug), define custom statistics to be collected on network traffic, analyse packets to verify that encryption is correctly applied, selectively inspect packet bodies to check the syntax (after decryption), and so on. In addition, new security bulletins from CERTs/CSIRTs should be continuously retrieved and analysed to identify potential vulnerabilities in the monitored components, configurations, and topologies (step 4b).
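The programmatic control of security hooks can be illustrated with a small sketch that prepares a REST-style request to change a component's log verbosity. The endpoint path and payload fields are hypothetical, since the GUARD APIs were still being defined at the time of this deliverable:

```python
# Illustrative sketch of step 4a: a Security Provider adjusting the log
# verbosity of a monitored component through a REST-style security hook.
# The URL scheme and JSON fields are assumptions, not the real GUARD API.

VALID_LEVELS = ("error", "warning", "info", "debug")


def build_verbosity_request(component_id, level):
    """Prepare the (hypothetical) REST request that raises or lowers the
    verbosity of a component's logging hook."""
    if level not in VALID_LEVELS:
        raise ValueError("unknown log level: " + level)
    return {
        "method": "PUT",
        "url": "/components/{}/hooks/logging".format(component_id),
        "json": {"verbosity": level},
    }
```

In a real deployment the request would be sent over the authenticated channel described above, so that only authorized Security Providers can reconfigure the hooks.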
Awareness and notification to end users is a critical task for Security Providers, which are responsible for timely depicting the current and evolving situation through intuitive and user-friendly communication that supports the decision process (step 5a). This includes emails, SMSs, messaging services (WhatsApp, Telegram, Slack, Skype), phone calls, etc. The creation of content tailored to the intended recipients would greatly increase the understanding of the message and the undertaking of proper decisions. Beyond the due notification to end users, it is also important to share the knowledge of new vulnerabilities and threats with other stakeholders. Hence, in case of an incident, a security report is automatically generated according to a standard format and sent to the relevant CERTs/CSIRTs (step 5b).

Expected GUARD Features:
• Description of security properties and capabilities (metadata, APIs) that enhances the current information for services in catalogues
• Security analytics to monitor the data flow and activities
• Embedded data monitoring and enforcement capabilities in the services available in public catalogues
• Programmable security hooks for logging, packet filtering, and deep packet inspection
• Intuitive, graphical, and user-friendly interfaces for both security providers and end users, with content tailored to the different roles
• Incident guidelines to keep the risks at a minimum level and to be prepared in case of attacks against the data infrastructure
• Automation in the interpretation and preparation of security reports from/to CERTs/CSIRTs

Expected Improvements:
The CATAALOG framework is a rich marketplace where users can compare and select eHealth products and services, with optional assistance from digital intelligent advisors. This process is usually unaware of security aspects, which are often seen as add-on features to be implemented at deployment time. However, the efficiency and effectiveness of cyber-security frameworks largely depend on the integration of monitoring, inspection, and enforcement capabilities since design time. In this respect, the knowledge of the security properties of digital services is a fundamental prerequisite for selecting the proper set of components that can be easily composed and made to cooperate, without the need to install and deploy additional after-market tools.

The availability of common security APIs greatly facilitates the creation of distributed cyber-security frameworks, by "plugging" the selected components into the same detection and analysis engine. Eventually, this is expected to produce consolidated architectures and tools, which can be easily ported to the different services built from the marketplace, hence reducing the risk of human mistakes in the design, development, and deployment of tailored solutions for each different scenario. As new vulnerabilities and bugs are discovered during operation, they will be inserted among the security properties in the catalogue, so that unreliable components can be safely discarded until they are updated or patched.
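The automatic report generation of step 5b can be sketched as follows. The field set is purely illustrative and does not claim to follow any specific exchange standard; a real deployment would use the format agreed with the receiving CERTs/CSIRTs:

```python
# Minimal sketch of step 5b: automatic generation of a structured incident
# report for submission to CERTs/CSIRTs. All field names are ad-hoc,
# illustrative choices, not a standardized schema.

from datetime import datetime, timezone


def build_incident_report(incident_type, affected, severity):
    """Assemble a machine-readable incident report ready for submission."""
    return {
        "report_version": "1.0",
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "incident_type": incident_type,
        "affected_components": sorted(affected),
        "severity": severity,
    }
```

Because the report is generated automatically from monitoring data, it can be dispatched without human intervention, supporting the "Automation in the interpretation and preparation of security reports" feature listed above.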


2.9.4 The metropolitan city scenario

Name: Information Exchange System in a Metropolitan City

Motivation:
The identification of citizens is a fundamental aspect for the public administration. Different documents are often used for different purposes: IDs, passports, driving licences, fiscal codes, insurance numbers, etc. Common information (name, surname, date and place of birth, home address, phone number, marital status) is duplicated across these documents, and keeping it in sync is a real challenge today, usually left to the responsibility of citizens. In Italy, some initiatives like PEC (Certified Email service) and SPID (Public System for Digital Identity) already provide concrete steps towards the digitalization of public services, but they still rely on existing physical documents. The road towards full digitalization of the public administration also includes novel procedures to create, maintain, update, transfer, and share personal information, under the privacy constraints settled by the GDPR and relevant regulations. This requires increasing security and trust between the multiple entities and implementations involved in the transactions.

Description:
This application scenario concerns an Information Exchange System between the public administration entities of a Metropolitan City. In Italy, a Metropolitan City is an administrative extension of a large city, including the smaller cities and villages in its surroundings. In short, a "Metropolitan City" is made up of a central body (typically the municipality of the provincial capital) and a set of peripheral bodies (the other municipalities and public institutions of the province). Within a Metropolitan City, multiple local and decentralized agencies keep information about citizens and their relationship with the state government: the civil registry, the public vehicles registry, the real estate registry, and so on.

The Information Exchange System is an IT system that allows the automatic exchange of information between different entities, minimizing human intervention through the use of web services. The system foresees that the services made available by all the institutions of the Metropolitan City ecosystem are published through an API Gateway.

Figure 14. Information Exchange System for the Metropolitan City.

The exchange of information works in both directions, so each institution can collect data of its own interest and provide updates of the information it holds. There are potentially many possible operations, some of which are very sensitive. For example, it is possible to implement the transfer of a citizen's residence from one municipality to another, or to request the debt situation of a citizen in one municipality in order to evaluate the legitimacy of incentives or tax deductions in another municipality. Access to such information might also be granted to private entities, in case this is needed by a contract (e.g., address and age for health or car insurance). One of the main aspects of this scenario is, of course, to ensure that all actors in the system have guaranteed levels of security and trust in the other actors with whom they exchange data. The institutions that join the Information Exchange System must follow an accreditation procedure in order to be recognized as "reliable" entities and become part of the ecosystem. For example, the whole system may rely on SPID, which is the current identity management system for the public administration. SPID itself then becomes an additional digital component of the public service chain. The definition of the Information Exchange System includes a common data model for citizens' data, and a Gateway to access, update, and transfer them. The implementation of the Gateway includes i) some form of fine-grained access control (e.g., attribute-based), which is used to define which operations are possible on the different fields by specific users or groups of users; ii) a local GUARD agent, which translates the GUARD APIs into internal configurations for access control. The local GUARD agent is also responsible for notifying both authorized and denied access requests to the relevant entities (i.e., citizens and municipalities). Every local gateway connects to the core GUARD system, which provides a common centralized point to define access policies, independently of the current location of data.
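The fine-grained access control enforced by the local GUARD agent can be sketched as a simple attribute-based policy check. The attribute names and the policy shape below are hypothetical; the scenario only states that access rules depend on the requester's role, the data field, and the operation, and that citizens cannot deny access to municipalities:

```python
# Sketch of the attribute-based access control at a local gateway.
# Each policy entry maps (role, field, operation) to an allow/deny decision;
# a citizen-managed deny list can revoke access for private entities only.
# All names and entries are illustrative assumptions.

POLICY = {
    ("municipality", "address", "read"): True,
    ("municipality", "address", "update"): True,
    ("insurance", "address", "read"): True,    # private entity, revocable
    ("insurance", "address", "update"): False,
}


def is_allowed(role, field, operation, citizen_denied=None):
    """Evaluate a request; citizens may revoke access for private entities
    but never for municipalities (per the role restrictions above)."""
    citizen_denied = citizen_denied or set()
    if role != "municipality" and (role, field) in citizen_denied:
        return False
    return POLICY.get((role, field, operation), False)
```

In the scenario, every decision (allowed or denied) would additionally trigger a notification to the relevant citizen and municipality through the local GUARD agent.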
Both citizens and municipalities define restrictions on the propagation and usage of data, under the constraints imposed by their different roles (for instance, citizens can deny access to private entities, but cannot remove data or deny access to municipalities). Access to data may be unconditional or conditional, depending on whether the users (citizens and municipalities) are asked to grant permission. Continuous reporting of monitoring information to the GUARD Security Manager is used to detect suspicious activities or potential attacks on the infrastructure.

Expected GUARD Features:
• Data tracking and access control
• Detection of misuse
• Policies for the propagation of personal and sensitive data
• Awareness for multiple classes of users

Expected Improvements:
The services of the public administration are undergoing a radical digitalization process, though complete integration between different agencies is still far from being reached. Citizens are usually required to exhibit their personal identification badges when requesting public and private services, and to notify any changes in their address, marital status, ownerships, and so on. This duplication of data is prone to transcription errors and missing notifications, which lead to incomplete or outdated information. Sharing personal data in a selective way is becoming a common procedure nowadays (the ORCID database for researchers is a meaningful example of consented access and data tracking), but a generalized approach to the problem is still missing.

GUARD provides a set of APIs to query each service about the presence and usage of private and sensitive data, in order to trace the location and propagation of data among services. Any access to or use of data will trigger a notification and the verification of the user's policies. The definition of fine-grained access and usage rules will contribute to the creation of privacy-by-design systems, where only the minimal amount of information is shared, in compliance with the principle of data minimisation defined in the GDPR. Differently from existing services, GUARD will also enable the analysis of access requests and their correlation with other factors (network traffic, authentication requests, etc.), hence contributing to create a broad and global security context for the identification of advanced and persistent threats through machine learning or other techniques. In the specific context addressed by this application scenario, the main benefits brought by the GUARD approach are:
• Increased usability of, and confidence in, services from the local public administration;
• Detection of improper or malicious use of services;
• Simplified monitoring of data and services.

2.9.5 The smart grid scenario

Name: Smart Grid

Motivation:
Even though the electrical grid has been operated for over a century with minimal changes to the original design, it is nowadays undergoing a radical evolution, pushed by the need for more efficient generation, integration with renewables, new business and market models, and bi-directional flows of traffic. This evolutionary path is largely based on the massive deployment of ICT technologies, including smart devices, communication networks, computing and storage infrastructures, digital services and processes. The transition to a "smart" grid is expected to bring more intelligence, flexibility, and automation into all control and management processes, from generation to transmission, distribution, and consumption; however, the presence of an ICT infrastructure exposes the grid to many remote security threats, exploiting vulnerabilities in intelligent electronic devices (IEDs), communication protocols, and network topology. Further, the large amount of customer data also raises huge concerns about privacy.

Description:
A smart grid can be defined as an electricity network that can cost-efficiently integrate the behaviour and actions of all users connected to it – generators, consumers, and those that do both – in order to ensure an economically efficient, sustainable power system with low losses and high levels of quality and security of supply and safety [36]. Different players are expected to interoperate on an open market, creating supply chains and digital pipelines of information: electricity generation, transmission system operators (TSOs), distribution system operators (DSOs), users, prosumers, and so on. Figure 15 shows a simplified scenario for the operation of a smart grid.
DSOs measure the electricity consumption of both industrial and residential users; this data is mainly used for billing, but it can also be processed to create historical trends and statistics, so as to predict the load on a weekly, daily, or even hourly timescale. TSOs are mainly responsible for buying electricity from producers and selling it to DSOs. They also monitor the grid to identify anomalies and faults; private or public networks can be used for this purpose, with different considerations on cost, reliability, and security. To make a very simple yet illustrative example, let us consider a conceptual electricity supply chain made of a power plant, a transmission operator, and a distribution operator. To correctly operate the system, the power injected by the power plant into the grid must balance the current load. We can imagine the indication of the current load as a digital service, which feeds the power plant control system. In case the former gets compromised, the latter may be driven away from the equilibrium point, with catastrophic consequences for the safety of the electrical grid and the attached users.

Figure 15. Operation of the Smart Grid.

Let us suppose a DSO gathers information about electricity consumption from its users through a dedicated network slice, and processes it in a cloud infrastructure. The TSO collects status information and measurements on its grid equipment (transformers, energy flows, phasors, circuit breakers) by means of many sensors through a public network slice. Generation plants are also part of the same vertical, receiving indications about the electricity to supply in the next period (hour, day, or longer) and the current load. All these digital services are part of a common business chain that is critical for the correct operation of a Smart Grid. Reliable and safe operation requires high service availability and trusted data. For example, the TSO needs to monitor the end-to-end latency of packets, as well as the network utilization, to be sure no volumetric attacks are in place. From the GUARD dashboard, it therefore installs monitoring and timestamping programs in the infrastructure of the Network Provider, Cloud Provider, Users, and Grid sensors. Measurements are used to create usage profiles for different periods, so as to facilitate the detection of anomalies and deviations from normal behaviour. Lightweight data fusion and aggregation tasks are also delegated locally, so as to reduce the overhead on the communication links. The TSO also needs certification of the data provided by the DSO, users, and sensors, to be sure these subsystems have not been compromised. It uses the GUARD dashboard to verify that antivirus and IDS services are running; it may also perform remote attestation of the software that provides the data, to be sure it did not get compromised.
This last procedure has no limitations on the components owned by the TSO (i.e., the grid sensors), but should be subject to authorization and access control when carried out on services owned by third parties (e.g., smart meters of the users, which are typically owned and operated by the DSO).
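The profile-based anomaly detection described above can be sketched as follows: measurements are aggregated into per-period usage profiles, and a new sample is flagged when it deviates too far from the historical mean. The three-standard-deviation threshold is an assumed, illustrative choice, not a GUARD parameter:

```python
# Sketch of profile-based anomaly detection: compare a new measurement
# against the historical profile for the same period (e.g., same hour of
# the day). The n_sigma threshold is an illustrative assumption.

from statistics import mean, stdev


def is_anomalous(history, sample, n_sigma=3.0):
    """Flag a measurement deviating more than n_sigma standard deviations
    from the historical profile for the same period."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return sample != mu
    return abs(sample - mu) > n_sigma * sigma
```

Because both the profile construction and the comparison are lightweight, they can be delegated to local agents, reducing the overhead on the communication links as described above.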


From the user perspective, data about power consumption (hourly, daily, weekly, monthly, or with other periodicity) is collected by the DSO for billing purposes. Aggregate data on all or part of the users is likely to be used for the management of the grid, but this does not represent a privacy issue for the users. Also, profiling for proposing different contracts or tariffs by the DSO should fall under the general agreement. If such data is used for other purposes (e.g., selling consumption profiles to third parties, integrating individual users' data into the analytics of external providers), users should be made aware and allowed to agree or disagree. The GUARD dashboard available to users (perhaps provided by a user-friendly smartphone app) is notified by the DSO when data is transferred to external infrastructures, e.g., a cloud provider for processing, a network provider or the Internet for transfer, etc. Users can therefore trace the propagation of their data through different infrastructures and countries, asking to stop and delete it if they feel such actions may harm them in some way.

Expected GUARD Features:
• Deployment of monitoring and inspection tasks
• Delegation of aggregation and pre-processing tasks
• Exposure of the list of available security services in GUARD APIs
• Detection of anomalies and attacks on the network
• Tracing of data
• Awareness for users

Expected Improvements:
The detection of anomalies and other network/software threats is often performed locally, to avoid overwhelming the network with the transmission of large amounts of data. This gives limited visibility, since only local information can be considered. The ability to perform data fusion and aggregation tasks locally allows better usage of the communication bandwidth, and the possibility of a centralized detection logic with visibility over a broad range of devices, infrastructures, and services.
Currently, the usage of users' data is settled by fixed and quite vague agreements, which do not give a clear indication of the purpose and the way these data will be used. With the inclusion of specific APIs and the definition of more binding regulations, the development of GUARD interfaces that receive notifications about data transfers to third parties becomes rather straightforward, greatly improving users' awareness. Further, this approach also enables the definition of proactive or reactive agreement policies: users may be asked for explicit permission to transfer the data, while retaining the right to stop it at any time.


2.9.6 The smart city scenario

Name: Smart City

Motivation:
The Smart City is a collection of urban strategies and planning aimed at linking the physical infrastructures with the human, cognitive, and social capital of its inhabitants. The concept implies a radical innovation of public services, based on the design and implementation of cyber-physical systems for mobility, energy efficiency, and environmental sustainability, interconnecting things, people, processes, and data through the massive usage of information and communication technologies. Smart Cities are expected to improve the quality of life of citizens, as well as to become a key enabler for new services and business, leveraging the high degree of interconnectivity, the large availability of information, and the broad context awareness. Boosted by huge investments, the amount and complexity of digital infrastructures and processes are increasing every year in cities around the world. As a growing number of social, medical, industrial, energy, and environmental services come to rely heavily on ICT, digital infrastructures and technologies will become critical assets for Smart Cities. Core infrastructural components, such as sensor networks, cloud services, databases, and public mobile networks, need to be reliable and robust against cyber-threats and service interruptions, both with respect to external attacks and to internal misuse.

Description:
The Smart City scenario is based on a real use case. As part of its digital strategy, the city of Wolfsburg is developing and deploying an ICT infrastructure for building smart services that help tackle issues like waste management, parking and metering, pollution, and transportation. The infrastructure is based on a modern sensor network covering the city and its surroundings, including several Smart Gateways interconnected by a high-speed, low-latency fibre network throughout the whole city (see the left side of Figure 16).

Figure 16. Smart City solution in Wolfsburg.

The Smart Gateway is a small edge computing installation and consists of 3-5 general-purpose computers. Each gateway has local fast storage, a multi-core multi-tenant CPU, 4 Ethernet interfaces, and at least 32 GB RAM per node; a container orchestrator manages the hosted workloads. The first generation of Smart Gateways includes both LoRaWAN and WiFi/LTE interfaces; they implement so-called packet forwarders, protocol/payload decoders, scheduled/recurring (cron-type) tasks, and other network functions. All Smart Gateways are connected by the optical backbone to remote data centres managed by Wobcom, an ICT service provider owned by the municipality.


Wobcom already provides free internet access via WiFi (FreeWolfsburg) in the city. The plan is to provide a similar service to many IoT devices, through the LoRa network, so as to create connectivity among things and citizens. Both the edge installations and the central data centres can host third-party applications that create Smart City services. The right side of Figure 16 identifies the different digital services involved in the Smart City scenario, highlighting the presence of multiple actors (Wobcom, service providers, citizens, owners of the physical infrastructures). Even though most LoRaWAN applications are not real-time and tolerate some degree of service interruption, the full range of possible services under the scope of the Smart City cannot exclude carrier-grade connectivity that provides reliable and robust quality of service, not to mention the need for trust and high security guarantees. This is especially critical for the most recent computing paradigms, since edge installations do not have enough resources to deploy the same security measures a normal data centre would have. Opening access to thousands of devices requires a multi-tenant, self-learning approach, but the market for cyber-security tools currently lacks proper solutions that can be effectively applied in a distributed, heterogeneous, multi-tenancy environment. Monitoring the good and fair behaviour of the different network participants is a tight requirement of the European regulations regarding the usage of open shared frequencies and the properties of LoRaWAN. With GUARD, each digital component shown in the right side of Figure 16 will internally implement its own monitoring, inspection, and enforcement tasks; a common interface will expose these capabilities, together with the description of security properties (vendor, release, updates).
Local agents deployed in each component will report measurements, events, and logs to a common central framework, which therefore gains deep visibility over the whole chain to detect and analyse even the weakest correlations in the cyber-security context. Beyond the common interface to security capabilities, the great value of GUARD local agents will be programmability, i.e., large flexibility in defining the local operations (filtering rules, log aggregation, pre-processing tasks, etc.). For example, the amount of traffic generated by IoT devices could be compared with that received by cloud/edge applications, to detect anomalies and possible on-going volumetric Denial-of-Service (DoS) attacks. Suspicious end devices and gateways could be selectively monitored at the radio/network level, so as to identify incorrect or malicious behaviours, including various kinds of DoS (volumetric, SYN flood, amplification), man-in-the-middle (e.g., rogue diversion of network traffic), tampering, and manipulation or alteration of the device (operating system, libraries, applications, data). This will help detect intrusions especially at the edge, which is one of the weakest links in the chain. The dichotomy between local and central processing should also be leveraged to improve the resilience of the framework itself. As a matter of fact, some "cron-type" tasks might not have an "always connected" requirement, and could work locally for a certain period of time. Local GUARD agents should be able to apply pre-defined or fallback enforcement rules in case the central framework cannot be reached, so as to ensure continuity of service even in case of a direct or indirect attack on the framework itself.
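The volumetric-DoS check mentioned above (comparing the traffic generated by IoT devices with the traffic received by cloud/edge applications) can be sketched as follows. The 20% tolerance is an assumed parameter, not a value from the GUARD design:

```python
# Sketch of the volumetric-DoS check: traffic reported as sent by legitimate
# IoT devices is compared with what the cloud/edge application actually
# receives; a large unexplained excess at the receiver suggests spoofed or
# amplified traffic. The tolerance is an illustrative assumption.

def volumetric_suspect(bytes_sent_by_devices, bytes_received_by_app,
                       tolerance=0.2):
    """Return True when the application receives significantly more traffic
    than the legitimate devices claim to have generated."""
    allowed = bytes_sent_by_devices * (1.0 + tolerance)
    return bytes_received_by_app > allowed
```

In the Smart City deployment, such a comparison would be performed by the central framework on counters reported by the local agents at the device and application sides.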


Another important feature of the GUARD framework is support for multiple users. A single installation may be managed by a trusted security provider, offering interfaces to all users that might be affected by security concerns: Wobcom, the municipality, service providers, infrastructure owners, and citizens. Through the external interface, each user can set up its own security service. For example, infrastructure owners may be interested in detecting whether their devices get compromised (to avoid responsibility when they are used for other attacks), Wobcom may be interested in avoiding DoS and amplification attacks that overwhelm its ICT infrastructure, and service providers may be interested in the integrity and availability of their applications.

Expected GUARD Features:
• Security capabilities embedded in each digital service (local GUARD agents)
• Disconnected operation
• Multi-user interface
• Local programmability
• Network analysis

The main technical challenge for the implementation of Smart City services is the interconnection of heterogeneous components (IoT devices, applications, citizens, processes) in different domains. While a lot of effort has been put into proposing web services, middleware, and service-oriented architectures, security concerns beyond identity management and access control have been largely overlooked. Indeed, existing tools in the market for complex systems including cloud, IoT, and network assets are often designed and integrated for each specific system, resulting in rigid architectures that cannot be adapted to evolving systems and partially unknown topologies.
Expected Improvements: The introduction of security capabilities in each digital component, accessible through common interfaces and APIs, represents a ground-breaking evolution in security architectures for Smart Cities, allowing the composition of dynamic environments where devices, users, applications, and processes can be easily plugged in or removed in multi-tenancy environments without requiring the re-design of the cyber-security architecture. The most tangible benefit will be improved service level agreements, also including strong security features that can be defined by each actor. While infrastructure providers will be mostly concerned about the availability and integrity of their infrastructures, service providers will care about the reliability, continuity, and trustworthiness of their operation, which are essential to deliver high levels of (end-)user experience.
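As an illustration of the local programmability envisioned in this scenario (e.g., comparing the traffic generated by IoT devices with that received by cloud/edge applications to spot volumetric DoS attacks), the following minimal Python sketch compares per-gateway byte counts reported by hypothetical local agents. All names, values, and the tolerance threshold are illustrative assumptions, not part of the GUARD APIs.

```python
# Hypothetical sketch: byte counts reported by local agents at the IoT
# gateways are compared with those observed at the edge application over
# the same time window; a large surplus at the edge may indicate an
# on-going volumetric DoS attack. Threshold and names are illustrative.

def volumetric_dos_suspected(sent_bytes, received_bytes, tolerance=0.2):
    """Flag an anomaly when traffic received at the edge exceeds the
    traffic reported by the IoT side by more than `tolerance` (20%)."""
    if sent_bytes == 0:
        return received_bytes > 0
    return (received_bytes - sent_bytes) / sent_bytes > tolerance

# Per-gateway reports collected over the same time window (made-up data).
iot_reports  = {"gw-1": 10_000, "gw-2": 12_000}
edge_reports = {"gw-1": 10_400, "gw-2": 55_000}   # gw-2 looks anomalous

suspicious = [gw for gw in iot_reports
              if volumetric_dos_suspected(iot_reports[gw], edge_reports[gw])]
print(suspicious)  # ['gw-2']
```

A real deployment would of course aggregate over sliding windows and account for retransmissions and protocol overhead; the sketch only shows the kind of cross-component correlation the central framework enables.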


2.9.7 The wearable devices scenario

Name: Augmented Reality for shopping

Motivation: Augmented Reality (AR) enhances real-world environments with computer-generated information encompassing several sensory modalities: visual, auditory, haptic, olfactory, somatosensory. It largely relies on wearable devices, which are used both to capture the environmental conditions and, most of all, to represent the augmented information. Though hardware, algorithms and software for AR have already been largely developed in the past, the need to exchange large amounts of data between sensors and processing devices has often limited its application to closed environments where the necessary infrastructure and data sets are deployed locally. The advent of 5G and cloud computing technologies paves the way for unprecedented business opportunities based on public AR services, for example for shopping or food services. From the technological point of view, all the elements for an industrial-scale deployment are in place. The real weak points limiting this use case are security, privacy and sensitive data management issues. From this point of view, the introduction of edge computing nodes, as intermediate network elements able to process and store part of the information in transit, makes the problem even more complex.

Description: This scenario addresses the use of wearable devices for Augmented Reality. In this context, their usage for shopping is particularly interesting and promising, since multiple applications can be identified. Imagine entering a store wearing smart glasses, looking at clothing items and immediately discovering different prices, promotions and colours, checking if your size is available, and seeing how the items would look on you in front of a mirror without even wearing them. Or, in a restaurant, imagine receiving all the valuable information on what you are reading on the menu and browsing reviews of the dishes.
Or receiving promotions dedicated to you when you physically approach a pub that sells your favourite beer. The implementation of AR services is based on local acquisition/rendering and remote processing at the network edge (see Figure 17). The Owner of the Mall (OM) deploys an Edge Application in the Mobile Edge Computing (MEC) platform that provides AR services for clients, as well as IT services for local shops (POS and warehouse management, etc.). This application may access third parties’ datasets concerning characteristics and properties of the products (components, origin, quality, etc.); it might also connect to the user’s profile for delivering tailored advertisements and promotional offers. The client side of the AR application is composed of acquisition/rendering devices (glasses) and software (for example, running on a smartphone). 5G connectivity is used between local devices and the edge platform. The Edge application is in charge of all heavy computational workloads (video transcoding, video analytics, ML applications, etc.), leaving connectivity and display (glasses) management to the Mobile App. This approach combines the large processing power available in fixed infrastructure with low latency and high bandwidth for the best quality of experience.



Figure 17. Remote AR services based on edge computing.

From a technical perspective, it is likely that the Mall application is composed of a central component identified by a fixed URL, which is contacted by the visitors’ devices and redirects them towards the best edge instance, based on the mobile operators used by the visitors. At this point, the wearable sends real-time video streams, the real-time position of the person and, possibly, information from other embedded sensors. Data provided by wearables is processed by applying data analytics and machine learning techniques, also accessing external cloud services. When the glasses frame a specific object (e.g., clothing items) or text (e.g., a menu item), the application running at the edge identifies/interprets it and transmits the contextual audio/video information to the wearable for AR representation. Several digital components are involved in the implementation of the AR service:
1. Wearable device(s)
2. Local computing platform (e.g., smartphone)
3. Edge Mall Application
4. MEC platform and 5G network
5. Remote datasets
6. Cloud infrastructures hosting remote datasets
From a security perspective, the final users of the system are the Mall visitors, through their smart glasses, who are not aware of the different infrastructures and domains involved in the creation of AR services. On the other side, the OM needs to monitor the integrity and availability of its application at the edge, since it is used both for commercial support and for financial transactions. Therefore, two interfaces to the GUARD framework are expected: one installed on the mobile terminals of customers and the other available to the OM. When the smart glasses approach a specific place, the application running at the edge calls a cloud service and looks for personal preferences and tastes in the personal profile, to provide customized advertising.
The security workflow of this scenario can be summarized as follows (see right side of Figure 17): 0. Visitors of the Mall who wear smart glasses with AR capability download the Mobile App of the Mall; this process may be based on existing integrity and trustworthiness mechanisms for mobile applications and does not fall under the scope of GUARD.


1. A visitor enters the Mall; the AR App discovers an AR service and asks GUARD for permission to connect. The GUARD framework checks the availability of security APIs and queries the AR service to verify that its security properties are compliant with the user’s policies.
2. Since there are multiple Mobile Operators, the OM should install its Edge AR Application in multiple infrastructures. This task can be accomplished on demand, upon a connection request from the user. Through GUARD, the OM checks the security properties of the MEC platform before offloading its application.
3. The Mobile App starts running: the Smart Glasses transmit streaming video and position data to the Edge AR application for low-latency real-time video analytics. In case data are used to extrapolate personal information (preferences, profiles, history) and store it, the GUARD framework is used to get permission from the user (either interactively or automatically, based on preferences). The identity of external cloud services should be provided as well, to allow direct queries for verifying security capabilities.
4. The GUARD framework queries the different components to guarantee the integrity and availability of the application. a) It monitors the number and correctness of service requests and the amount of streaming data, to avoid abuses and illicit behaviours (DoS, man-in-the-middle, amplification attacks). b) It also gathers statistics on network usage from the mobile network operators, to verify QoS and compliance with the commercial SLA; this guarantees that no packets are dropped and the service remains fully available.

Expected GUARD Features:
• GUARD local agents embedded in software and devices (Mobile App, Edge AR App, MEC platform).
• Security policy and management.
• Evaluation of the level of trustworthiness of external services before authorizing their usage.
• Detection of network attacks in the Edge platform: DoS, amplification.
Expected Improvements: With the performance gains promised by 5G technology, the application of AR is expected to extend to public services. Since AR is largely based on the elaboration of the personal context of individuals, it can easily be used to profile people and to derive preferences, orientations, and opinions. In this respect, the availability and integrity of shopping applications do not carry safety implications, but are essential to preserve the confidentiality of information.
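The policy check performed when a visitor connects (verifying that the security properties advertised by the AR service comply with the user’s policies) could be sketched as follows. The property names, policy fields and matching rule are illustrative assumptions for this sketch; GUARD’s actual security data model may differ.

```python
# Illustrative sketch of the connection-time check: the framework
# compares the security properties advertised by an AR service against
# the user's policy. All property names and values are made up.

def compliant(service_properties, user_policy):
    """A service is compliant when every property required by the
    policy is present and has an acceptable value."""
    return all(service_properties.get(prop) in accepted
               for prop, accepted in user_policy.items())

ar_service = {"encryption": "TLS1.3", "data_retention": "30d", "vendor": "ACME"}
policy     = {"encryption": {"TLS1.2", "TLS1.3"}, "data_retention": {"30d", "none"}}

print(compliant(ar_service, policy))  # True
print(compliant({"encryption": "none"}, policy))  # False
```

In practice such properties would be retrieved through the common security APIs and could also include vendor, release, and update information, as described for the Smart City scenario.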


3 State of the Art analysis

3.1 Introduction

GUARD analysed the scientific literature, research initiatives and current market solutions in order to select candidate technologies and approaches, to identify gaps, and to define the technical priorities to be tackled. More specifically, fast paths for packet processing are necessary for the project to obtain fast and efficient monitoring, inspection, and enforcement tools that can be easily and seamlessly integrated into a variety of virtual and cyber-physical components. The purpose of Security Information in GUARD is to generate CTI through the correlation of open/external threat information. New detection algorithms will be developed that exploit the overall view of the distributed service and derive CTI from novel findings. Patterns of known attacks will be automatically recognised and, for unknown attacks, deviations from normal system behaviour that manifest as anomalies indicating new threats will be identified. The purpose of Data Models and APIs for Distributed Systems is to define open APIs for retrieving the security context of running services and for detecting the presence of, and notifying access to, private and sensitive data. GUARD’s open APIs are defined for retrieving security and privacy information and for checking the security properties of the execution environments, in order to develop a security-by-design system. Machine learning and other techniques for threat detection support automatic pattern recognition for known attacks, and the identification of deviations from normal system behaviour that manifest as anomalies indicating new threats and unknown attacks. User-Machine Interfaces, User Awareness and Involvement in Cybersecurity Practice aims to build end-user interfaces leveraging dynamic composition, security capabilities and the programmability of system components.
They will enable GUARD to take appropriate and timely action in response to the detection of attacks or other suspicious activity, and to restore the system to normal operation following an attack, through orchestration of hardware and software elements according to users’ policies. In addition, the GUARD dashboard in the user interface will graphically show the propagation of data along the business chain, notify about access requests not covered by current policies, and include options to remove data from given services.

3.2 Fast paths for packet processing

3.2.1 Programmable data planes for packet processing

The analysis of network traffic is an important aspect of attack detection and threat identification. It consists of inspecting packet headers, comparing their fields against given values, updating statistics, and sometimes modifying (or dropping) the packet. The main requirement is to perform all these operations without delaying packet transmission and reception and without overwhelming the inspecting device; this may be challenging particularly in case of large volumes of traffic, as typically happens for NFV services. Traffic analysis is indeed part of the more general packet processing performed in a communication network. Typical operations include filtering, classification, queueing, switching, routing, NAT, and deep packet inspection; they represent the data plane of network architectures, which performs the operations requested by the control and management planes. Packet processing in the data plane is usually based on simple logical structures (decision trees, finite state machines) that are easy to realize in hardware, which is the best option to effectively address performance issues. Performance is measured as the number of packets processed per unit of time, and is mostly stressed by high-speed links, small packet sizes, and large protocol states. Even recent software-defined paradigms make systematic use of hardware-based switching technologies (usually FPGA, or


Network Processors) for building fast data planes, while software applications are only developed for the control and management planes. However, the interest in Network Function Virtualization (NFV) technologies has fed the demand for efficient packet processing in software. This effort has resulted in several technologies for fast and programmable data planes, which look interesting for the project’s purposes. There are two main elements that hinder effective packet processing in software:
• limited performance (bus capacity, memory access) of commodity hardware (PC-like);
• frequent memory copies in the processing pipeline.
Hardware performance of modern general-purpose servers has largely improved in the last decade, with the introduction of high-bandwidth internal buses (e.g., PCI Express, PCI-X) and DMA. Memories are improving as well (e.g., DDR3/4), but unfortunately more in terms of bandwidth than in terms of access time, which has a huge impact on network processing, particularly when the data cannot be found in the CPU cache. Overall, hardware is today a secondary performance issue compared to software. This is especially true for virtualization hardware, which is designed to bear the traffic of multiple Virtual Machines (VMs). Performance limitations of software mostly come from the generality and complexity of modern operating systems, which are designed to support many different protocols (in many cases, just for backward compatibility with outdated solutions). As a matter of fact, typical operating systems are organized in multiple architectural tiers, which often results in long internal pipelines. The most relevant structure is certainly the dichotomy between kernel and user space. Processing packets in user space is far simpler and more secure than in kernel space, but it requires the packet to stay in user space most of the time, otherwise it suffers excessively from the overhead and latency introduced by kernel/user space context switching.
In addition, the network stack implementation within the kernel usually follows the common protocol layers (i.e., TCP/IP and ISO/OSI), so the pipeline length changes according to the specific layer where packets are handled. Real-time operating systems have bare structures that limit the internal overhead and enable more efficient processing; however, we argue that they cannot be used in virtualized environments, where multi-tenancy and strict resource isolation are inescapable requirements. The networking stack of standard operating systems is not suitable for processing large streams of data, so several alternative solutions have been designed to overcome this limitation. Before elaborating on these alternatives, it is worth recalling that data plane programmability entails two main aspects. The first aspect concerns the design of the processing pipeline, i.e., the type and sequence of operations to be performed on network packets. Lightweight processing usually requires pipelining inspection, modification, and processing operations according to the software and hardware architectures (batch processing, zero-copy, no context switches, and so on). The processing pipeline may be fully written in a high-level language, it may consist of a fixed yet programmable structure, or it may be defined according to specific low-level modelling languages. The second aspect concerns the specific rules to classify packets, update statistics, and perform actions. In some cases, this configuration is explicitly included in the implementation of the processing pipeline, but there are situations where external entities (i.e., the control plane) are expected to push the configuration through specific protocols. Since both aspects are relevant for programming inspection and monitoring operations, we consider available technologies for both the data and the control planes.
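The two programmability aspects just described (a fixed pipeline structure on one side, externally pushed classification rules on the other) can be sketched in a few lines of Python. The stage names, rule format and packet representation are illustrative assumptions, not any specific framework’s API.

```python
# Minimal sketch of the two programmability aspects: the pipeline
# structure (sequence of stages) is defined once, while classification
# rules (here, a blocklist) are configured separately, as a control
# plane would do. All names are illustrative.

class Pipeline:
    def __init__(self, *stages):
        self.stages = stages          # fixed pipeline structure

    def process(self, packet):
        for stage in self.stages:
            packet = stage(packet)
            if packet is None:        # a stage may drop the packet
                return None
        return packet

blocklist = {"10.0.0.9"}              # rule pushed by the "control plane"
stats = {"allowed": 0}                # statistics updated by the pipeline

def classify(pkt):                    # stage 1: tag the packet
    pkt["class"] = "blocked" if pkt["src"] in blocklist else "allowed"
    return pkt

def enforce(pkt):                     # stage 2: act on the tag
    return None if pkt["class"] == "blocked" else pkt

def count(pkt):                       # stage 3: update statistics
    stats["allowed"] += 1
    pkt["seen"] = stats["allowed"]
    return pkt

pipe = Pipeline(classify, enforce, count)
print(pipe.process({"src": "10.0.0.9"}))   # None (dropped)
print(pipe.process({"src": "10.0.0.7"}))   # passes: tagged 'allowed', counted
```

Real data planes organize the same separation differently (match/action tables, compiled programs, message buses), but the division of labour between pipeline and rules is the same.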


3.2.2 Technologies for building fast data planes

Table 1 compares the main technologies for developing fast data planes that may be of interest for GUARD. They are grouped according to the overall approach and scope. For the sake of brevity, we only elaborate on their main concepts and applicability.

3.2.2.1 Kernel hooks

A first group is represented by kernel hooks that give userland applications direct access to the networking hardware or the internal kernel networking stack; they are usually designed to provide direct access to low-level drivers or network interface cards, hence bypassing the kernel stack in part or completely. These hooks are not network data planes per se, but can be used to implement custom processing pipelines. Kernel hooks deliberately provide just basic input/output functions, because they can be used for different applications (monitoring, forwarding, replication). In GUARD, they may be used for monitoring purposes, assuming traffic replication is available; however, the implementation of enforcement actions requires bypassing the kernel stack and re-writing a lot of code to manage network protocols. Indeed, the main benefit of these frameworks, namely acceleration, partially vanishes in virtual environments, where access to the real hardware is usually mediated by the hypervisor (except when SR-IOV2 is used). This means that DMA and similar features will not be available, even if performance still benefits from zero-copy and lightweight protocol operations.

• Raw socket (network socket; Active). Programmability: –. Main features: direct access to packets (no kernel networking stack); packet filtering (Berkeley Packet Filter, BPF).
• PF_RING (network socket; Active). Programmability: BPF-like packet filters. Main features: packet filtering in kernel space with direct access from user space; Dynamic Network Interface Card (NIC) Access; zero-copy support; hardware filtering in the NIC where available.
• Netmap (raw packet I/O; Active). Programmability: –. Main features: direct access to packets (no kernel networking stack); memory-mapped region with pre-allocated packet buffers.
• OpenOnload (acceleration middleware; Active). Programmability: –. Main features: hybrid kernel/user-space stack; dual kernel/user-space operation; memory-mapped kernel region.

Table 1. Main technologies for kernel hooks.

3.2.2.2 Data planes in user space

The second group consists of full data planes implemented in user space; they are generally conceived as lightweight and high-performance replacements for the standard networking stack, whose poor performance is often due to the large number of heterogeneous protocols supported.

2 Single Root Input/Output Virtualization (SR-IOV) is a specification that allows the isolation of PCI Express resources for manageability and performance reasons. A single physical PCI Express resource can be shared in a virtual environment using the SR-IOV specification. SR-IOV offers different virtual functions to different virtual components (e.g., network adapters) on a physical server machine.


• DPDK (development framework; Active). Programmability: C/C++ libraries. Main features: zero-copy packet access; ring buffers; memory management; Environment Abstraction Layer.
• VPP (fast data plane; Active). Programmability: memory-mapped message bus; bindings for C, Java, Python, Lua. Main features: L2 switch; IP router; MPLS; tunnelling; packet telemetry; filtering and classification; lawful interception.
• ODP (APIs for hardware acceleration; Active). Programmability: –. Main features: reference implementation for Linux (odp-linux).
• BESS (NFV dataplane; Active). Programmability: Python/C++ script/CLI. Main features: packet processing pipelines; kernel bypass or legacy operation.
• SNABB (NFV dataplane; Active). Programmability: Lua/C. Main features: kernel bypass; service graph design; high-speed communication with VMs in user space; SIMD offloading; just-in-time compiler.
• OpenState (SDN dataplane; Inactive). Programmability: OpenFlow extensions. Main features: stateful packet processing; based on the OpenFlow Software Switch.

Table 2. Main technologies for data planes in user space.

Several frameworks are available for creating data planes in user space. They often leverage the previously described kernel hooks to provide fast access to raw packet I/O, while hiding the heterogeneity and complexity of the hardware. These frameworks are specifically conceived to build networking functions, bypassing the standard kernel networking stacks. Therefore, they provide a set of built-in functions for queueing, pipelining, forwarding, encrypting/decrypting, and timestamping operations on network packets. In some cases, they explicitly provide support for the connection of Network Functions. The programming abstraction is often limited to defining the packet pipelines in terms of processing modules; there are standard modules, but additional ones can also be developed by users. Applications can be specifically developed to directly connect to these frameworks via specific libraries; legacy applications may also connect through the standard networking stacks. However, performance benefits are only evident in the first case, where all the duplication and inefficiency of kernel implementations are totally bypassed. Data planes in user space are a very effective solution to build software-based implementations of network devices, or to create dedicated environments for network function virtualization and service function chaining. The applicability of these frameworks in GUARD might not be straightforward: the innovative cyber-security concept of GUARD, focused on the service rather than on the underlying infrastructure, does not require full networking capabilities in all VMs. Moreover, the impossibility of modifying the hypervisor software layer, which is common in virtualized environments (e.g., public datacenters), prevents the usage of the above technologies.
Finally, (re-)engineering applications to use specific networking libraries represents an important hindrance to applicability in the largest number of contexts. OpenState and ODP are specifically conceived to enhance the programmability of the hardware, hence they do not fit virtual machines well. Compared to OpenFlow, they allow the creation of state machines, which are far more powerful than stateless configuration. However, the processing pipeline is fixed, unlike P4, where programmers have the possibility to declare and use their own memory or registers. On the other hand, the flexibility of P4 leverages more advanced hardware technology, namely dedicated

processing architectures or reconfigurable match tables as an extension of Ternary Content-Addressable Memories (TCAMs), even if P4 programs may also be compiled for eBPF in the Linux kernel.

3.2.2.3 Data planes in kernel space

The third group is represented by built-in kernel filtering and processing capabilities, which are essentially limited to the restricted family of BPF programs. The BPF framework provides an in-kernel virtual machine to run user programs in a controlled environment. The latest extended version gives access to a broad range of system-level resources (network sockets, traffic schedulers, system calls, kernel functions), without the need to deploy additional processing frameworks. BPF implements a very effective monitoring solution: its directed acyclic Control Flow Graph (CFG) was shown to be 3 to 20 times faster than the Boolean expression tree used by the filtering module of the Data Link Provider Interface (DLPI), depending on the complexity of the filter. Another difference is that BPF always makes the filtering decision before copying the packet, so as not to copy packets that the filter will discard. Interestingly, there have been attempts to use extended BPF (eBPF) programs to write extensible datapaths, for example for Open vSwitch [37]. We point out that any inspection/enforcement solution based on eBPF implicitly makes use of the standard kernel stack, which may be less efficient than existing data paths in user space, depending on the operating conditions. However, the main advantage of eBPF is the capability to run on an unmodified software stack, encompassing existing Linux kernels, which is a rather common scenario also in virtualized environments (and hypervisors).
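The CFG filtering model mentioned above can be illustrated with a toy example: the filter is a small directed acyclic graph of predicates, and the accept/reject decision is taken before any copy of the packet is made. The graph below (roughly "IPv4 and TCP port 22") and the packet field names are illustrative; this is a Python simulation, not real BPF bytecode.

```python
# Toy simulation of a BPF-style filter as a directed acyclic graph of
# predicates. The decision is reached by walking the graph; only
# accepted packets would then be copied to user space.

def run_filter(cfg, start, packet):
    node = start
    while node not in ("ACCEPT", "REJECT"):
        predicate, if_true, if_false = cfg[node]
        node = if_true if predicate(packet) else if_false
    return node == "ACCEPT"

# Each node: (predicate, next-node-if-true, next-node-if-false).
cfg = {
    "is_ip":  (lambda p: p.get("ethertype") == 0x0800, "is_tcp", "REJECT"),
    "is_tcp": (lambda p: p.get("proto") == 6,          "port22", "REJECT"),
    "port22": (lambda p: 22 in (p.get("sport"), p.get("dport")),
               "ACCEPT", "REJECT"),
}

ssh = {"ethertype": 0x0800, "proto": 6, "sport": 51000, "dport": 22}
dns = {"ethertype": 0x0800, "proto": 17, "dport": 53}

print(run_filter(cfg, "is_ip", ssh))  # True  -> would be copied to user space
print(run_filter(cfg, "is_ip", dns))  # False -> discarded without copying
```

Because every branch either advances towards a decision or terminates, the graph is acyclic and each packet is inspected at most once per field, which is the structural reason behind the speed-up over Boolean expression trees.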

• eBPF (general in-kernel event processing; Active). Programmability: C (data plane); C++, Python, Lua (control plane). Main features: direct access to packets; integration with Linux networking; possibility to intercept any system call (hence not limited to networking tasks).
• Open vSwitch (software dataplane; Active). Programmability: OpenFlow/OVSDB. Main features: support for hardware- or software-based datapaths; fast and slow paths.

Table 3. Main technologies for data planes in kernel space.

3.2.2.4 Fully programmable data planes

The fourth group considers actual programming languages for packet processing. In this case, the pipeline structure is defined by the program, hence it is conceptually possible to carry out any kind of operation on any packet header, provided this is consistent with performance requirements.

Fully programmable data planes are unlikely to be suitable for software-based implementations. Indeed, they are mainly conceived for hardware targets, to overcome the traditional difficulty of low-level languages (e.g., VHDL). There is only one technology available in this field, namely P4, which is mainly conceived for re-programming the hardware. P4 programs can also be compiled for eBPF, so they could be used to build kernel data paths for VMs.

• P4 (programmable hardware data plane; Active). Programmability: P4. Main features: high-level language to access hardware features.

Table 4. Main technologies for fully programmable data planes.


It is hard at this stage to say whether a P4 program might be more convenient than a native eBPF program. In general, we point out that P4 can be used to describe protocol headers, and this may be useful when writing eBPF programs. Writing eBPF programs in P4 could be useful in case the same program is used for both software and hardware targets (e.g., in NFV environments).

3.2.2.5 Programming abstractions

Beyond processing technologies, it is also important to understand what abstractions are available to program or, in some cases, just configure the data plane. Programming abstractions mark the boundary between the data and control planes, and basically define what kind of computation can be performed locally and what needs to be done centrally. Despite the large number of acceleration technologies and frameworks for the data plane, the number of programming abstractions is rather limited. Usually, these technologies leverage the concept of Software Defined Networking (SDN) to steer the pipeline through the configuration of matching rules and processing actions.

• OpenFlow (SDN / hardware data plane; Active). Language: OpenFlow. Main features: flexible packet processing; available on (selected) physical network devices.
• OpenState / Open Packet Processor (OPP) (SDN / hardware data plane; Inactive). Language: OpenFlow extensions. Main features: OpenFlow extensions for stateful packet processing.
• NETCONF / RESTCONF (SDN; Active). Language: YANG. Main features: flexible abstraction for network devices and services, overcoming the limitations of SNMP; template-based description.

Table 5. Comparison of control and configuration protocols.

We briefly compare two standard protocols for SDN and one interesting proposed extension in Table 5. While programmable data planes can be used to attach specific hooks to relevant security events (including packets and system calls) and to define the processing pipeline, control protocols can be used to push matching rules and specific actions (drop, block, allow, alter, etc.) into the pipeline. OpenFlow pushes matching rules and processing/forwarding actions into flow tables. It expects a specific structure for the data plane, based on multiple tables and fixed processing pipelines. OpenFlow supports a limited (yet rather representative) set of protocol headers; vendor-specific extensions are possible, but at the cost of reduced interoperability. The main limitation of the OpenFlow abstraction is stateless operation, which prevents the management of various common network protocols (and security inspection operations too). OpenState extends OpenFlow with the abstraction of an extended finite state machine, which has been used for DDoS protection in SDN. Unfortunately, this project has never evolved to a more mature state, including the implementation of an efficient software dataplane. RESTCONF is more flexible, since it does not assume any specific structure for the underlying dataplane. Indeed, the template-based approach of NETCONF allows writing YANG models for very different devices, although the counterpart of this flexibility is some additional procedural complexity, due to the need to exchange and agree on the specific models.
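The value of the extended-finite-state-machine abstraction over stateless OpenFlow matching can be illustrated with the port-knocking example often used in the OpenState literature: a host must "knock" on a secret sequence of ports before traffic to the service port is forwarded. A stateless flow table cannot express this behaviour; the sketch below simulates a per-host state machine in Python, with an illustrative knock sequence and service port.

```python
# Simplified simulation of stateful (OpenState-like) packet handling:
# port knocking. A per-source state records progress through a secret
# knock sequence; only after the full sequence is traffic to the
# service port forwarded. Sequence and port numbers are illustrative.

KNOCK_SEQUENCE = (1111, 2222)   # secret sequence (illustrative)
SERVICE_PORT = 22

state = {}                       # per-source-host knock progress

def handle(src, dport):
    progress = state.get(src, 0)
    if dport == SERVICE_PORT:
        return "FORWARD" if progress == len(KNOCK_SEQUENCE) else "DROP"
    if progress < len(KNOCK_SEQUENCE) and dport == KNOCK_SEQUENCE[progress]:
        state[src] = progress + 1  # advance the state machine
    else:
        state[src] = 0             # wrong knock: reset
    return "DROP"

print(handle("h1", 22))    # DROP (no knocks yet)
handle("h1", 1111)
handle("h1", 2222)
print(handle("h1", 22))    # FORWARD (sequence completed)
```

In OpenState the equivalent behaviour is expressed with state tables and state-transition rules installed once by the controller, so the switch handles every packet locally, without a round-trip to the control plane.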


3.3 Security Information

3.3.1 Scientific literature

Cyber Threat Intelligence (CTI) deals with all kinds of information that are in some way related to cyber threats. However, CTI is not just any data; there must be an intrinsic value to the data that makes it intelligence. There have been several attempts to describe intelligence in the context of cyber security more accurately. McMillan [38] defines CTI as evidence-based knowledge that includes context, mechanisms, indicators, implications, and actionable advice. Chismon and Ruks [39], on the other hand, describe CTI through the process of detecting and understanding previously unknown threats in order to mitigate the risks they pose. In general, any security-related data should be processed and enriched in order to provide the necessary validity and depth to be regarded as CTI. There are multiple types of potentially useful information obtained from various data sources for security defence purposes. Every type has its own characteristics regarding the objective (e.g., to facilitate detection, to support prosecution, etc.), applicability, criticality, and “shareability” (i.e., the effort required to make an artefact shareable because of the steps required to extract, validate, formulate and anonymize a piece of information). Depending on their content, cyber threat information can be grouped into the categories defined by NIST [40] and OASIS [41], as described below.

Indicators are technical artefacts or observables that suggest an attack is imminent or is currently underway, or that a compromise may have already occurred. Examples are IP addresses, domain names, file names and sizes, process names, hashes of file contents and process memory dumps, service names, and altered configuration parameters.

Tactics, Techniques, and Procedures (TTPs) characterize the behaviour of an actor. A tactic is the highest-level description of this behaviour, while techniques give a more detailed description of behaviour in the context of a tactic, and procedures an even lower-level, highly detailed description in the context of a technique. Some typical examples include the usage of spear phishing emails, social engineering techniques, websites for drive-by attacks, exploitation of operating system and/or application vulnerabilities, the intentional distribution of manipulated USB sticks, and various obfuscation techniques. From these TTPs, organizations can learn how malicious attackers work and derive higher-level, generally valid detection and remediation techniques, in contrast to rather specific measures based on individual, often only temporarily valid, indicators.

Threat Actors comprise information regarding an individual or a group posing a threat. For example, the information may include affiliation (such as a hacker collective or a nation state's secret service), identity, motivation, relationships to other threat actors and even capabilities (via links to TTPs). This information is used to better understand why a system might be attacked and to work out more targeted and effective countermeasures. Furthermore, this type of information can be applied to collect evidence of an attack to be used in court.

Vulnerabilities are software flaws that can be used by a threat actor to gain access to a system or network. Vulnerability information may include the potential impact, technical details, exploitability and availability of an exploit, affected systems, platforms and versions, as well as mitigation strategies. A common schema to rate the seriousness of a vulnerability is the Common Vulnerability Scoring System (CVSS) [42], which considers the enumerated details to derive a comparable metric. There are numerous web platforms that maintain lists of vulnerabilities, such as the Common Vulnerabilities and Exposures (CVE) database from MITRE [43] and the National Vulnerability Database (NVD) [44]. It is important to note that the impact of vulnerabilities usually needs to be interpreted for each organization (and even each system) individually, depending on the criticality of the affected systems for the main business processes.

Cybersecurity Best Practices include commonly used cyber security methods that have demonstrated effectiveness in addressing classes of cyber threats. Some examples are response actions (e.g., patches, configuration changes), recovery operations, detection strategies and protective measures. National authorities, CERTs and large industries frequently publish best practices to help organizations build up an effective cyber defence and rely on proven plans and measures.

Courses of Action (CoA) are recommended actions that help to reduce the impact of a threat. In contrast to best practices, CoAs are very specific and shaped to a particular cyber issue. CoAs usually span the whole incident response cycle: detection (e.g., add or modify an IDS signature), containment (e.g., block network traffic to a command and control server), recovery (e.g., restore a base system image), and protection from similar events in the future (e.g., implement multi-factor authentication).

Tools and Analysis Techniques are closely related to best practices, but focus more on tools than on procedures. Within a community, it is desirable to align the tools used with each other to increase compatibility, which makes it easier to import/export certain types of data (e.g., IDS rules). Usually there are sets of recommended tools (e.g., log extraction/parsing/analysis, editors), useful tool configurations (e.g., capture filters for network protocol analysers), signatures (e.g., custom or tuned signatures), extensions (e.g., connectors or modules), code (e.g., algorithms, analysis libraries) and visualization techniques.

Interpreting, contextualizing and correlating information from different categories allows the comprehension of the current security situation and therefore provides threat intelligence [45].

Actionability in the context of threat intelligence usually refers to whether the threat information at hand can be utilized without the need for further analyses. However, exact requirements for actionable threat intelligence depend on the application area. For example, actionability could refer to the specificity of the information and its ability to enable decision-making [46], relate to the timeliness and availability of CTI [47], deal with its representation in standardized formats [48], or relate to threat actor understanding, selection of countermeasures as well as detection, mitigation, and prevention of attacks [49].

3.3.2 Research Initiatives

Standards

MITRE Standards

Structured Threat Information eXpression (STIX) [50] is a standardized language for structured cyber threat information representation. The STIX language aims at providing comprehensive cyber threat information as well as flexible mechanisms for addressing such information in a wide range of use cases. STIX's architecture comprises a large set of cyber threat information classes, including indicators, incidents, adversary tactics, techniques and procedures, exploit targets, courses of action, cyber-attack campaigns, and cyber threat actors.
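As an illustration, while STIX 1.x used an XML serialization, STIX 2.x moved to JSON; an indicator object might look as follows (identifiers and values are invented for the example):

```json
{
  "type": "indicator",
  "spec_version": "2.1",
  "id": "indicator--d81f86b9-975b-4c0b-875e-810c5ad45a4f",
  "created": "2019-10-01T12:00:00.000Z",
  "modified": "2019-10-01T12:00:00.000Z",
  "name": "Suspected C2 server address",
  "indicator_types": ["malicious-activity"],
  "pattern": "[ipv4-addr:value = '198.51.100.23']",
  "pattern_type": "stix",
  "valid_from": "2019-10-01T12:00:00Z"
}
```

The `pattern` field expresses the observable condition (here, traffic towards a given IPv4 address) that, when matched, signals the threat the indicator describes.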

Existing structured languages, such as Cyber Observable eXpression (CybOX), Malware Attribute Enumeration and Characterization (MAEC), and Common Attack Pattern Enumeration and Classification (CAPEC), can be leveraged to provide an aggregate solution for any single use case. Furthermore, numerous flexibility mechanisms are designed into the language, so that portions of the available features are independently usable, accounting for the relevance of a specific use case.


Trusted Automated eXchange of Indicator Information (TAXII) [51] defines a set of services and message exchange mechanisms for the detection, prevention, mitigation and sharing of cyber threat information across organization and service boundaries. It allows organizations to achieve improved situational awareness about emerging threats and enables them to share subsets of information with a selected list of partners. TAXII is the preferred method to securely and automatically exchange information represented in the STIX language. TAXII use cases include public alerts or warnings, private alerts and reports, push and pull content dissemination, and the set-up and management of data sharing between parties. It uses a modular design that can accommodate a wide array of optional sharing models. Sharing models supported by TAXII include (but are not limited to):

• Source-Subscriber: A single entity publishes information for a group of consumers.

• Peer-to-Peer: A group of data producers and data consumers establish direct relationships with each other. All sharing exchanges are between individuals.

• Hub-and-Spoke: A group of data producers and consumers share information with each other. The information is sent to a central hub, which then handles dissemination to all the other spokes as appropriate.

• Push or Pull Sharing: Data consumers are automatically provided with new data (push), or the consumer can request updates at times of their choosing (pull).
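As an illustration of the pull model, a consumer might retrieve the objects of a TAXII 2.x collection with a plain HTTPS request of the following shape (server name, API root, collection identifier and exact media type are illustrative and version-dependent):

```http
GET /api1/collections/91a7b528-80eb-42ed-a74d-c6fbd5a26116/objects/ HTTP/1.1
Host: cti.example.org
Accept: application/taxii+json;version=2.1
Authorization: Bearer <access-token>
```

The response body would then carry a bundle of STIX objects, which the consumer can feed directly into its own detection or sharing tooling.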

IETF Standards

The Managed Incident Lightweight Exchange (MILE) IETF Working Group defined two main standards for describing (IODEF) and exchanging (RID) incident information. Although the current implementations of IODEF and RID are mostly limited to the technical description and local exchange of IoCs, the standards are designed to allow large-scale sharing of complex incidents.

The Incident Object Description Exchange Format (IODEF) specification, described in RFC 5070 [52], provides an XML representation for conveying incident information across administrative domains. The data model comprises information about hosts, networks, services running on the systems, attack methodology and associated forensic evidence, the impact of the activity, and approaches for documenting the workflow.
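An abridged, purely illustrative IODEF incident report could look as follows (identifiers, times and names are invented):

```xml
<IODEF-Document version="1.00" lang="en"
                xmlns="urn:ietf:params:xml:ns:iodef-1.0">
  <Incident purpose="reporting">
    <IncidentID name="csirt.example.org">189493</IncidentID>
    <ReportTime>2019-10-01T12:00:00+00:00</ReportTime>
    <Assessment>
      <Impact severity="medium" type="recon"/>
    </Assessment>
    <Contact role="creator" type="organization">
      <ContactName>Example CSIRT</ContactName>
    </Contact>
  </Incident>
</IODEF-Document>
```

A real report would typically also carry EventData elements with the affected systems and observables; the skeleton above only shows the mandatory framing of an incident.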

The Real-time Inter-Network Defence (RID) protocol, described in RFC 6545 [53], was designed to transport IODEF cyber security information. RID is flexible enough to exchange other schemas or data models, either embedded in IODEF or independent of it, with a transport binding using HTTP/TLS. RID is preferred for peer-to-peer models with higher levels of security and privacy.

CTI management

CÆSAIR [54] (Collaborative Analysis Engine for Situational Awareness and Incident Response) is an AIT research prototype for the comprehensive analysis of cyber threat intelligence.


[Figure: CÆSAIR workflow — IODEF incident reports, STIX threat documents and OSINT feeds (JSON/REST) are imported; features are extracted and documents mapped and linked against a knowledge base, with results presented in a graphical interface and refined through user feedback.]

Figure 18. CÆSAIR workflow.

CÆSAIR is a cyber threat intelligence solution designed to provide analytical support for security experts carrying out IT incident handling tasks at a local, national or international level. Thanks to its correlation capability [55], CÆSAIR provides analysts with the necessary support to handle reported incident information. It aggregates and examines intelligence acquired from numerous Open Source INTelligence (OSINT) feeds, as depicted in Figure 18; it quickly identifies related threats and existing mitigation procedures; and it helps establish cyber situational awareness by keeping track of security incidents and threats affecting the monitored infrastructures over time.

CÆSAIR reduces incident handling time: from a multitude of imported security documents, it identifies those most relevant to a given one. It offers a reliable basis for decision making: CÆSAIR explains how documents or events are connected to one another, and it allows the analyst to select the most appropriate correlation method and to flexibly adjust relevance metrics. Moreover, CÆSAIR answers strategic questions on the threat landscape, such as which software products have been targeted recently, which attack patterns the infrastructure is most vulnerable to, which vendors fix vulnerabilities faster, and more. CÆSAIR supports customizable import sources: it acquires the organization's internal incident reports and a multitude of OSINT feeds. It interfaces with existing security solutions by supporting widely adopted CTI standards: IODEF, STIX, TAXII, etc.

The main application scenarios for CAESAIR include:

• The identification of implicit relations between documents of different types. Auto-tagging of documents allows their class(es) to be identified based on text content analysis; classes such as “vulnerability”, “exploit”, “generic attack description”, “technical attack description”, “patch/fix”, “update”, “IoC” and “course of action” can be identified. Relations between documents from different classes can then be discovered: a vulnerability and its corresponding exploits, an exploit and the respective attack description, an attack and an advisory describing its possible mitigation.

• The assistance in creation and distribution of advisories. CÆSAIR provides suggestions for generating warnings / advisories about:

o vulnerable software/hardware products (based on the current threat landscape and vulnerability descriptions),
o potential countermeasures for a threat, found as related documents with the tag “patch/fix” or “course of action”,
o recipients of the warning (based on asset information provided by end users).

Warnings/advisories are sent out to the recipient list, or made available on demand (including the historic data).

• Trend analysis. CÆSAIR keeps track of the evolution of the IT security landscape by observing how the vulnerability of a software/hardware product changes over time, how timely a software vendor releases a fix after an exploit is disclosed, which products on the market are most exposed to security threats, as well as what the top N non-trivial, frequently co-occurring concepts in CTI are.

• Interaction with existing solutions for threat and incident handling. CÆSAIR's analytical functionality can be accessed through a user-friendly graphical interface, as well as via APIs. This means that CÆSAIR can either be deployed as a full-fledged standalone installation, or run “as a service” on data collected from third-party solutions, such as threat sharing or incident handling solutions, and/or direct its output to such solutions. This allows the integration of CÆSAIR with open-source (such as IntelMQ and MISP) or commercial products.

3.3.3 Current market solutions

CTI management tools

MISP [56] is a threat intelligence platform for sharing, storing and correlating Indicators of Compromise of targeted attacks, threat intelligence, financial fraud information, vulnerability information or even counter-terrorism information.

MISP includes an efficient IoC and indicator database for storing technical and non-technical information about malware samples, incidents, attackers and intelligence. It enables automatic correlation, finding relationships between attributes and indicators from malware, attack campaigns or analyses. It provides a flexible data model where complex objects can be expressed and linked together to represent threat intelligence, incidents or connected elements.
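For illustration, the core of a MISP event in its JSON representation follows the pattern below (the attribute values are invented, and the exact set of fields depends on the MISP version):

```json
{
  "Event": {
    "info": "Phishing campaign against finance staff",
    "threat_level_id": "2",
    "analysis": "1",
    "distribution": "1",
    "Attribute": [
      {
        "type": "ip-dst",
        "category": "Network activity",
        "to_ids": true,
        "value": "198.51.100.23"
      },
      {
        "type": "url",
        "category": "Payload delivery",
        "to_ids": true,
        "value": "http://malicious.example.org/invoice.doc"
      }
    ]
  }
}
```

The `to_ids` flag marks an attribute as suitable for automatic export to detection systems, which is what ties the data model to the IDS output formats described below.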

MISP's built-in sharing functionality eases data sharing using different distribution models. MISP can automatically synchronize events and attributes among different MISP instances. Advanced filtering functionalities can be used to meet each organization's sharing policy, including a flexible sharing-group capability and an attribute-level distribution mechanism.

A user interface allows analysts to create, update and collaborate on events and attributes/indicators, while a graphical interface allows seamless navigation between events and their correlations. Advanced filtering functionalities and warning lists help the analysts contribute events and attributes.

MISP can generate IDS rules (Suricata, Snort and Bro are supported by default), OpenIOC, plain text, CSV, MISP XML or JSON output to integrate with other systems (network IDS, host IDS, custom tools).

Flexible APIs allow MISP to be integrated with other solutions. MISP is bundled with PyMISP, a flexible Python library to fetch, add or update event attributes, handle malware samples or search for attributes.


Sighting support allows MISP to collect observations from organizations concerning shared indicators and attributes. Sightings can be contributed via the MISP user interface, or via the API as MISP documents or STIX sighting documents.

Limitations and competitors

Other similar tools are IntelMQ [57], CRITS [58], Taranis [59] and MassiveIntel [60]. In contrast to MISP, IntelMQ and CRITS provide automated data enrichment and extend existing events with, e.g., lookup information such as DNS reverse records, geographic location information (country code) or abuse contacts for an IP address or domain name. Furthermore, Taranis and MassiveIntel allow threat detection and analytics. In contrast to all the other solutions, MassiveIntel provides Darknet monitoring.

3.4 APIs and Data Models

3.4.1 APIs

An Application Programming Interface (API) is a set of subroutine definitions, communication protocols, and tools for building software. In general terms, it is a set of clearly defined methods of communication among various components [61].

APIs are needed for communication between decoupled services in order to build applications. The use of APIs between services communicating over the network is quite common nowadays.

• Strengths
o They ease the development of applications.
o They abstract the implementation, exposing only the needed objects or data.
• Weaknesses
o APIs are common attack vectors in many ways.
• Opportunities
o Finding ways to detect, prevent and remediate attacks.
o Defining how to secure APIs in terms of data sovereignty.
• Threats
o API deprecations and breaking of backwards compatibility.
o APIs can expose too much data.

3.4.2 Common API types

There are several commonly accepted and used ways of creating the APIs exposed by different services for consumption. Sometimes these are not products in themselves, but commonly used patterns or software architectures.

3.4.2.1 REST API

Representational State Transfer (REST) is an architectural style for designing loosely coupled web services. It is mainly used to develop lightweight, fast, scalable and easy-to-maintain web services, which often use HTTP as the means of communication [62].

REST is currently the most widespread and accepted approach for microservice-based development. Many large companies rely on REST APIs to implement professional services on top of their software; Yahoo, Facebook, Twitter and Amazon are some examples of companies providing REST APIs for their systems.


REST was initially described by Roy Thomas Fielding in his doctoral dissertation in 2000 [63].

As a network-based architecture, GUARD will likely take advantage, at some point, of this most widespread practice in the design of network-based service software architectures.

Another valuable mechanism is REST Hooks, where clients subscribe to events on a REST server and are notified through callbacks when information becomes available, so that data can be pushed as soon as it is ready.

• Strengths
o Scalability through layered services.
o Client notification listeners.
o Uniform interface to access all the services.
o Language and interface agnostic.
• Weaknesses
o Security concerns.
o One-to-one communication.
o Synchronous operation, so long delays can occur.
• Opportunities
o GUARD can help define how to harden REST APIs in order to keep data sovereignty safe across all the APIs.
o GUARD can help define best practices for protecting REST APIs.
• Threats
o Exposing information through REST APIs can be dangerous and might disclose too much information, violating the “data sovereignty” principle.
o REST APIs can be attacked, so GUARD should demonstrate its own capabilities by protecting itself on both the client and server sides.
o Poor API documentation.
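To make the resource-oriented, uniform-interface idea concrete, the following self-contained Python sketch dispatches HTTP-like (method, path) pairs to handlers over an invented /tasks resource. It is purely illustrative; a real service would sit behind an HTTP server and TLS:

```python
# Minimal sketch of REST-style uniform-interface dispatch.
# The /tasks resource, its fields and status codes are illustrative.

tasks = {}        # in-memory resource store: id -> representation
next_id = [1]     # mutable counter for newly created resources

def handle(method, path, body=None):
    """Dispatch an HTTP-like (method, path) pair to a resource handler."""
    if method == "POST" and path == "/tasks":
        tid = next_id[0]
        next_id[0] += 1
        tasks[tid] = body
        return 201, {"id": tid, **body}          # 201 Created
    if method == "GET" and path.startswith("/tasks/"):
        tid = int(path.rsplit("/", 1)[1])
        if tid in tasks:
            return 200, tasks[tid]               # 200 OK
        return 404, {"error": "not found"}       # 404 Not Found
    return 405, {"error": "method not allowed"}  # 405 Method Not Allowed

status, rep = handle("POST", "/tasks", {"title": "review D2.1"})
print(status)            # 201
status, rep = handle("GET", "/tasks/1")
print(rep["title"])      # review D2.1
```

The same (method, path) vocabulary applies uniformly to every resource, which is what keeps REST clients and servers loosely coupled.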

3.4.2.2 Publish/Subscribe

In this paradigm, subscribers register their interest in an event, or a pattern of events, and are subsequently asynchronously notified of events generated by publishers. This paradigm fits perfectly in cases where a data source has multiple targets [64].

• Strengths
o Scalability through layered services.
o Many-to-many communication.
o Decoupling of consumers and producers.
o Time decoupling.
o Synchronization decoupling.
• Weaknesses
o Preserving confidentiality.
o Repudiation problems.
o Illegitimate data modifications.
• Opportunities
o GUARD can help define how to harden publish/subscribe along several dimensions: publishing and subscribing should only be possible for parties with the permissions to do so, and data should be kept safe with the data sovereignty principle in mind.
o GUARD can help define best practices for implementing publish/subscribe-based services.
• Threats
o Several security threats arising from the weaknesses of the model.

As publish/subscribe is a commonly used pattern, there are tools implementing this paradigm with different protocols. Well-known protocols are MQTT, AMQP, ZMTP and Kafka's wire protocol. These protocols are implemented by widely used and well-proven solutions such as Mosquitto, RabbitMQ, ZeroMQ and Kafka.
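The decoupling of producers and consumers can be sketched with a minimal in-process broker in Python; real systems would of course use one of the brokers listed above over MQTT or AMQP, and the topic name below is invented:

```python
from collections import defaultdict

class Broker:
    """Toy in-process publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subs = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, message):
        # Producers never address consumers directly: the broker fans
        # the message out to every registered subscriber (many-to-many).
        for cb in self._subs[topic]:
            cb(message)

broker = Broker()
received = []
broker.subscribe("alerts/ids", received.append)
broker.subscribe("alerts/ids", lambda m: received.append(m.upper()))
broker.publish("alerts/ids", "possible port scan")
print(received)   # ['possible port scan', 'POSSIBLE PORT SCAN']
```

The publisher needs no knowledge of who, or how many, the subscribers are, which is exactly the time and synchronization decoupling listed among the strengths above.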

The FIWARE Orion Context Broker allows the entire lifecycle of context information to be managed, including updates, queries, registrations and subscriptions. It is an NGSIv2 server implementation to manage context information and its availability [65]. Using the Orion Context Broker, context elements can be created and managed through updates and queries. In addition, one can subscribe to context information, so that when some condition occurs (e.g., the context elements have changed) a notification is received [66]. Strictly speaking, however, the Context Broker is not a publish/subscribe tool but a context management tool.
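As an illustration, a subscription registered with a POST to Orion's /v2/subscriptions endpoint might look as follows (the entity type, attributes and callback URL are invented for the example):

```json
{
  "description": "Notify when the status of a monitored host changes",
  "subject": {
    "entities": [{"idPattern": ".*", "type": "Host"}],
    "condition": {"attrs": ["status"]}
  },
  "notification": {
    "http": {"url": "http://consumer.example.org/notify"},
    "attrs": ["status", "lastSeen"]
  },
  "expires": "2020-01-01T00:00:00.00Z"
}
```

Whenever the `status` attribute of a matching entity changes, Orion POSTs the selected attributes to the consumer's callback URL until the subscription expires.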

3.4.2.3 Polling

Polling basically means constantly querying the same URL in search of new data. It is resource-consuming: much effort is wasted trying to get data, since most polls are empty (there is no new data).

It is typically automated in the background of a web application: instead of the user manually refreshing (reloading a web page) to get new data, the application requests new data from a server. If there is new data, it is sent back; if there is no new data, nothing is sent back.

However, polling is used less and less due to its waste of resources; other mechanisms, such as REST Hooks or WebSockets, are used instead.
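The wasted effort is easy to see in a self-contained sketch, where the (simulated) endpoint usually has nothing new to return; `fetch_updates()` stands in for an HTTP GET and is invented for the example:

```python
import time

# Simulated server-side queue: most polls will find no new data.
_queue = ["event-1", None, None, "event-2", None]

def fetch_updates():
    """Stand-in for an HTTP GET against a fixed URL: data or None."""
    return _queue.pop(0) if _queue else None

def poll(interval=0.0, max_polls=5):
    results, wasted = [], 0
    for _ in range(max_polls):
        data = fetch_updates()
        if data is None:
            wasted += 1          # empty poll: resources spent for nothing
        else:
            results.append(data)
        time.sleep(interval)     # fixed polling period
    return results, wasted

events, empty = poll()
print(events, empty)   # ['event-1', 'event-2'] 3
```

Three of the five polls return nothing: this ratio of empty to useful requests is precisely what push-based alternatives such as REST Hooks or WebSockets avoid.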

3.4.2.4 GraphQL

GraphQL is an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data. It provides an efficient, powerful and flexible approach to developing web APIs, and has been compared and contrasted with REST and other web service architectures [67]. The GraphQL specification should be consulted for details [68].

Note that GraphQL differs from REST in nature: GraphQL is a technology, while REST is an architectural style.

• Strengths
o Good for querying data over links with limited bandwidth and speed.
o Potential for performance improvements.
• Weaknesses
o Performance improvements require careful optimization of the GraphQL backend.
o Caching is hard or impossible.
• Opportunities
o GUARD can help define best practices on GraphQL and its security issues.
• Threats
o Malicious queries can crash GraphQL servers.


o Too much information exposure.
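For illustration, a GraphQL client asks for exactly the fields it needs in a single request; the schema below (an `incident` type with nested `indicators`) is invented for the example:

```graphql
query {
  incident(id: "189493") {
    title
    severity
    indicators {
      type
      value
    }
  }
}
```

This flexibility cuts both ways: it avoids the over-fetching typical of fixed REST representations, but without depth and complexity limits an attacker can craft deeply nested queries of this kind to exhaust server resources, which is the crash threat noted above.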

3.4.2.5 Data Push

Push APIs give applications the ability to receive messages from a server, allowing asynchronous messages to be sent to applications. A client registers with a server, and the server sends notifications to the client regarding a topic.

Data Push can be easily implemented using REST Hooks.

3.4.2.6 SOAP

Simple Object Access Protocol (SOAP) is a protocol specification for exchanging structured XML messages over Internet-based protocols. The framework has been designed to be independent of any programming model and other implementation-specific semantics [69].

SOAP was created to standardize message formats and requisites; however, the protocol is used less every day, since it is hard to implement and consumes considerable bandwidth, which makes it unpopular among mobile app developers, while REST is becoming the de facto standard. There are some studies comparing REST and SOAP [70][71].

3.4.2.7 API Securing

Hacked or broken APIs are among the biggest threats and vulnerabilities for data security. The required level of security depends on the kind of data transmitted. A set of standards for securing APIs exists, and they can be used together in order to mitigate different threats.

As many APIs work over HTTP, the protocol itself should be hardened, paying attention to the security considerations in the HTTP specification, especially those of Chapter 15 of [72].

• Strengths
o APIs are a powerful way to implement applications based on services.
o Integration between applications.
o Sharing data and retrieving information from a large range of sites.
o Decoupling of components.
o Implementation independence.

• Weaknesses
o APIs are insecure.

• Opportunities
o Enforcing API access in order to make APIs secure.
o Protecting infrastructure and information from attackers.
o Building an enforced security framework to work with different APIs.

• Threats
o Vulnerabilities and successful attacks may occur when using insecure APIs.
o Potential loss of money and time.
o Information can be illegitimately accessed.


Some tools already exist which, used in combination, can help secure APIs; several of the tools described next are related to REST API security essentials [30].

3.4.2.8 TLS

Transport Layer Security (TLS) is a cryptographic protocol designed to provide communications security over a computer network. It aims to provide privacy and data integrity between two or more communicating computer applications [73].

TLS is already available as a library in every operating system and for every programming language and toolchain; it is ubiquitous.
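For example, a hardened client-side TLS context can be created with the Python standard library alone; hostname and certificate checking are enabled by default in `create_default_context()`:

```python
import ssl

# Create a client-side TLS context with certificate validation enabled.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse legacy protocol versions

print(ctx.check_hostname)                      # True
print(ctx.verify_mode == ssl.CERT_REQUIRED)    # True
# A socket wrapped with ctx.wrap_socket(sock, server_hostname="example.org")
# would then validate the peer certificate before any data is exchanged.
```

Because validation is on by default, the main hardening decisions left to the developer are pinning the minimum protocol version (as above) and, where needed, restricting the cipher suites.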

3.4.2.9 OAuth2

OAuth is an open standard for access delegation, commonly used as a way for Internet users to grant websites or applications access to their information on other websites without giving them their passwords. It is widely used by major companies such as Amazon, Google, Facebook, Microsoft and Twitter. The protocol is described in RFC 6749, and its specification and extensions are developed within the IETF OAuth Working Group [74][75].

The OAuth2 standard is already implemented and ready to use in the FIWARE Keyrock enabler, as described in [76].
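For illustration, in the client credentials grant of RFC 6749 a backend client obtains an access token with a single request of the following shape (host, path and credentials are placeholders):

```http
POST /oauth2/token HTTP/1.1
Host: idm.example.org
Authorization: Basic <base64(client_id:client_secret)>
Content-Type: application/x-www-form-urlencoded

grant_type=client_credentials&scope=read
```

A successful response carries a JSON body with the token, e.g. `{"access_token": "...", "token_type": "Bearer", "expires_in": 3600}`, which the client then presents in the Authorization header of subsequent API calls.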

3.4.2.10 OpenID

OpenID is an open standard and decentralized authentication protocol. Promoted by the OpenID Foundation, it allows users to be authenticated by co-operating sites. Users create accounts by selecting an OpenID identity provider, and then use those accounts to sign onto any website that accepts OpenID authentication [77].

OpenID specifications are described by the OpenID foundation [78].

3.4.2.11 XACML

The “eXtensible Access Control Markup Language” standard defines a declarative, fine-grained, attribute-based access control policy language, an architecture, and a processing model describing how to evaluate access requests according to the rules defined in policies.

XACML is an OASIS standard, defined in [79]. It is implemented in the FIWARE enabler AuthzForce [80], which provides an API to obtain authorization decisions based on authorization policies and authorization requests.

3.4.2.12 Securing of previously existing APIs

Many existing APIs are not secure by design: they were never conceived or implemented with security in mind, so they are prone to attack in many ways. However, without modifying their implementation or specification, they can be hardened using a proxy.

As an example, the FIWARE enabler Wilma PEP Proxy is designed to harden this kind of API while keeping implementation and security independent: it provides an easy way to make insecure APIs more secure [81], combining the proxy with other security components, such as Keyrock and AuthzForce, to enforce access control on backend applications. This means that only permitted users will be able to access the Generic Enablers or REST services.


Another example related to securing existing APIs is the Istio project (https://istio.io/), which describes itself as an “open platform-independent service mesh that provides traffic management, policy enforcement, and telemetry collection”.

3.4.2.13 JWT

JSON Web Token (JWT) is an open standard (RFC 7519 [82]) that defines a compact and self-contained way of securely transmitting information between parties as a JSON object. This information can be verified and trusted because it is digitally signed. JWTs can be signed using a secret (with the HMAC algorithm) or a public/private key pair using RSA or ECDSA. Signed tokens allow the integrity of the claims contained within them to be verified, while encrypted tokens hide those claims from other parties. When tokens are signed using public/private key pairs, the signature also certifies that only the party holding the private key could have signed them [83]. JWTs are useful in scenarios where authorization and information exchange are involved, and they are used in IDS to identify Connectors [84].
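The header.payload.signature structure of an HS256-signed JWT can be sketched with the Python standard library alone; production code would use a maintained library (e.g., PyJWT) and include registered claims such as `exp`, and the claim values below are invented:

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    """Base64url encoding without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(claims: dict, key: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify(token: str, key: bytes) -> dict:
    header, payload, sig = token.split(".")
    expected = b64url(hmac.new(key, f"{header}.{payload}".encode(),
                               hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):   # constant-time comparison
        raise ValueError("invalid signature")
    padded = payload + "=" * (-len(payload) % 4) # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))

token = sign({"sub": "connector-42", "scope": "read"}, b"shared-secret")
print(verify(token, b"shared-secret")["sub"])   # connector-42
```

The sketch makes the trust model visible: anyone can decode the payload (it is only base64url-encoded, not encrypted), but only a holder of the shared secret can produce a signature that passes verification.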

3.4.2.14 Data Models

A data model is an abstract model that organizes data, describing its structure and standardizing its relationships with real-life objects. The FIWARE Generic Enabler implementing the FIWARE NGSI (v2) standard [66] has been adopted by the European Commission as a CEF Building Block. The inclusion of security and privacy aspects in the basic architectural principles of CEF Building Blocks will represent an important exploitation opportunity for the GUARD concept and technologies. Data models can be embedded in NGSI messages, so the communications are effective. A new version of NGSI (NGSI-LD) has been published as an ETSI standard [85]; this new version takes advantage of the benefits of JSON-LD [86]. On the other hand, GUARD will be compliant with the IDS architecture [87], which means privacy and security are a must from deployment onwards. Privacy policies must be in place from the very beginning of deployment and throughout the whole life cycle of the project.

3.4.2.15 Ontologies

An ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a domain of discourse. Some work on defining security ontologies has been done, describing an ontology based on abstract concepts (Asset, Security Goal, Threat, Vulnerability, Defence Strategy and Countermeasures) and the relationships between these concepts [88][89].
• Strengths
o Based on a W3C recommendation [90].
o Has its own definition language (OWL) [91].
o Ontologies provide a common understanding of a specific domain and easily allow queries and metadata.
o Ontologies provide a way to discover new knowledge.
• Weaknesses
o Hard to define.


• Opportunities
o GUARD can take advantage of previously defined ontologies and extend them for project purposes.
o GUARD can define NGSI mappings for security ontologies.
o Building a framework for exposing and accessing security information and inspection/enforcement capabilities.
o Mapping heterogeneous context to a common knowledge about the domain.
o Powerful representation and querying of data.
• Threats
o Incomplete ontologies.
o Modelling ontologies in an open environment is a challenging task.
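A minimal sketch of such a security ontology in OWL (Turtle syntax, with an invented namespace) could capture the abstract concepts and relationships mentioned above:

```turtle
@prefix :     <http://example.org/security-ontology#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Asset          a owl:Class .
:Threat         a owl:Class .
:Vulnerability  a owl:Class .
:CounterMeasure a owl:Class .

:threatens a owl:ObjectProperty ;
    rdfs:domain :Threat ;
    rdfs:range  :Asset .

:exploits a owl:ObjectProperty ;
    rdfs:domain :Threat ;
    rdfs:range  :Vulnerability .

:mitigates a owl:ObjectProperty ;
    rdfs:domain :CounterMeasure ;
    rdfs:range  :Threat .
```

Once concepts are modelled this way, standard SPARQL queries can, for example, retrieve all countermeasures that mitigate threats exploiting a given vulnerability, which is the kind of knowledge discovery listed among the strengths.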

3.5 Data protection and Identity and Access Management

This section reviews the state of the art on data protection and Identity and Access Management. The overall study is presented by means of three main sections: inputs from the scientific literature, research initiatives coming from other international projects, and current market solutions. The section concludes with a critical analysis of what technologies could be of interest (i.e., integrated and/or extended) within the GUARD platform.

3.5.1 Inputs from the scientific literature

3.5.1.1 Identity Management (authentication)

Basic authentication schemes are discussed in several works found in the literature. One of the first works on this topic can be found in [93], which describes a basic user-password authentication scheme for the single-server environment with a secure one-way encryption function implemented in the user's terminal. Extensions of this study are found in [94]-[98], which propose password-based authentication schemes in different scenarios. An evolution of such schemes is found in [99], which describes an authentication scheme that uses both a password and a smart card. An authentication scheme for the multi-server environment is introduced in [100], which exploits a neural network. In [101] a federated architecture that preserves users' privacy is studied, for device-centric and attribute-based authentication. In [102] the Persona Identity Bridge is described, designed to allow users to authenticate and log into any website that supports Persona (a decentralized authentication system for the web) with their existing email address. An authentication scheme for big data services based on ECC is proposed in [103], which allows the authentication of multiple clients simultaneously. The channel state information is exploited in [104] to perform user authentication. Three different user authentication schemes are analysed in [105], addressing their weaknesses and the related countermeasures. Many works in the literature discuss authentication strategies in the cloud environment [106]-[116]. A Privacy-Aware Authentication (PAA) scheme is presented in [106] to solve the identification problem in Mobile Cloud computing services. In [107] a signature-based approach is proposed for cross-domain authentication in the cloud environment. Another cross-domain identity authentication scheme, based on group signatures, appears in [108] for cloud computing.
The main features of a Cloud-Oriented Cross-Domain Security Architecture are described in [109], so that users with different security authorizations can exchange information in a cloud of cross-domain services. In [110] a security architecture based on user authentication, file encryption, and distributed servers is proposed for cloud computing platforms. A Single Sign On (SSO) authentication scheme is proposed in [111] for a cross-cloud federation model, where different clouds cooperate to optimize computation resources. A federated identity management method is tackled in [112] to manage

© GUARD 2019 Page 71 of 128

identity processes among entities without a centralized control. A performance analysis of multi-factor authentication schemes in cloud services is presented in [113]. The progress in Future Internet and Distributed Cloud (FIDC) testbeds exploiting the Software Defined Network (SDN) paradigm is discussed in [114]. The goal of [115] is to study a secure and efficient mechanism to manage and authenticate flow rules between the application layer and the control layer. A trust establishment framework between an SDN controller and its applications is studied in [116], so that the different management applications communicating with the SDN controller can be trusted. Authentication strategies in the Internet of Things (IoT) domain are presented in [117],[118]. A survey of authentication protocols for IoT is given in [117], while an authentication protocol for IoT devices in cloud computing environments is designed in [118]. In [119], an authentication and key exchange protocol is proposed to provide unique identities to billions of connected devices in the IoT.

3.5.1.2 Access Management (Authorization)

Assuming that access to protected resources and services is controlled separately from the authentication feature, the simplest approach available in the literature is the Identification Based Access Control (IBAC) mechanism [120]. In this scheme, permission to use a system resource is linked to the user identity. Permissions are stored in an access matrix, and a trusted party, in general the system administrator, has the right to change the matrix entries. The Role Based Access Control (RBAC) approach, introduced in [121], is based on the concept of user roles: permissions in the access matrix are linked to roles, and the association between users and their roles is the key requirement for controlling user access. The simple IBAC and RBAC approaches have been extended in several directions. In [122] a collection of the authentication schemes previously addressed is presented, called autheNtication Based Access Control (NBAC). The authoriZation Based Access Control (ZBAC) scheme, described in [123], uses authorizations presented with the request to make an access decision. An access control model encompassing traditional access control, trust management, and digital rights management features is proposed in [124]. More recently, the National Institute of Standards and Technology (NIST) formulated a concrete solution offering fine-grained authorization, namely Attribute Based Access Control (ABAC) [125]. In this case, access to resources is handled by considering attributes associated with the user identity, which significantly extends the preliminary concept of user role and couples user attributes and accessed resources. The ABAC approach is also sometimes referred to as Policy Based Access Control (PBAC) [126] or Claims Based Access Control (CBAC) [127].
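The difference between the role-based and attribute-based models described above can be sketched as follows. This is a minimal illustration with hypothetical users, roles, and attributes, not an implementation of any of the cited schemes:

```python
# Minimal sketch contrasting RBAC and ABAC decisions (hypothetical
# roles, attributes, and resources for illustration only).

# RBAC: permissions are attached to roles, and users are assigned roles.
role_permissions = {
    "analyst": {("dashboard", "read")},
    "admin": {("dashboard", "read"), ("dashboard", "write")},
}
user_roles = {"alice": {"analyst"}, "bob": {"admin"}}

def rbac_allows(user, resource, action):
    return any((resource, action) in role_permissions[r]
               for r in user_roles.get(user, set()))

# ABAC: the decision evaluates attributes of the subject and the
# resource against a policy predicate, rather than a fixed role.
def abac_allows(subject, resource, action):
    return (action == "read"
            and subject.get("clearance", 0) >= resource.get("sensitivity", 0)
            and subject.get("domain") == resource.get("domain"))

print(rbac_allows("alice", "dashboard", "write"))  # False: analysts cannot write
print(abac_allows({"clearance": 2, "domain": "ops"},
                  {"sensitivity": 1, "domain": "ops"}, "read"))  # True
```

Note how the ABAC predicate needs no per-user administration: any subject whose attributes satisfy the policy is granted access, which is what makes the model attractive for heterogeneous, multi-domain environments.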
In [128] a fine-grained management of permissions for secure sharing of APIs in multi-tenant networks is discussed. An autonomic and policy-based framework for access control management in OpenFlow networks is proposed in [129]. A security architecture with defence capabilities for SDN networks is presented in [130]. Many papers tackle access control and authorization issues in the cloud environment; they extend the aforementioned approaches to meet the requirements of the targeted services. Identity and access management in the cloud environment are discussed in [131], explaining the related main mechanisms and challenges. A secure cloud storage service based on RBAC is constructed in [132], also including an ABAC mechanism and Attribute-Based Encryption (ABE) to define hierarchical relationships among all values of an attribute. A framework for flexible cross-domain access control based on RBAC is developed in [133]. Other RBAC models based on trust management in cloud computing environments are proposed in [134],[135] for single- and multi-domain cloud environments. A new cross-domain authorization management model for multi-level hybrid cloud computing, based on a novel role architecture, is presented in [136]. A virtual private cloud for collaborative clouds, based on security-enhanced gateways, is proposed in [137].


Various types of access control mechanisms that can be used in cloud computing environments are compared in [138]. Flexible authorization credentials for cloud services, supporting decentralized delegation between principals, are illustrated in [139]. The Security-as-a-Service paradigm for virtual machines (VMs) in the cloud environment is discussed in [140],[141]. Finally, access control and authorization schemes in IoT environments are presented in several papers [142]-[147]. A multi-authority access control scheme for federated and cloud-assisted Cyber-Physical Systems is proposed in [142]. In [143] a multi-authority scheme is proposed that can solve trust issues, the attribute revocation problem, and system failure. The position paper [144] identifies relevant challenges and describes solutions for privacy and security issues in the IoT scenario. Another overview of the state of the art of access control solutions in the IoT scenario can be found in [145], which highlights the main related challenges and new opportunities. SDN and Network Function Virtualization (NFV) are introduced in [146] to present a hybrid access control architecture for IoT and cloud computing scenarios. An SDN-based framework for static and dynamic network access control is proposed in [147] to enhance smart home IoT security.

3.5.1.3 Privacy and Data Protection

Privacy and data protection are treated in different works. The reference document for the European Community is [148], which describes in detail all the issues related to the protection of sensitive data. In current and upcoming cyber-physical and cloud-based systems, the data protection functionality is strictly related to the authentication and authorization features: in short, an authenticated user can access protected data based on their permissions. In this framework, access control methodologies based on cryptographic mechanisms are found in [143],[149]-[154]. All these works propose enhancements to the Ciphertext-Policy Attribute-Based Encryption (CP-ABE) approach, which enables data owners to perform fine-grained and cryptographically-enforced access control over data. The issue of data protection in cloud computing is tackled in [92], which proposes a data protection mechanism integrating RBAC with CP-ABE. Privacy-by-design frameworks are discussed in [155],[156], exploiting an approach that aims to identify and examine possible data protection problems when designing new technologies. In [157] a framework exploiting a variant of the CP-ABE algorithm is presented. A decentralized CP-ABE access control scheme that preserves privacy in cloud storage systems is proposed in [158]. A cross-layer architecture is proposed in [159], where users are able to control and use many indistinguishable identities when linking activities to remote services. In [160] an architecture is proposed that offers online social network users privacy-preserving online identities in an anonymous way. A hybrid privacy protection method in a cloud-based environment is illustrated in [161]. Authentication and authorization services through the OAuth 2.0 framework are described in [162]-[165] for IoT [162],[164], WLAN [163], and cloud [165] platforms.
Specifications for authentication and authorization in IoT are addressed in [166]. The main challenges and opportunities for mutual authentication and authorization of users and services in a cross-domain environment are described in [167]. The patent [168] also focuses on a multi-domain authorization and authentication method in a computer network. Finally, a survey addressing security and privacy issues in the IoT scenario with the integration of fog/edge computing can be found in [169]. The study carried out in [170] aims at demonstrating that many technologies and standards support confidentiality- and integrity-based (C-I) security controls. An SDN security model protecting communication links from sensitive data leakage is proposed in [171]. An access control scheme that combines attribute-based encryption and attribute-based signatures, called Attribute-Based SignCryption (ABSC), is proposed in [172]. An ABE scheme with time-domain information, designed for multi-authority outsourcing, is proposed in [173].


3.5.2 Research initiatives

There are many projects focusing on security issues. COMPOSE [174] and OpenIoT [175] are two FP7 projects in which authentication and authorization issues have been tackled with a centralized solution. The ADECS project developed encryption systems for private and secure voice communication on mobile phones using the Android and Apple iOS platforms [176]. The SPAGOS project provided security and privacy-awareness in the provision of eGovernment services [177]. The Bio-Identity project designed and implemented a secure authentication system based on multimodal biometric characteristics [178]. The Assure UK project developed a pilot-ready identity and attribute verification and exchange service for the UK commercial market [179]. Other relevant solutions regarding authentication and authorization aspects have been conceived in the Achieving The Trust Paradigm Shift (ATTPS) [180] and ABC4Trust [181] projects. The ASTRID project is developing additional components for identity management and access control, to enable seamless and secure interconnection with external components [182]. The RECRED project designed and implemented mechanisms that anchor all access control needs to mobile devices [183]. A very interesting approach has been conceived in the symbIoTe project, which implemented a security framework to protect access to resources exposed across federated IoT platforms [184].

3.5.3 Reference Standards and relevant Market Solutions

In the security context, different open standards and market solutions deserve to be mentioned. The Fast Identity Online (FIDO) Alliance is a non-profit organization formed to address the end-user problem of creating and remembering multiple credentials [185]. The OpenID Foundation enables relying websites to delegate their authentication mechanisms to trusted identity provider websites [186]. OAuth is an open standard for authorization [187], together with its evolution OAuth 2.0 [188]. Another open standard for access control is XACML [189]. There are several free identity management and/or access control solutions. Idemix is a cryptographic protocol suite that performs identity management and protection through anonymity [190]. OpenDaylight is a modular open platform for customizing and automating networks [191]. KeyCloak [192] and WSO2 [193] are open source identity management and access control solutions. OpenUnison is an open source, highly customizable identity management solution [194]. There are also commercial solutions in the same context. 1Password, developed by Agilebits, is a password manager for access control [195]. WAYF (Where Are You From) is Denmark's identity federation for research and education [196]. Avatier is an identity management and access control suite made up of several independently-licensed identity and access management products [197]. Another identity management and access control solution for cloud computing is described in [198]. Layer7, Fischer Identity, Forgerock, Idaptive, NetIQ, Okta, Onelogin, Optimal IdM, Ping Identity, SecureAuth, Simeio, Ubisecure, and Tools4Ever are Identity and Access Management (IAM) solutions [199]-[211], with several capabilities: strong authentication, SSO, identity management, connection to multiple directory services, mobile security management, certificate management, reporting, and analytics.
CrossMatch [212] and RSA SecurID Suite [213] are identity management solutions that also manage biometric identities. RapidIdentity is another identity management and access control solution, which also includes solutions for the healthcare clinical workflow [214]. Omada and One Identity are IAM solutions designed for business users [215],[216]. Oracle Identity Management is an identity governance solution that provides access and role management, analytics, and account management [217]. Another identity management and directory service solution is RadiantOne [218]. Duo Security is a cloud-based user authentication vendor [219]. FusionAuth is an identity management and access control solution that adds authentication,

authorization, user management, reporting, and analytics [220]. The National Strategy for Trusted Identities in Cyberspace (NSTIC) describes four principles to help individuals and organizations utilize secure, efficient, and interoperable identity credentials to access online services [221]. Microsoft’s U-Prove is a cryptographic technology that allows users to minimally disclose personal information when interacting with online resource providers [222]. Another commercial solution is Amazon Cognito, a service that provides authentication, authorization, and user management [223]. Another interesting commercial solution is Axiomatics, a dynamic authorization suite that implements fine-grained attribute-based access control mechanisms [224].

3.5.4 Candidate technologies and missing gaps

The GUARD project is called to jointly offer authentication, authorization, and data protection. Given the heterogeneity of services and involved entities, it is reasonable, at the time of writing, to envision an access control mechanism based on the ABAC logic. It guarantees a good level of flexibility and is well regarded by researchers and industry worldwide. To this end, all the open source technologies reported in Section 3.5.3 could be considered solutions of interest during the next activities of the project. Regarding the authentication aspect, instead, there could be the need for a logically centralized Identity Manager that authenticates users, services, and any other logical entity belonging to the GUARD architecture. Also in this case, the current literature offers many opportunities to implement Identity Management. Nevertheless, in order to effectively decouple the authentication and authorization functionalities, the technologies to be integrated within the GUARD project should be properly selected so as to follow this simplified methodology:

(1) users, services, and any other logical entity implementing specific algorithms and procedures must be authenticated by an Identity Manager;
(2) the Identity Manager must release an authentic token storing the attributes associated with the authenticated users, services, and any other logical entity;
(3) users, services, and any other logical entity can then use these tokens to perform authorization procedures.

It is important to remark that the usage of open standards, like XACML, is recommended, as it offers the opportunity to dynamically manage access policies.
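The three-step methodology above can be sketched as follows. This is an illustrative sketch with a hypothetical token format and made-up attribute names, not the GUARD design; production systems would typically use a standard token format such as JWT:

```python
# Sketch of the token-based decoupling of authentication and
# authorization (hypothetical token format, for illustration only).
import base64
import hashlib
import hmac
import json

SECRET = b"identity-manager-signing-key"  # held by the Identity Manager

def issue_token(entity_id, attributes):
    """Steps (1)+(2): after authenticating the entity, the Identity
    Manager releases a signed token carrying its attributes."""
    payload = base64.urlsafe_b64encode(
        json.dumps({"sub": entity_id, "attrs": attributes}).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload + b"." + sig.encode()

def verify_token(token):
    """Step (3): a service validates the token signature and recovers
    the attributes without re-contacting the Identity Manager."""
    payload, sig = token.rsplit(b".", 1)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.decode(), expected):
        raise ValueError("invalid token")
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token("sensor-42", {"role": "probe", "domain": "edge"})
claims = verify_token(token)
print(claims["attrs"]["role"])  # probe
```

With an asymmetric signature (as in standard JWT profiles), verifying services would need only the Identity Manager's public key, strengthening the decoupling further.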

3.6 Machine learning and other techniques for threat detection

Threats against modern ICT networks are continuously emerging and becoming ever more diverse. Reasons for this are adversaries’ increasing capabilities and evolving techniques to attack ICT systems. Furthermore, the increasing diversity of today’s networks leads to a growing number of vulnerabilities that can be exploited by attackers. Highly developed security software and intrusion detection systems (IDS) should be capable of detecting attacks fast, thereby reducing the time available for attackers to harm network components, software, and sensitive data, and thus mitigating the damage. Machine learning methods are promising candidates to fulfil such requirements. In this section we survey the state of the art regarding machine learning and other techniques for threat detection, including examples of research initiatives, current market solutions, and the goals of the GUARD project to fill research gaps in this area. Details of the survey and the critical analysis are presented in Annex A (available at https://guard-project.eu/wp-content/uploads/2019/11/Annex-A-Detailed-SoA-Analysis.pdf).


3.6.1 Scientific literature

3.6.1.1 Methods for detection of attacks and threats

Like other security tools, intrusion detection systems (IDS) aim to achieve a higher level of security in information and communication technology (ICT) networks. Their primary goal is to detect intruders in a timely manner, so that it is possible to react quickly and reduce the damage caused. IDS are roughly categorized as follows [225]: (i) host-based IDS (HIDS) must be installed on every system (host) in a network that should be monitored; they deliver specific low-level information about an attack and allow comprehensive monitoring of a single host; (ii) network-based IDS (NIDS) monitor and analyze the network traffic of a whole network, and a single sensor node is sufficient for this; (iii) hybrid or cross-layer IDS usually provide a management framework that combines HIDS and NIDS to reduce their drawbacks and exploit their advantages.

IDS categories

Following the taxonomy presented in [226], we can define the following two main criteria to categorize threats in ICT systems:

• the technique of attack criterion, and
• the threat impact criterion.

The first criterion is the main aspect in the development of the following models: (i) the Three Orthogonal Dimensional Model [227], where the threat space is decomposed into sub-spaces according to three orthogonal dimensions (motivation, localization and agent); (ii) the Hybrid C3 Model [228], in which the frequency of security threat occurrence, the area affected by the threat, and the threat source are specified; (iii) the Pyramid Model [229], where threats are classified based on the attackers’ prior knowledge about the system hardware and software, employees and users, and the critical system components which might be affected by the threat; (iv) the Cyber Kill Chain [230], where cyber-attacks are split into 7 phases: Reconnaissance, Weaponization, Delivery, Exploitation, Installation, Command and Control, and Actions on Objective. The second criterion is the background for the following two models: (i) the STRIDE Model [231] characterizes known threats according to the goals and purposes of the attacks (or the motivation of the attacker): Spoofing identity, Tampering with data, Repudiation, Information disclosure, Denial of service, and Elevation of privilege; (ii) the ISO Model (ISO 7498-2) [232] defines destruction of information, corruption or modification of information, loss of information, disclosure of information, and interruption of services as the 5 major security threat impacts.

3.6.1.2 Methods for detecting known attacks and threats

Detection of “known” attacks and threats is based on prior knowledge of the characteristics of an attack and the potential threat impacts. Most methods for the detection of known threats are signature-based [233]. Signature-based detection (SD) methods (also known as knowledge-based detection methods) use predefined signatures and patterns to detect attackers. This approach is simple and effective for detecting known attacks, but it has drawbacks: it is ineffective against unknown attacks or unknown variants of an attack, which allows attackers to evade SD-based IDS. Furthermore, since the attack landscape is rapidly changing, it is difficult to keep signatures up to date, and thus maintenance is time-consuming. Also, manual signature generation may be a cumbersome and complex process. Therefore, automated signature generation tools are used in IDS to limit the propagation of a new threat at an early stage, until a manually created signature is available and can be included in the rule sets [234]. The generation of multi-set type signatures can be formulated as a global optimization problem, which may be solved by metaheuristic methods, such as genetic algorithms (GAs) [235].
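At its core, an SD method matches a set of predefined patterns against incoming events. The signatures and requests below are made up for illustration; real engines combine such content patterns with protocol fields, thresholds, and stateful logic:

```python
# Toy signature-based detector: each rule maps a signature name to a
# regular expression describing a known malicious pattern.
import re

SIGNATURES = {
    "sql_injection": re.compile(r"(?i)union\s+select|or\s+1=1"),
    "path_traversal": re.compile(r"\.\./\.\./"),
}

def match_signatures(event: str):
    """Return the names of all signatures that fire on an event."""
    return [name for name, rx in SIGNATURES.items() if rx.search(event)]

print(match_signatures("GET /index.php?id=1 OR 1=1"))  # ['sql_injection']
print(match_signatures("GET /images/logo.png"))        # []
```

The example also makes the main drawback visible: a trivially obfuscated variant of the same attack (e.g., extra comments inside the SQL keywords) would slip past the fixed patterns, which is exactly why anomaly-based methods are needed as a complement.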


Another group of methods is stateful protocol analysis (SPA). SPA (also referred to as specification-based detection) uses predetermined profiles that define benign protocol activity. Occurring events are compared against these profiles to decide whether protocols are used correctly or not. IDS based on SPA track the state of network, transport, and application protocols. They use vendor-developed universal profiles and therefore rely on vendor support [236]. Most SD and SPA methods do not belong to the machine learning (ML) class. In some cases, classification techniques based on supervised learning may be used to generate data sets for validation and testing. However, together with other ML techniques, they are much more useful for the detection of anomalies and unknown attacks.

3.6.1.3 Methods for detection of unknown threats and attacks

Unknown threats and attacks are not recognized by knowledge-based IDS, because attackers use brand-new attack methods or technologies and exploit so-called zero-day vulnerabilities, i.e., vulnerabilities that were previously not known to exist. Hence, methods are required that enable the detection of unknown threats; however, these methods often suffer from high false positive rates [236]. The core class of methods for revealing unknown attacks is anomaly-based detection (AD). AD (behaviour-based) approaches learn a baseline of normal system behaviour, a so-called ground truth. All occurring events are compared against this ground truth to detect anomalous system behaviour. The following types of anomalies indicate malicious system behaviour [237]-[239]: (i) a point anomaly is an anomalous single event, e.g., an unexpected login name or IP address; (ii) a contextual anomaly is an event that is anomalous in a specific context but might be normal in another one, such as a system login from an employee outside working hours, which would be normal during regular working time; (iii) a collective/frequency anomaly usually originates from an anomalous frequency of an otherwise normal event; (iv) a sequential anomaly represents an anomalous sequence of single events, e.g., events that occur in an unusual order.

Anomaly detection algorithms

Most anomaly detection algorithms are based on artificial intelligence (AI) and machine learning (ML). Popular methods include Artificial Neural Networks (ANN) [240], Bayesian Networks [241], Decision Trees [242], and Hidden Markov Models (HMM) [243]. Clustering algorithms successfully support filtering defence mechanisms [244] in case of DDoS attacks. Some of them can also be used for clustering graphs generated from malware data analysis: graph clustering techniques [245] can be used to derive common malware behaviour. Support Vector Machines (SVM) are a frequently applied ML method; similarly to clustering, SVM can, for example, be applied for outlier detection [246]. Most of these methods are self-learning and learn a baseline of normal system behaviour during a training phase. There are three ways in which self-learning AD can be realized [247]: (i) unsupervised learning does not require any labelled data and is able to learn to distinguish normal from malicious system behaviour without any requirements on the training data; (ii) semi-supervised learning is applied when the training data contains only anomaly-free data; (iii) supervised learning requires a fully labelled training data set, containing both normal and malicious data.
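The semi-supervised setting can be illustrated with a toy detector that learns its ground truth from anomaly-free events and flags unseen or rare values as point anomalies. Event strings and the threshold are made-up illustrations, far simpler than the cited ML methods:

```python
# Toy semi-supervised anomaly detector: the baseline (ground truth) is
# learned from anomaly-free training events; unseen or rare events are
# flagged as point anomalies.
from collections import Counter

class BaselineDetector:
    def __init__(self, min_support=2):
        self.counts = Counter()
        self.min_support = min_support  # events seen fewer times are anomalous

    def train(self, events):
        """Learn the ground truth from anomaly-free events."""
        for e in events:
            self.counts[e] += 1

    def is_anomaly(self, event):
        """Flag events that were (almost) never seen during training."""
        return self.counts[event] < self.min_support

det = BaselineDetector()
det.train(["login:alice", "login:alice", "login:bob",
           "login:bob", "login:bob"])
print(det.is_anomaly("login:mallory"))  # True  (never seen in training)
print(det.is_anomaly("login:bob"))      # False (part of the baseline)
```

Contextual, frequency, and sequential anomalies require richer models (e.g., time-aware baselines or Markov models over event sequences), but the train-then-compare structure remains the same.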

3.6.2 Current market solutions and research initiatives

Below we present four selected examples of interesting practical projects, where the implemented threat and attack detection methods are important components of the developed security systems.

AECID

AECID (https://aecid.ait.ac.at/), developed by AIT, is a research prototype that implements anomaly detection. AECID monitors textual computer log data and applies semi-supervised and unsupervised self-learning to learn a system’s

normal behaviour and to initialize and configure different detectors. It can be deployed either centrally on a single host, or in a distributed set-up using several AMiner instances to fully monitor a network. AMiner (https://launchpad.net/logdata-anomaly-miner) is a light-weight host sensor, which operates similarly to a HIDS and requires only a minimum of resources. It leverages a tree-like parser model that reduces the complexity of parsing from O(n), when using regular expressions, to O(log(n)). Additionally, the tree-like parser structure enables fast access to the information stored in log lines.

MobileIron

MobileIron (https://www.mobileiron.com/) is the industry’s first mobile-centric, zero-trust security framework: it verifies every user, device, application, network, and threat before granting secure access to business resources. Zero sign-on, built on the company’s unified endpoint management (UEM) platform, eliminates passwords as the primary method of user authentication and replaces them with the mobile device ID to enable secure access to the enterprise. This allows users to access any app or service from any location or device without the hassle and security vulnerabilities of passwords.

SISSDEN

The SISSDEN (https://sissden.eu/) H2020 project was developed to improve the cybersecurity posture of EU organizations and citizens through increased situational awareness and the effective sharing of actionable information. SISSDEN operates a sensor network composed of nodes hosted by VPS providers (procured at a cost from the providers) and nodes donated to the project by third parties acting as endpoints. SISSDEN provides free-of-charge victim notification services, and works in close collaboration with Law Enforcement Agencies, national CERTs, network owners, service providers, small and medium-sized enterprises (SMEs), and individual citizens.

ARAKIS

ARAKIS GOV and ARAKIS ENTERPRISE (https://www.arakis.pl/) are early warning systems against cyber-threats developed by NASK. Their main functionality is the detection of patterns of advanced attacks in protected networks. They are based on a carefully designed system of probes that collect data within protected networks and provide real-time information for further analysis. Advanced software can then generate descriptions of detected incidents; if these are classified as threats, the systems raise an alarm. The systems are able to detect known attack patterns as well as to identify new, unknown types of threats.

3.6.3 Gaps and GUARD Solutions

Sophisticated, targeted, and persistent cyber-attacks are complex and stealthy, and involve multiple stages that span a long period of time. Static attack detection techniques, as well as popular SD methodologies, prove to be ineffective against them, and novel approaches for detection and mitigation are required.

GUARD will propose the following innovations:

• New automatic methods for detection of known attacks and threats: GUARD will extend the surveyed methods for detecting “known” threats to enable the detection of the characteristics of packet forging and poisoning attacks. GUARD will also verify the efficiency of multi-agent systems (MASs) for anomaly detection in network behaviour [248] and will integrate a MAS deep-learning model to improve the detection of malicious data.


• New methods for revealing unknown threats and attacks: GUARD will develop a two-tier model using machine learning methods at the bottom tier, combined with multilevel correlation analysis among attributes of correct and malicious data. This method will be based on the prototype model and algorithms developed in [249] for web campaigns.
• Big data analytics and improved correlation for enhanced attack detection: GUARD will propose a novel approach that combines the analysis of security logs, events, and network traffic from multiple intertwined domains. This approach will enable the analysis of large amounts of data [250] and the automatic generation of actionable Cyber Threat Intelligence (CTI) through massive event correlation, and will therefore improve network security by optimizing the detection of known and unknown attacks. Thus, GUARD will enhance the capability of detecting stealthy and sophisticated advanced persistent threats (APT), multi-vector attacks, and zero-day exploits.

3.7 Data Inspection Tools

3.7.1 nDPI

Deep packet inspection (DPI) is a type of data processing that examines the data packets at an inspection point within a computer network. Depending on the result of the processing (protocol non-compliance, viruses, spam, intrusions, or other defined criteria), different actions might be taken, such as routing to a different destination [251]. Increasing application data rates, the use of arbitrary TCP and UDP ports for communication, and the consequent inapplicability of legacy methods for service classification make DPI a necessity. OpenDPI is an open source library of deep packet analysis tools; nDPI is a superset that is based on the OpenDPI library and expands its functionality [252]. The 8Bells Virtual Deep Packet Inspection (vDPI) solution is built using the nDPI library. It analyses network traffic in real time in order to recognize specific applications and categorize each traffic flow according to its service. Considering that this operation involves a significant amount of computing resources, a medium to large Virtual Machine (VM) type is highly recommended for efficient traffic processing. vDPI is developed on the Linux network stack, which is a common basis for cloud networking solutions. The packet analysis and flow information processes produce a significant workload, because all network flows and most of their packets are inspected beyond the protocol headers, commonly at payload level, in order to provide accurate and precise identification of network traffic.
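As a rough illustration of payload-level classification, the sketch below maps the first bytes of a flow's payload to an application protocol. The fingerprint table is a simplified, made-up stand-in for the rich, stateful dissectors of a library like nDPI:

```python
# Toy payload classifier: inspect the first bytes of a flow's payload
# and map them to an application protocol. Real DPI engines use far
# richer, stateful dissectors; these fingerprints are illustrative.
FINGERPRINTS = [
    (b"GET ", "HTTP"),
    (b"POST ", "HTTP"),
    (b"\x16\x03", "TLS"),   # TLS record header: handshake, version 3.x
    (b"SSH-", "SSH"),
]

def classify_payload(payload: bytes) -> str:
    for prefix, proto in FINGERPRINTS:
        if payload.startswith(prefix):
            return proto
    return "unknown"

print(classify_payload(b"GET /index.html HTTP/1.1\r\n"))  # HTTP
print(classify_payload(b"SSH-2.0-OpenSSH_8.0\r\n"))       # SSH
```

This also shows why DPI is computationally heavy: unlike port-based classification, every flow's payload must be touched, and in practice many packets per flow are examined before a confident verdict is reached.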

Figure 19. 8Bells vDPI Dashboard


The vDPI is used not only to identify network traffic profiles for various security or network management purposes, but also to analyse IP traffic, including headers and data protocol structures together with the payload of the packet’s message, in order to identify the various application protocols and traffic flows. This information is used to classify the traffic of the monitored network.

3.7.2 Open Virtual Switch (OVS)

Firewalls are systems that control the incoming and outgoing network traffic to and from an inner network, based on predetermined security rules. A firewall typically provides a barrier between a trusted internal network and an untrusted external network, such as the Internet [253]. In Figure 20, a common setup for a firewall is depicted. There are two types of firewalls: stateful and stateless. Stateful firewalls track each connection and allow packets belonging to a tracked connection to pass. They track the traversing packets by using attributes such as the source and destination IP addresses, port numbers, and sequence numbers, collectively known as the state of the connection.

Figure 20. Common Firewall Setup
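The connection-tracking behaviour described above can be sketched as follows (a deliberately minimal model, not a real firewall; the packet representation is invented for the example). Outbound traffic from the trusted side creates state, and inbound packets are accepted only when they match a tracked connection:

```python
# Minimal sketch of stateful packet filtering (illustrative only).
class StatefulFirewall:
    def __init__(self):
        self.connections = set()   # tracked state: 5-tuples seen going outbound

    def handle(self, pkt, direction):
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        if direction == "out":
            # Outbound traffic from the trusted network is allowed and tracked.
            self.connections.add(key)
            return "ALLOW"
        # Inbound packets pass only if they belong to a tracked connection,
        # i.e. the reversed 5-tuple was previously seen going out.
        reply_key = (pkt["dst"], pkt["src"], pkt["dport"], pkt["sport"], pkt["proto"])
        return "ALLOW" if reply_key in self.connections else "DROP"

fw = StatefulFirewall()
out_pkt = {"src": "192.168.1.5", "dst": "93.184.216.34",
           "sport": 51000, "dport": 443, "proto": "tcp"}
reply = {"src": "93.184.216.34", "dst": "192.168.1.5",
         "sport": 443, "dport": 51000, "proto": "tcp"}
unsolicited = {"src": "203.0.113.9", "dst": "192.168.1.5",
               "sport": 4444, "dport": 22, "proto": "tcp"}
print(fw.handle(out_pkt, "out"))      # new tracked connection, allowed
print(fw.handle(reply, "in"))         # reply to tracked connection, allowed
print(fw.handle(unsolicited, "in"))   # no matching state, dropped
```

A stateless firewall, by contrast, would evaluate each packet in isolation against static rules, without the `connections` set.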

A virtual firewall is a firewall service running in a virtualized environment that provides packet filtering and monitoring services like those provided by a physical firewall. In that sense, a virtual firewall in bridge mode acts like a physical-world firewall: it is positioned at a strategic point of the virtual network infrastructure and intercepts virtual traffic destined for other segments. A routing firewall participates in the IP process, in contrast with a bridging (transparent) firewall, which does not: a transparent firewall acts like a stealth firewall, without requiring any IP address changes on other devices, while a routing firewall forwards traffic on to its next destination. Open vSwitch (OVS) [254] is an open-source software switch designed to be used as a virtual switch in virtualized server environments, and it provides firewall capabilities by forwarding or blocking packets. The aim of OVS is to implement a switching platform that provides standard, vendor-independent management interfaces and opens the forwarding functions of switches to programmatic extension and control. It supports all versions of the OpenFlow protocol. More specifically, OVS manages the Flow Tables of the datapaths, which are used to forward incoming traffic according to matched entries. The fields of a Flow Table entry are presented below:

• Match Fields: matched against incoming packets; include the packet header fields and the input port;
• Priority: matching priority of the Flow Table entry;
• Counters: the number of received packets matching this entry;
• Instructions: used to modify the actions applied to the packet;


• Timeout: the maximum number of seconds this Flow Table entry remains in the table;
• Cookie: an opaque value chosen by the controller.
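A toy model of the table lookup performed by an OpenFlow switch such as OVS may clarify how these fields interact (illustrative only; real OpenFlow matching supports many more fields, wildcards, masks, and instruction types):

```python
# Toy model of an OpenFlow flow-table lookup (illustrative, not real OVS code).
def lookup(flow_table, packet):
    """Return the highest-priority entry whose match fields all equal the packet's."""
    candidates = [e for e in flow_table
                  if all(packet.get(f) == v for f, v in e["match"].items())]
    if not candidates:
        return None                  # table miss: e.g. send packet to the controller
    best = max(candidates, key=lambda e: e["priority"])
    best["counters"] += 1            # per-entry packet counter
    return best

flow_table = [
    {"match": {"ip_dst": "10.0.0.5", "tcp_dst": 22}, "priority": 200,
     "counters": 0, "actions": ["drop"]},          # block SSH to this host
    {"match": {"ip_dst": "10.0.0.5"}, "priority": 100,
     "counters": 0, "actions": ["output:2"]},      # forward the rest to port 2
]

pkt = {"ip_dst": "10.0.0.5", "tcp_dst": 22}
entry = lookup(flow_table, pkt)
print(entry["actions"])   # the higher-priority (drop) rule wins
```

Both entries match this packet, but the entry with the higher priority is selected, which is exactly how OVS can act as a firewall by installing high-priority drop rules above generic forwarding rules.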

3.7.3 AlienVault OSSIM
AlienVault Open Source Security Information Management (OSSIM) is an open-source SIEM software used for real-time analysis of security alerts generated by network hardware and applications. OSSIM provides SIEM capabilities such as event collection, normalization, and correlation [255]. OSSIM is intended to give security analysts and administrators a view of all the security-related aspects of their system, by combining log management, asset management and discovery with information from dedicated information security controls and detection systems [255]. This information is then correlated to create context that is not visible from any single piece of information alone. OSSIM performs these functions using other open-source security components, unifying them under a single browser-based user interface. The interface provides graphical analysis tools for the information collected from the underlying open-source components (many of which are command-line-only tools that otherwise log only to a plain text file) and allows centralized management of configuration options [256].
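The collect-normalize-correlate pipeline that a SIEM such as OSSIM implements can be sketched as follows (the log formats, regular expressions, and brute-force threshold are hypothetical, chosen only to illustrate the principle):

```python
# Sketch of a SIEM pipeline: normalize heterogeneous log events to a common
# schema, then correlate them into higher-level alerts (illustrative only).
import re
from collections import Counter

def normalize(raw):
    """Map a raw log line from a known source to a common event schema."""
    m = re.match(r"sshd: Failed password .* from (\S+)", raw)
    if m:
        return {"type": "auth_failure", "src_ip": m.group(1)}
    m = re.match(r"FIREWALL DROP src=(\S+)", raw)
    if m:
        return {"type": "fw_drop", "src_ip": m.group(1)}
    return {"type": "other", "src_ip": None}

def correlate(events, threshold=3):
    """Raise an alert when one source accumulates enough authentication failures."""
    fails = Counter(e["src_ip"] for e in events if e["type"] == "auth_failure")
    return [{"alert": "possible brute force", "src_ip": ip}
            for ip, n in fails.items() if n >= threshold]

raw_logs = [
    "sshd: Failed password for root from 198.51.100.7",
    "FIREWALL DROP src=198.51.100.7",
    "sshd: Failed password for admin from 198.51.100.7",
    "sshd: Failed password for guest from 198.51.100.7",
]
alerts = correlate([normalize(line) for line in raw_logs])
print(alerts)
```

The value of the normalization step is that events from entirely different sources (here, sshd and a firewall) become comparable, so correlation rules can reason across them, which is the core idea behind OSSIM's unified view.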

3.8 Human machine interfaces and cybersecurity practices
In this section we discuss the latest development trends in the broadly understood relationship between Human Machine Interfaces (HMI) and cybersecurity (CS), i.e., both the technologies dedicated to cybersecurity applications and those aspects of HMI which, due to their application, need to be revised in terms of cybersecurity requirements. The role of humans in cybersecurity remains extremely important. Human-in-the-loop may mean two things. First, a human is the greatest source of vulnerability, because human oversight, error, misunderstanding or ignorance (lack of good practices) may weaken the security system or even deactivate it completely. Second, in most cases it is still a human who plays the key role in the decision-making process. In this case, a proper and precise transfer of information about threats, attacks, vulnerabilities and the like is crucial to offer the best possible situational awareness. This information transfer primarily means intelligent visualization of the data collected by the machine, i.e., a user interface that allows understanding of huge amounts of data through clustering, filtering, and exposing correlations, trends or patterns. Although many systems can work automatically (and to some extent they do), the architecture of an ICT infrastructure may require the highest level of accuracy when making decisions about preventing or mitigating cyberattacks. If the mitigation of an attack may, as a consequence, switch off or stop other key services, the final decision will probably be left to a human. Employees of cybersecurity teams also need up-to-date information about the current state of the cybersecurity landscape. Hence statistics that reveal patterns or trends in attack type, source, destination, or other non-obvious correlations are extremely important.
An important element of HMI for cybersecurity is the ability to present threats detected by the machine to a human, i.e., creating the best possible situational awareness. For this purpose, it is worthwhile to unify data recording, as in the SISSDEN (Secure Information Sharing Sensor Delivery Event Network) project, where all data about cyberthreats, attacks and malicious behaviour collected by a large worldwide sensor network are processed to maintain coherence [257]. Frameworks for information sharing that focus on the role of humans in the cybersecurity context have also been proposed, such as the NIST Cybersecurity Framework (Alan Calder, NIST Cybersecurity Framework: A Pocket Guide, IT Governance Publishing, 2018), the Tri-Modular Human-in-The-Loop Framework [258], and the Platform for Security-Aware Design of Human-on-the-Loop Cyber-Physical Systems [259].

Data visualization is one of the most crucial parts of the user interface and of human-machine interaction. However, despite the many approaches and frameworks suggested for cybersecurity purposes in recent years, most of them were internal tools for advanced systems developed by engineers, rather than user interfaces thought through by professional designers. Hence, not rarely, even the fundamentals of visual theory have been neglected, including those relating to the limitations of human perception, as C. N. Adams and D. H. Snider emphasize in their paper on effective data visualization in cybersecurity [260]. This typically means overly complicated graphs with too high a resolution, 3D visualizations, or too much data at once. Is there any way to do it better? The authors reviewed existing research aimed at improving the effectiveness of data visualization in cybersecurity and distinguished three types of approaches so far: surveys on data visualization, user-oriented (UX) research, and unique explorations of data analysis and visualization in cybersecurity. They also listed the most frequent causes of poor data visualization for cybersecurity:

• tool rigidity (lack of opportunities for data exploration);
• overly complex visualizations (the already mentioned poor application of data visualization theory);
• challenges of tool adoption (data visualization scepticism);
• lack of user-centred design (limited access to network analysts).

The work [261] surveyed different data sketches (network graphs, maps, aggregated charts, timelines) and discussed them with analysts to obtain recommendations on how certain design methods fit different types of data in terms of user-centred visualization. The authors of [262] raised the interesting topic of user-centred design for decision support and proposed the Ocelot interface as an addition to real-time monitoring and visual analytics, which are most typical for cybersecurity user interfaces. Focusing on user-centred design is a very important step for cybersecurity systems but, as [263] points out, user-centred design does not automatically lead to a good user experience, and while "usability by definition is only concerned with the efficiency and effectiveness and satisfaction during use, UX addresses the experience before, during and after usage." A problem at the intersection of security and user experience lies in the assumptions of both, which represent opposing values: a cybersecurity system most often complicates and hinders, in order to prevent too-easy access, whereas the principles of user-centred design aim at possible simplification, facilitation, and shortening of the time and complexity of use. This conflict of interest has been described, among others, by [264] and [265]. The study [266] provided a systematic literature review to answer the question of how UX is defined in the context of cybersecurity, which means covering both goals: usability and security.
The authors extracted criteria that help to evaluate the usability of a system, which are:
• convenience
• understandability
• inclusivity
• authentication
Further criteria needed for the evaluation of the security of a system include: 1) revelation, 2) secrecy, 3) privacy, 4) breakability, and 5) abundance. In order to address these criteria, it is necessary to set the right target, which means to specify the user whose experience should be satisfied. And while cybersecurity plays a crucial role in the everyday life of almost everyone, there is still a huge mental gap between humans and the world of CS, which concerns both experts and non-experts, but in different ways. Experts have broad knowledge about CS itself but, as humans, they still lack the computational capability to process big data; therefore, a visualization algorithm that reduces the data dimension is required. In contrast, non-experts are mostly not familiar with CS at all, so a user interface designed for them must take into consideration their lack of general understanding of CS. In this case, one has to consider the user interface in terms of enhancing situational awareness of cybersecurity. A discussion of the design of a non-expert-centred user interface and a tool for visual analytics for establishing and enhancing CS has been proposed in [267]. The tool provides three primary view types (timeline, communication, and network), based on a streamgraph with a stacked area chart of time series, showing count data such as bytes sent/received, packets sent/received, traffic for each connected device, and the number of currently connected hosts. However, it is worth mentioning that one should not forget about the aesthetic needs of experts as well. As an experienced designer of user interfaces for cybersecurity (working, among others, for Email Protection) said: "even a SOC analyst is still a user of consumer products. They use iPhones, Android phones, and the apps made for those devices. So, they're expecting a higher level of fit and finish" [268]. A user interface, whether graphical or other, is meant to be used: to be clicked, scrolled, touched, etc. The exact user behaviour may differ, but it must stay within some expected frame. [268] proposes an approach that focuses on user behaviour with the GUI and collects possible profiles for an anomaly detection system, based on different classifiers. The learning set contains personalized user behaviour (stored logs with certain details) to cover the widest range of legitimate anomalies.
However, unlike the IDSs proposed by [270], [271] or [272], the intrusion detection introduced here, as the authors put it, "doesn't require the user to perform a specific task to train the system so that a behavioural IDS is able to seamlessly learn behaviour without any user interruption". An Intelligent User Interface (IUI) is an adaptive interface that learns from user behaviour and makes further assumptions based on correlating the data. An IUI can infer from both the input data and the user behaviour to produce a high-quality, contextually relevant user experience. However, as [273] underlines, "they also raise the spectre of privacy violations". A further aspect of the relationship between the user interface and cybersecurity is the question of a graphical user interface that helps to establish or maintain the required cybersecurity level. Examples include the forms used for creating user accounts, including creating passwords and passing authentication processes. [274] points out that a well-designed GUI for this purpose should not only validate the input but also educate the user about the possible levels of data security (including modelling bad password selection practices). The same applies to the different security policies that are customized by users. [275] focuses on non-professional users, who need a proper translation of specific policies, and offers a framework that translates user-created high-level policies into the low-level policies understood by network devices and controllers. This approach stresses the importance of usability when designing a cybersecurity system. Such usability includes non-professionals and is not only limited to creating situational awareness, but also allows these users to take control when desired. Finally, when it comes to the cybersecurity of physical devices connected to a network, GUIDE (Graphical User Interface Fingerprints Physical Devices), proposed by [276], may come in handy.
Their approach assumes that most IoT surveillance devices have a GUI in the form of a website. The described framework automatically generates fingerprints based on these webpages. And, as the authors claim: "discovering these devices brings about the deep understanding on these devices' characteristics and helps secure device security in the cyberspace".


3.8.1 Behavioural, social and human aspects

The following section provides a first overview of relevant literature on behavioural, social and human aspects in cybersecurity and, more importantly, on currently active research areas. The main focus was put on Association for Computing Machinery (ACM) publications since 2014, to capture the state-of-the-art knowledge within the field of Human Computer Interaction (HCI). Other relevant publications have also been considered. Excluded were, for example, articles on botnet behaviours, system behaviour monitoring, network behaviour, and the like. Further excluded were all articles on cybercriminal behaviour and on behavioural analyses to identify cyber-criminality or (cyber)security threats, as we focused on the behaviour of users rather than cybercriminals. Several topics can be observed in this area of research:
• user behaviour that is not taken into account, or not in a comprehensive manner, which might lead to 'incorrect' user behaviour and, thus, problems in the interaction between user and system [277]-[282];
• drivers and factors of cybersecurity behaviour, such as privacy concerns and perceived threats [283]-[287];
• cybersecurity awareness education, especially gamification approaches that teach users relevant behaviours within the field of cybersecurity [288]-[296];
• awareness of cybersecurity [297],[298],[284],[285].
One concept from behavioural science that seems relevant for the behavioural aspects of cybersecurity is the nudge: users are 'nudged' towards favourable behaviour (e.g., installing security updates and enabling two-factor authentication), for example with the use of 'commitment devices' [279],[299]. Other important aspects are factors and especially drivers of cybersecurity behaviour, such as perceived threat, privacy concerns and descriptive/group norms, studied to understand users' intention to adopt self-protective behaviour [283].
Also crucial is the sense of security many users have; often users are unconcerned about cybersecurity until they become victims of a security breach. For example, research on the factors that influence smartphone security decisions [284],[285] showed that few users update their Operating System (OS) despite its importance for smartphone security. Ndibwile and colleagues identified that intellectual, financial, sociocultural and other factors (such as the smartphone's User Interface (UI)) affect users' behaviour [284],[285]. User vulnerability and user interactions are also relevant factors; for example, the impact of user vulnerability depends on the system topology [286]. Also, self-efficacy in information security (SEIS) is a relevant predictor of end-user security behaviour, which hinges on end-user acceptance and use of protective technologies such as anti-virus and anti-spyware [287].

Education on cybersecure behaviour is another area of focus within this topic. Much research focuses on gamification approaches that make it possible to teach users, quite often children or young adults, or to achieve a change in cybersecurity behaviour [288]-[291], or to raise interest in cybersecurity and/or engage users in this topic [292],[293]. In the learning process, participatory and constructive learning, multi-modal dissemination, and ubiquitous opportunities have been identified as promising approaches [294]. Other educational approaches, such as visual privacy [295] or hands-on cybersecurity exercises [296], as well as the use of real-life examples, are discussed as well.


3.8.2 User awareness and involvements in cybersecurity

Connected to education on cybersecurity behaviour is the issue of awareness of such issues, which is in turn connected to news on security and privacy. Social factors are also relevant here, as age, gender, education, cultural background, occupation, and security behavioural intention correlate with how people hear about and, as a consequence, share security information; for example, males often feel a personal responsibility, and older people were less likely to be informed about such issues [297]. Two age groups have been identified as especially vulnerable when it comes to safety online: older people, and younger children including adolescents. For older people, cybersecurity is in general a relevant aspect, and older adults with Mild Cognitive Impairment (MCI) represent a higher-risk group with respect to cybersecurity attacks and scams [298]. As [300] reports, older users prioritise social resources based on availability rather than cybersecurity expertise, and they avoid using the Internet for cybersecurity information searches despite using it for other domains. Findings show that younger people, aged from thirteen to seventeen, are exposed to cyberspace risks as well. Here, mobile phones are widely used for accessing the Internet. Observed cyberspace risks include cyberbullying and sexual-abuse-oriented risks. In addition, adolescents across all genders are open to befriending strangers online, and they consider sharing contact details with strangers [301]. Thus, dissemination strategies that take the needs of this (and, consequently, other) vulnerable group(s) into account are an important measure to take. Awareness also needs to be raised for all user groups on general aspects of cybersecurity, e.g., password strength or phishing [284],[285]. Consequently, various delivery methods for improving end-user information security awareness and behaviour are needed.
Based on conducted studies, combining delivery methods produces better results than any individual security awareness delivery method. Such combinations can include text, graphics, animations, games, computer-based training (CBT), educational presentations, e-mail messaging, group discussions, newsletter articles, posters, and video-based methods [302],[303]. However, changing the security behaviour of citizens, consumers and employees requires more than providing information about risks and how to react. The following three steps are identified [304]:

• accept information as relevant;
• understand how to respond;
• be willing to do this of their own accord.
Security awareness is defined in (NIST, 2004) as follows: "Awareness is not training. The purpose of awareness presentations is simply to focus attention on security. They are intended to allow individuals to recognize security concerns and respond accordingly". Therefore, if the content "feels" impersonal or too general, users will treat it as just another obligatory session or an obstacle. On the other hand, threatening or intimidating security messages are not particularly effective, especially because they increase stress to such an extent that the individual may even be repulsed or deny the need for any security decision [305]. It is important that end users are not prevented from carrying out their primary online tasks by security procedures and processes. If they perceive security as yet another obstacle, they will tire quickly and abandon security precautions. Initially, users can be motivated to follow cybersecurity guidelines.


However, if this causes them inconvenience, emotional discomfort or any limitations whilst using online services, they will ignore the suggested secure behaviour. Additionally, it can be very stressful to remain at a high level of vigilance, patience and security awareness; this effect is known as "security fatigue" [306]. In order to address these issues, research and development needs to focus on two aspects that are two sides of the same coin: usable security and secure usability [307]. Here it is very important to strike a balance between two extremes: "complex security kills usability" (cSkU) and "simple usability kills security" (sUkS). Security is a combination of the following sub-characteristics: confidentiality, integrity, non-repudiation, accountability and authenticity. Usability is a combination of the following sub-characteristics: appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics, and accessibility [308]. Therefore, a change of user attitude and behaviour needs to take place as well. Persuasion can be defined as the "attempt to change attitudes or behaviours or both (without using coercion or deception)" [309]. The basic persuasion techniques include fear, humour, expertise, repetition, intensity, and scientific evidence. Behaviour change in a cybersecurity context should take place by using simple, consistent rules of behaviour that people can follow. This way, people's perception of control will lead to better acceptance of the suggested behaviour [310]. Based on a literature review and analysis of successful and likewise unsuccessful cybersecurity awareness campaigns, the following factors can increase the overall effectiveness [304]: 1) Prepare and organise security awareness campaigns professionally. 2) Avoid using fear tactics (they can scare the users who can least afford to take risks). 3) Security education needs to be more than giving information.
It needs to be targeted, actionable, doable, and provide feedback. 4) Once user behaviour changes, sustain users through the change with training and feedback. 5) Different cultural contexts require different emphasis.
When it comes to user involvement, one criticised aspect is that the development process usually imagines an ideal-type user who behaves according to some textbook ideal. Developers and designers instead need to take actual user behaviour into account [277], such as, for example, present bias, the tendency to discount future risks and gains in favour of immediate gratifications [279], or the Third-Person Effect [280], which describes the discrepancy between an individual's perception of the effects on themselves and on others. Furthermore, decisions are not always taken deliberately; on the contrary, most human behaviour is controlled by nonconscious automatic cognition, and developers of cybersecurity systems need to take this into account [311]. Lastly, all (not just some) social, human, and organisational factors need to be taken into account to ensure that a system works properly within a specific context [278]. Rather than seeing the human (factor) as the weakest link in cybersecurity systems, these systems should make use of humans [281],[282].

3.9 Conclusions
The research presented in this section focused on the latest development trends in the relationship between Human Machine Interfaces and cybersecurity. Society's reliance on information technology (IT) has been increasing, and so have the associated cybersecurity risks. Computer systems are becoming crucial for individuals and organizations to be able to operate at all. For this reason, it is important that the technology adapts to human needs. Here, humans have the key role, as they can be the weakest point in a security system. For a user to make optimum decisions, a balance within the "Security, Functionality and Usability Triangle" needs to be achieved. The main problem lies at the intersection between security and the user interface. The user interface must educate, validate, and allow visualization and understanding of huge amounts of data for experts and non-experts alike, and thus create the best possible situational awareness. Previous attempts at implementing cybersecurity frameworks have not been successful, because even the fundamentals of visual theory have been neglected. Likewise, during development, only an "ideal" user has been considered. An adequate approach needs to include actual user behaviour across different age groups, cultures, education levels, occupations, etc. A user cannot be expected to remain at a high level of vigilance, patience and security awareness (i.e., to avoid "security fatigue"). The user must not perceive security as yet another obstacle, otherwise they will tire quickly and abandon security precautions. Rather, the development of a successful user interface for cybersecurity practices needs to focus on positive feedback, avoiding scare tactics, and not overwhelming users with an abundance of information. The key is that users need to accept the information as relevant, understand it, and follow through of their own accord.

© GUARD 2019 Page 87 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0

4 Requirements analysis
This section presents the main technical requirements for the GUARD framework, which significantly contribute to its development. The provided requirements analysis takes into account service structures, GUARD modules and architectural components, integration with existing systems, and the propagation and management of sensitive information. More specifically, the requirements define:

• which GUARD functionalities must be implemented; • which interfaces should be implemented towards entities external to the GUARD framework and involved in the management of digital and security services; • which methods should be developed and which algorithms should be implemented for detection and analysis; • how security operators should deploy/use the GUARD framework.

The full set of requirements therefore covers the design, implementation, validation, and operation of the GUARD framework. A prioritization scheme is used to select mandatory GUARD features and to de-scope other activities that are mostly relevant for the industrialization, commercialization, and operation of GUARD.

4.1 Methodology
The methodology is largely based on the FURPS+ approach [312], which classifies the architectural requirements into the following categories:
• Functional requirements: they represent the main system features;
• Usability requirements: they are concerned with characteristics such as aesthetics and consistency in the user interface;
• Reliability requirements: they include availability, accuracy of system calculations, and the system's ability to recover from failure;
• Performance requirements: they set constraints on throughput, response time and recovery time;
• Supportability requirements: they are concerned with characteristics such as testability, adaptability, maintainability, compatibility, configurability, installability, scalability, and localizability;
• others (+), which mainly include:

• Design: they specify or constrain architectural elements and their relationships;
• Implementation: they specify or constrain the coding, software, and applications used in the realization of the system;
• Interface: they specify the paradigms and protocols used to interact with external systems;
• Physical: they put constraints on the hardware used to house the system, e.g., shape, size, weight, capability, resources.

For the correct design, implementation, and evaluation of the whole framework, it is also important to identify the target of each requirement. Some requirements address the GUARD framework and must therefore be taken into account in the design, implementation, and validation phases of the system; other requirements instead define constraints on external entities (operators, infrastructures, services) and should be considered for operation in real environments. In the GUARD context, it is possible to identify several internal and external stakeholders (see Figure 21), based on the project concept outlined in Section 2:


• GUARD developers: the project partners involved in the design, implementation, and validation of the GUARD framework. They should consider any design, functional, technical, and performance requirement that directly affects the GUARD framework.
• Cybersecurity operators: they are responsible for the safe and secure operation of applications and services, including detection of security incidents, mitigation of and response to attacks, and investigation of new threats. They are the primary users of the GUARD technology, likely part of a Security Operation Centre working for one or multiple organizations. They select the analysis algorithms to run, define policies for reacting to anomalies, warnings, and attacks, and investigate zero-day attacks and unknown threats. They use the GUARD framework to activate and control security hooks embedded in the services.
• Cybersecurity vendors: they develop cybersecurity appliances and algorithms that analyse and correlate data in order to detect attacks, intrusions, violations, and anomalies. The GUARD framework feeds these algorithms with the necessary context: service topology, functions, data, and events.
• Service/Software developers: they design and implement services and applications by chaining individual functions into complex topologies. Software developers are mainly responsible for creating and updating software, and for packaging and distributing it through public or commercial channels. Service developers select software tools and services, define initial configurations and life-cycle management actions, and identify the resources needed to run the service as well as deployment constraints. They are also responsible for chaining, i.e., creating the proper connections between elementary elements; this includes creating accounts, provisioning resources, and defining forwarding policies and processing pipelines.
• Service providers: they offer services to internal and (more commonly) external end users by running the software provided by Service/Software developers. This role usually includes system engineers, system administrators, release engineers, DBAs, and network engineers involved in operation processes.
• End users: they are the final users of digital services. They are not directly involved in any technical aspect, but they can put tight requirements on service continuity and availability in the case of services critical for their business.

Figure 21. Stakeholders of the GUARD framework. They are targets of one or more requirements.

© GUARD 2019 Page 89 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0

4.2 Taxonomy

Each requirement is presented by using the following template:

Requirement ID: R# | Type: Type of requirement | Priority: High/Medium/Low | Target: Involved stakeholder(s)
Title: Title of the requirement
Description: Description of the requirement

• R# – this field uniquely identifies the requirement within the project and eases tracking its fulfilment in the next steps of the project.
• Type – this field classifies the requirement according to the specific implementation issue:
o Functional, if the requirement represents one main system feature;
o Design, if the requirement specifies or constrains architectural elements and their relationships;
o Performance, if the requirement sets constraints on detection capabilities, response time, recovery time, consumed resources, etc.;
o Usability, if the requirement describes characteristics such as aesthetics, consistency, and ease of use of the user interface;
o Implementation, if the requirement specifies or constrains the coding, software, and applications used in the realization of the system;
o Interface, if the requirement specifies the paradigms to interact with external systems.
• Priority – this field indicates whether the requirement should be considered mandatory or not. Three different priority levels are allowed:
o High means a mandatory requirement that must be fulfilled in order to achieve the project objectives, as identified by the description of the Project concept and application scenarios;
o Medium means an expected requirement that should be considered in the design, but is not essential to demonstrate the framework; hence, it might not be present in the implementation of the Use Cases;
o Low identifies a feature that is required to correctly operate the framework in real installations, but is not relevant for validation and demonstration of the proposed framework.
• Target – this field identifies the target for each requirement, according to the classification in Section 4.1.
• Title – this field defines the title of the requirement.
• Description – this field contains a clear and concise description of the main features of the requirement, its purpose, and the goals to be reached.


4.3 Requirements list and description

Requirement ID: R01 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Centralized analysis and correlation
Description: The system must guarantee the ability to detect multi-vector attacks and advanced persistent threats. This implies the capability to analyse and correlate the broadest security context in both the space and time dimensions; hence GUARD must be able to correlate data coming from different parts of the system and at different times. In this respect, detection is expected to be carried out in a centralized location, to improve visibility over complex topologies. Lightweight tasks of data aggregation and fusion can be implemented locally to decrease the overhead of transmitting data. The following features must be defined and supported:
• control interface and APIs to configure and program the local environment;
• small lightweight programs for local packet inspection and generation of alerts;
• a message bus to collect data and feed the context broker and detection algorithms;
• encryption mechanisms to protect data in transit;
• mutual authentication between the components and access control rules integrated in the identity management and access control framework.

Requirement ID: R02 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Local data aggregation and fusion
Description: The current security context (including data, events, and measurements) should be gathered locally in each virtual function. The need for deep correlation and centralized processing poses efficiency concerns on the retrieval of security data; the two main issues are the bandwidth used to transfer that information and the latency. Many inspection operations cannot be performed in a centralized way and must be implemented locally: the analysis of network packets is the most obvious example. Therefore, local agents responsible for data collection, fusion, and access must be deployed in all service functions. However, monitoring and inspection operations must not slow down the execution of the main business logic; this is especially difficult to achieve for network functions that process packets at nearly line speed. Besides the parsing and filtering that are necessary to gather the context, the following set of operations is expected to be available locally:
• aggregate information from several events originating from a single task;
• parse comma-separated value data into individual fields;
• parse string representations such as "123 MB" or "5.6gb" into their numeric value;
• parse data from fields to use as the Logstash timestamp for an event [313];
• extract unstructured event data into fields using delimiters;
• perform DNS lookups;
• encrypt and sign data;
• add timestamps;
• calculate the elapsed time between a pair of events;
• add geographical information;
• manage JSON [314] and XML [315] descriptions (parse, serialize);
• split complex events and multi-line messages into multiple elementary events;
• translate or transform fields in messages according to some replacement pattern.


Some simple forms of enforcement are also required on the local side:
• drop packets;
• redirect/steer packets;
• duplicate packets and deliver them to a given destination.
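As an illustration of one of the local fusion operations listed above, the following Python sketch parses human-readable size strings such as "123 MB" or "5.6gb" into numeric byte counts. The function name and the decimal-only unit table are assumptions for illustration; a real agent would also have to handle binary (KiB/MiB) conventions.

```python
import re

# Multipliers for decimal size suffixes (illustrative assumption: decimal
# units only; KiB/MiB conventions are deliberately out of scope here).
_UNITS = {"b": 1, "kb": 10**3, "mb": 10**6, "gb": 10**9, "tb": 10**12}

def parse_size(text: str) -> int:
    """Parse strings such as '123 MB' or '5.6gb' into a byte count."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([kmgt]?b)\s*", text, re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    return round(float(value) * _UNITS[unit.lower()])
```

Running such a normalization step locally keeps the transmitted context compact: the central correlation logic then compares plain numbers instead of free-form strings.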

Requirement ID: R03 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Support multiple detection and analytics services
Description: According to the Project concept, GUARD is not a cyber-security appliance per se, but a framework to facilitate the application of detection and analysis algorithms to virtual services. GUARD is therefore expected to support the execution of multiple algorithms for intrusion detection and attack recognition, including:
• detect known attacks through:
o signature-based techniques (based on data collected in n6 and the Internet Telescope – historical data);
o analysis of trends and anomalies based on Internet Telescope data (farms of honeypots);
• detect unknown attacks through:
o unsupervised learning techniques for DDoS attack detection (based on Netflow data collected by NASK);
o DDoS attack detection based on the analysis of network traffic in the Internet Telescope (DARKNET);
o anomaly and trend investigation for the detection of new attacks (based on honeypots and the Internet Telescope);
o anomaly detection in IoT systems: black and grey hole detection; delay, flooding, and sinkhole attacks;
• detect and prevent zero-day attacks;
• detect changes in the service chain (secure chaining), see R12;
• track data for secure propagation, see R13;
• detect and verify the trustworthiness of services, see R14.

Requirement ID: R04 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Heterogeneous security context
Description: To properly support the broadest range of detection algorithms and analytics, the framework must support the collection of the following security-related data:
• service topology, which identifies the logical relationships among services, i.e., the way the services interface and exchange information with each other, the way they are connected through APIs, etc.;
• security properties for each service function (e.g., whether they adopt cryptography and certificates for a secure relationship);
• log files (from the operating system, system daemons, applications);
• system metrics (CPU and memory usage, network and disk utilization, open/listening sockets, open file descriptors, number of running processes);
• events from applications;
• traces of system calls;
• network statistics: number of transmitted/received bytes for a selected protocol, address, port;


• deep packet inspection: filter and capture packets based on their headers or content;
• software execution tracing (internal functions, system calls, CPU instructions).

Requirement ID: R05 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Data delivery
Description: The GUARD framework must collect the security context from the service graph and feed the centralized detection logic. Multiple algorithms may be deployed to analyse and correlate data, so the GUARD framework should take care of dispatching the correct set of data to each of them. Two main delivery methods should be supported by the framework:
• real-time: data is delivered on the fly while it is collected; this reduces the latency and overhead for algorithms in retrieving the data;
• historical: data is stored in a database and can be requested on demand by the algorithms, preferably through queries that allow selection along the space and time dimensions; this method is useful for late investigation and identification of attack patterns.
The mechanisms for data delivery (including storage) must also account for the presence of multiple users. The ability to partition data according to the service it originates from is therefore required. Partitioning is useful to correlate data among similar services. As a matter of fact, many services may be instantiated from a common template, hence sharing vulnerabilities. In addition, the increasing adoption of new business models, where services and processes from multiple organizations are pipelined to create new value chains, is creating stronger security dependencies among organizations and their businesses. A compromised service from a supplier becomes a potential attack vector towards its customers. Similarly, advanced persistent attacks might carry out different intrusion attempts on interconnected services. The ability to correlate security-related data and events from multiple services is expected to improve the detection capability, while propagating alerts faster along the business chain. Similarly to R01, a central detection logic is necessary to effectively correlate data from multiple sources.
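The two delivery methods and the per-service partitioning can be sketched as follows. This is a toy in-memory dispatcher, not the GUARD architecture: class and method names are assumptions made for illustration only.

```python
import time
from collections import defaultdict

class ContextBroker:
    """Toy dispatcher illustrating the two delivery modes: real-time push
    to subscribed algorithms, and historical storage partitioned per
    service and queryable along the time dimension."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # service -> callbacks
        self._history = defaultdict(list)      # service -> [(ts, record)]

    def subscribe(self, service, callback):
        self._subscribers[service].append(callback)

    def publish(self, service, record, ts=None):
        ts = time.time() if ts is None else ts
        self._history[service].append((ts, record))  # historical path
        for cb in self._subscribers[service]:        # real-time path
            cb(record)

    def query(self, service, since=0.0, until=float("inf")):
        """On-demand retrieval of a service's partition over a time window."""
        return [r for t, r in self._history[service] if since <= t <= until]
```

Keying both paths by service name is the simplest form of the partitioning discussed above: a detection algorithm subscribed to one service never sees data originating from another.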

Requirement ID: R06 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Context access
Description: According to the overall concept described in Section 0, GUARD is conceived as a sort of middleware to access and control the security context of (one or more) digital services. Even if data encryption is used both for data at rest and data in transit, fine-grained control is necessary over who is allowed to use data and to change the collection tasks. There are two main threats:
1. unauthorized access to sensitive or private data;
2. wrong, weak, or malicious configuration of the collection process, which may retrieve too little data for detection or overwhelm the system.
Multiple entities may be interested in accessing the context:
• detection and analysis algorithms, which process data;
• cyber-security staff, who may visualize data for their own analysis and investigation;
• service providers, who may visualize data or use it for their own management purposes.


Within each category, there are multiple individuals that play different roles; for instance, among cyber-security staff, there may be multiple seniority levels. It is therefore important to establish different roles in the system, with different privileges in accessing and modifying the security context.
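A minimal sketch of such role-based access control is shown below; the role names and privilege labels are purely illustrative assumptions, not part of the GUARD specification.

```python
# Illustrative role-to-privilege mapping for context access. Roles and
# privilege names are assumptions chosen to mirror the entities above.
ROLE_PRIVILEGES = {
    "detection-algorithm": {"read-context"},
    "security-junior":     {"read-context"},
    "security-senior":     {"read-context", "configure-collection"},
    "service-provider":    {"read-context"},
}

def is_allowed(role: str, action: str) -> bool:
    """Default-deny check: unknown roles receive no privileges."""
    return action in ROLE_PRIVILEGES.get(role, set())
```

The key design point is the default-deny behaviour: reconfiguring the collection process (the second threat above) is reserved to a narrow set of roles, while read access can be granted more broadly.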

Requirement ID: R07 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Historical data for statistical analysis
Description: The framework must include a database that collects historical data for statistical analysis. Statistical parameters can be derived based on the specific application, type of service, bandwidth load, etc. Data protection mechanisms must be applied to keep the data collected in the central repository confidential and avoid unauthorized access (see R29).

Requirement ID: R08 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Local programmability
Description: The GUARD framework must be flexible enough to support different scenarios and operations. GUARD users must be able to adapt the monitoring and inspection processes at run-time to the current detection needs, balancing the depth of inspection with the computation, storage, and transmission overhead. For example, when early signs of attacks are detected or new threats are discovered, it may be necessary to analyse finer-grained information than during normal operation. Finer-grained information may consist of more frequent sampling or additional inspection data (for example, in the case of network packets, the innermost protocol bodies may be checked in addition to standard TCP/IP headers). "Programmability" is expected to support the following features:
• manage the configurations of local agents for multiple programmable components that implement API #1 (see R25) by running inspection and monitoring tasks to build the widest security context;
• inject simple processing programs into local security agents for monitoring, inspection, data fusion, aggregation, and filtering.
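The run-time trade-off between inspection depth and overhead can be pictured with a simple profile switch. The profile contents (sampling periods, inspected layers) are invented for illustration; a real configuration would be far richer.

```python
# Hypothetical run-time reconfiguration: when the alert level rises,
# sample more often and inspect beyond the standard TCP/IP headers.
PROFILES = {
    "normal":   {"sampling_period_s": 60, "inspect": ["ip", "tcp"]},
    "elevated": {"sampling_period_s": 10, "inspect": ["ip", "tcp", "payload"]},
}

def select_profile(alert_level: str) -> dict:
    """Any non-normal alert level triggers deeper, more frequent inspection."""
    return PROFILES["elevated" if alert_level != "normal" else "normal"]
```

Pushing such a profile to a local agent is the kind of operation that API #1 (R25) is meant to carry.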

Requirement ID: R09 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Remediation, investigation, and mitigation actions
Description: The framework must support multiple actions for remediation, investigation, and mitigation in case of attacks, threats, and anomalies. Actions are implemented at:
– the control plane, by changing the configuration of security hooks (monitoring, inspection, enforcement) within the service, changing the set/configuration of detection and analysis algorithms, injecting lightweight programs locally, etc.;
– the management plane, by changing the service graph, changing security properties through security APIs, notifying users of threats and anomalies, moving/deleting private or sensitive data, installing/uninstalling local agents, disconnecting the system from compromised software, etc.


Requirement ID: R10 | Type: Functional | Priority: LOW | Target: Service providers
Title: Reconfiguration of service topology
Description: One of the main ambitions of GUARD is the ability to adapt to changes in the service graph at run-time (e.g., due to scaling, addition/removal of virtual functions, re-routing of traffic, etc.). Integration with external frameworks should be considered to perform run-time service management actions, allowing dynamic modification of the actual service topology. Management frameworks of interest for the project mainly include orchestration tools for cloud, NFV, and IoT applications: Kubernetes [316], Juju [317], OpenBaton [318], Cloudiator [319], Open Source MANO [320], OpenStack Heat [321], FIWARE Lab management tools, and Node-RED [322]. Integration with such management frameworks would improve the level of automation in security management, allowing the implementation of reaction and mitigation strategies at the graph level without human intervention. The purpose of the integration with the external framework is to provide notifications of broken, insecure, or compromised service components, together with suggestions of possible remediation actions (e.g., alternative services to connect to, diversion through scrubbing centres, activation of additional detection or mitigation services). An explicit notification should be sent whenever a change occurs, so that all interested entities can be triggered (human staff, detection algorithms, internal storage and processing modules, etc.). The Service Provider must implement this notification mechanism.

Requirement ID: R11 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Semi-automated operation
Description: As described in R08, programmability is expected to adapt the behaviour of the monitoring subsystem to the evolving context. This feature is useful to balance the amount of collected information against the actual detection needs, but it also helps investigate new threats and unknown attacks by defining new monitoring metrics and inspection rules. Though this kind of control can be decided and applied by cyber-security staff, in many cases part of these operations could be easily automated, hence contributing to reduce both the burden of repetitive tasks on humans and the latency to detect attacks. For this reason, the GUARD framework is expected to support (at least semi-) autonomous operation through the definition of security policies, which are able to invoke control/management actions when triggered by specific events. Events should include:
• indications from detection and analysis algorithms;
• changes in the service topology;
• comparison of data and measurements in the security context against predefined values and thresholds.
Policies must consider the current context, in terms of:
• network measurements;
• security properties;
• service topology;
• logs (from applications and the kernel);
• system calls;
• events from applications.
Actions should include:
• monitoring, inspection, and enforcement operations;
• changes to the set and configuration of detection algorithms;
• injection of lightweight programs locally;


• changes to the service graph;
• changes to security properties;
• signalling of possible anomalies and threats;
• moving or deleting sensitive data;
• disconnection of compromised/unsecure software;
• changes in the composition of the security context (active security agents and their configuration).
The GUARD framework must include a specific component to provide this functionality. Algorithms for detection and analysis might have their own smart logic to perform similar operations, if they are aware of the presence and capabilities of the underlying GUARD framework. At least the control operations must be supported through the GUARD APIs; management operations can be limited to notifications to humans in case an external management framework is not linked.
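The event-triggered policy mechanism described above can be sketched as a minimal event-condition-action engine. The class and the sample rule are illustrative assumptions, not the component GUARD will actually implement.

```python
class PolicyEngine:
    """Minimal event-condition-action sketch of semi-automated operation:
    each rule pairs a condition over (event, context) with an action."""

    def __init__(self):
        self._rules = []

    def add_rule(self, condition, action):
        self._rules.append((condition, action))

    def on_event(self, event, context):
        """Evaluate every rule; return the results of the triggered actions."""
        return [action(event, context)
                for condition, action in self._rules
                if condition(event, context)]
```

For example, a rule could react to a DDoS indication from a detection algorithm by requesting deeper inspection, while ignoring unrelated events such as topology changes.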

Requirement ID: R12 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Dynamic discovery of the service topology
Description: The framework must be able to create an abstract description of the service topology, including the kind of relationships among services, in terms of:
• invocation of remote procedures on given data;
• execution of actuation commands;
• execution of required commands from remote peer(s);
• amount of data exchanged;
• type of data protection;
• geographical location in the infrastructure;
• mobility of the nodes involved in the services.
A security API must provide the interface to access this information and notify any change that must be signalled to the system. Based on this, the security policies will then take actions on the system.

Requirement ID: R13 | Type: Functional | Priority: MEDIUM | Target: GUARD developers
Title: Data propagation among services
Description: The system must be able to manage data propagation among services, by including the semantics to:
• identify data units;
• identify data ownership;
• identify access rules;
• notify data transfers between services;
• remove data units if needed.
A security API must include these semantics for data propagation. Concurrently, an internal service must:
• keep historical evidence of the position and transactions of data;
• consider the specific data format;
• build data statistics;
• authorize transactions.
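A toy ledger can illustrate the tracking semantics above: identifying data units, recording transfers, and refusing unauthorized ones. All names and the simple holder-based authorization rule are assumptions for illustration.

```python
class DataTracker:
    """Toy ledger keeping historical evidence of data-unit transfers
    between services. A transfer is authorized only if the notifying
    source currently holds the data unit (a simplifying assumption)."""

    def __init__(self):
        self._holders = {}  # data unit id -> service currently holding it
        self._log = []      # (unit_id, src, dst) transfer records

    def register(self, unit_id, holder):
        self._holders[unit_id] = holder

    def notify_transfer(self, unit_id, src, dst):
        if self._holders.get(unit_id) != src:
            raise PermissionError(f"{src} does not hold {unit_id}")
        self._log.append((unit_id, src, dst))
        self._holders[unit_id] = dst

    def location(self, unit_id):
        return self._holders.get(unit_id)
```

The transfer log is the "historical evidence" mentioned above; a real implementation would also record access rules, data formats, and per-unit statistics.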


Requirement ID: R14 | Type: Functional | Priority: HIGH | Target: Cyber-security vendors
Title: Risk assessment tools
Description: The framework must include tools for risk assessment that guarantee the trustworthiness of the mutual relationships among the parties involved in the service topology. These tools must include information about:
• service identification (name, version, …);
• software versions, security patches, software developer name(s), software certificates;
• vendor (name, address, certificates);
• service provider (name, certificates);
• certifications and remote attestation procedures;
• presence of private/sensitive data;
• exposed services and configurations (open ports, encryption/integrity mechanisms);
• GUARD services:
– context monitoring;
– data tracking;
– topology discovery;
– policy enforcement;
– local execution environments for lightweight programs;
– detection algorithms;
– installed local agents;
– data protection procedures;
– identity management;
– access control.
A security API must provide an interface to all the information listed above. The risk assessment tool must determine whether the service is trusted or not, by comparing the properties of the service with the user policies and, where applicable, evaluating the risk related to security breaches.
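The final trust decision, comparing declared service properties against a user policy, can be sketched as follows. The dictionary-based representation of properties and policies is an illustrative assumption only.

```python
def assess_trust(service: dict, policy: dict) -> bool:
    """Hypothetical check: a service is trusted only if every property
    demanded by the user policy matches the service's declared value."""
    return all(service.get(key) == wanted for key, wanted in policy.items())
```

A real risk assessment tool would additionally weight mismatches by severity to produce a graded risk score rather than a binary decision.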

Requirement ID: R15 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Identity management and access control
Description: The framework must include identity management and access control functionalities that take into account the multiplicity of tenants in the framework and their related roles. Specifically, they are:
• cyber-security staff;
• management staff;
• legal staff;
• service/software developers;
• end users;
• exporters/importers;
• GUARD services/components.
Application Programming Interfaces (APIs), where access policies should be enforced, must be exploited. Access control functionalities must be implemented in a distributed fashion, over all the system components. Authentication procedures must be implemented for both humans and services/components.


Requirement ID: R16 | Type: Functional | Priority: MEDIUM | Target: GUARD developers
Title: Cyber Threat Intelligence
Description: The framework must enable creating, sharing, searching, and visualizing threat information, including indicators of compromise (IoCs) such as IP addresses, service names, file names, process names, URLs, and hashes. It has to include interfaces (e.g., REST or message queue interfaces) to intrusion detection systems (IDS) to enable automation of the information sharing process and facilitate the generation of actionable CTI. Furthermore, to improve the interpretation of novel findings, it should be able to import existing threat information and reports, available, for example, in the STIX format.

Requirement ID: R17 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Flexible insertion of detection algorithms
Description: The detection engine must include the capability of dynamically adding new detection algorithms. As soon as newer detection algorithms are implemented, they can be included in the system without redesigning it from scratch, so that the widest, most varied, and most up-to-date set of algorithms is guaranteed.

Requirement ID: R18 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Packet inspection
Description: One of the most valuable monitoring targets for NFV is network traffic. The GUARD framework must support deep packet inspection, without limiting the match to standard protocol headers (i.e., IP, TCP, UDP, ICMP). Taking into account R37, this means that a proper number of inspection programs must be implemented. The minimal inspection capabilities must cover the following headers:
• Ethernet
• IP
• ICMP
• ARP
• TCP
• UDP
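For illustration, the following fragment decodes the Ethernet and IPv4 headers of a raw frame. It is a deliberately minimal parser, not a GUARD inspection program: IP options, IPv6, and error handling are omitted.

```python
import struct

def parse_ipv4_header(frame: bytes) -> dict:
    """Decode selected IPv4 fields from an Ethernet frame.
    Field offsets: Ethernet header is 14 bytes; the IPv4 header follows,
    with total length at bytes 16-17, protocol at byte 23, and the
    source/destination addresses at bytes 26-29 and 30-33."""
    eth_type = struct.unpack("!H", frame[12:14])[0]
    if eth_type != 0x0800:                 # not an IPv4 payload
        return {}
    ver_ihl, _tos, total_len = struct.unpack("!BBH", frame[14:18])
    proto = frame[23]
    src = ".".join(str(b) for b in frame[26:30])
    dst = ".".join(str(b) for b in frame[30:34])
    return {"total_len": total_len, "proto": proto, "src": src, "dst": dst}
```

Deeper inspection would continue past the IP header into the transport header and, when required, the innermost protocol bodies, as discussed in R08.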

Requirement ID: R19 | Type: Functional | Priority: MEDIUM | Target: GUARD developers
Title: Packet filtering
Description: The framework should provide a filtering program, in the form defined by R37, suitable to implement firewalling services. Its configuration will allow setting the matching fields, either with exact values or wildcards, for the following packet headers: IP, TCP, UDP, ICMP.
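The exact-value/wildcard matching semantics can be sketched as below. The rule and packet representations are assumptions for illustration, not the configuration format defined by R37.

```python
WILDCARD = None  # a None field in a rule matches any packet value

def matches(rule: dict, packet: dict) -> bool:
    """True if every non-wildcard field of the rule equals the packet's."""
    return all(value is WILDCARD or packet.get(field) == value
               for field, value in rule.items())

def filter_packets(drop_rules, packets):
    """Keep only packets matching no drop rule (default-accept policy)."""
    return [p for p in packets if not any(matches(r, p) for r in drop_rules)]
```

For instance, a single drop rule with a wildcard source address suffices to block all Telnet traffic regardless of its origin.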


Requirement ID: R20 | Type: Functional | Priority: HIGH | Target: Service providers
Title: Secure communication channels
Description: GUARD collects data and measurements from digital services in a programmatic way, but it is not part of the service graph. Digital services typically run in isolated environments, and only limited access is allowed to front-end functions. The Service Provider must provide a safe and trustworthy communication channel to plug GUARD into the service graph. The GUARD framework must be able to:
• collect data from any local security agent, and
• access and configure any local security agent independently of its placement.

Requirement ID: R21 | Type: Functional | Priority: HIGH | Target: Service/Software developers
Title: Deployment of GUARD security agents
Description: The Service Provider is responsible for service management. It must deploy GUARD security agents in the digital service, so that cyber-security staff can collect monitoring and inspection data. There are no rigid indications about the number and placement of security agents, since this choice depends on the graph topology and the detection services. According to the range of detection and analysis services that should be supported, interaction between cyber-security staff and service providers should define which agents should be deployed in each service. A conservative and safe approach would be the deployment of all agents in each service; GUARD agents are expected to have a small memory and disk footprint, and the framework can activate/deactivate them as needed (see R25), so the impact on resource usage should be negligible.

Requirement ID: R22 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Disconnected operation
Description: Decoupling local monitoring from centralized processing introduces an implicit dependency on network connectivity for continuous and seamless operation. One of the main assumptions in GUARD is situational awareness for digital services without relying on any security feature of the underlying infrastructure. This means that reliable network connectivity and continuous communication cannot be taken for granted. So, even if the central framework is able to detect the disconnection and react accordingly (for example, by replacing the isolated service or by breaking the trust chain), the local service may become vulnerable to attacks without specific warnings and indications from the detection logic. Accordingly, while pursuing the best detection performance, GUARD should be able to operate even under intermittent connectivity. There are two main technical requirements in this context:
• local agents must be able to buffer and retransmit data if they cannot deliver it immediately;


• the GUARD framework should delay re-configuration commands in case local agents cannot be reached.
Since indefinite local storage is not possible, the following behaviour is expected:
• local agents should provide a local buffer of 50 MB;
• a heartbeat should be sent by local agents at least every 10 s;
• a warning should be triggered when no heartbeat is received for 30 s;
• an alarm should be triggered when no heartbeat is received for 3 min.
Local agents should be configured with a fallback policy to apply in case of disconnection from the GUARD detection engine. Possible strategies include discarding all traffic not belonging to the control connection if no response to their periodic heartbeats is seen for 3 minutes, stopping the service, or shutting down and backing up data.
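The heartbeat thresholds above map directly onto a small state function, sketched here for illustration (the function name and the string states are assumptions; the 30 s and 3 min thresholds come from the requirement).

```python
def connectivity_state(last_heartbeat_s: float, now_s: float) -> str:
    """Classify an agent's connectivity from the time since its last
    heartbeat: warning after 30 s of silence, alarm after 3 min."""
    silence = now_s - last_heartbeat_s
    if silence >= 180:
        return "alarm"
    if silence >= 30:
        return "warning"
    return "ok"
```

With heartbeats sent every 10 s, a single lost heartbeat never trips the warning, which tolerates ordinary jitter while still detecting real disconnections quickly.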

Requirement ID: R23 | Type: Design | Priority: MEDIUM | Target: GUARD developers
Title: Integration with existing logging facilities
Description: Several frameworks are already available to collect data and measurements, both for resource monitoring and attack detection. They allow unified management of logs from different applications (this is often used by SIEMs for collecting logs). The GUARD framework should maintain compatibility with existing tools, to leverage existing integrations with many common software packages and applications.

Requirement ID: R24 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Standardized APIs
Description: Standardized Application Programming Interfaces (APIs) must be included in the framework, to allow interaction with external services and guarantee a high level of flexibility for the management of algorithms coming from different subsystems. The following open APIs must be defined:
• API #1: Open API for digital services. This API gives access to inspection and monitoring capabilities, security properties, and configurations;
• API #2: Open API for context abstraction. This API provides an abstraction layer over the infrastructure components of the service, to allow retrieval of the security context;
• API #3: Open API for information sharing. This API reports threats and attacks and feeds the detection algorithms for anomaly detection and investigation purposes.

Requirement ID: R25 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: API #1 – Open API for digital services
Description: The dynamicity and configurability of the system lead to substantial unpredictability in the composition of the security context (see R04). The framework is therefore expected to expose a suitable abstraction of the available security agents in each virtual function, their capabilities, current configurations, and loaded


inspection programs. The same abstraction should also include information about the service topology, so as to facilitate the identification of possible relationships and dependencies in the collected data. The context abstraction should be mapped to an open API that allows building the security context based on local inspection and monitoring capabilities. This API must support at least the following operations:
• get the service topology (R12), data propagation among services (R13), and risk assessment tools (R14);
• get the agent list (per virtual function, per service);
• get agent capabilities;
• get current configuration options (including loaded inspection programs);
• collect data, events, and messages from local GUARD security agents;
• activate/deactivate agents;
• change agent configuration (including loading inspection programs);
• support polling and/or subscribe/notify paradigms.

Requirement ID: R26 | Type: Design | Priority: HIGH | Target: Service providers
Title: API #2 – Open API for context abstraction
Description: An open API for attack detection and identification and for data trustworthiness and reliability must be included in the framework. It provides systematic access to the available security context, in terms of:
• service topology;
• service properties;
• security features;
• real-time and historical data.
This interface must implement a flexible and extensible syntax to accommodate several kinds of historical queries and simple data aggregation/fusion capabilities for detection algorithms.

Requirement ID: R27 | Type: Design | Priority: HIGH | Target: Service providers
Title: API #3 – Open API for information sharing
Description: An open API for user tools and information sharing must be included in the framework, to report threats, attacks, and unreliable or non-compliant configurations. This interface must support notifications and one-to-many communication patterns, without requiring any subscription procedure. Specific components must be included to send notifications (i.e., e-mail, instant messages, etc.) to users, based on their role.

Requirement ID: R28 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Automated information sharing
Description: The framework must include tools that enable automation of processes, as well as components for the publication of threat descriptions. Automation of the information sharing process reduces i) delays due to the preparation of incident reports in standardized formats, ii) delays caused by importing this knowledge into other systems, and iii) the risk of human mistakes and inaccurate/incomplete information. Tools that enable automatic

© GUARD 2019 Page 101 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0

sharing of threat information help operators of essential services to notify about incidents with a significant impact on service continuity without any undue delay. To enable automation of the processes described in R16 a Web-based threat intelligence platform is required that allows automated communication through a REST and message queue interface.

Requirement ID: R29 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Data protection
Description: According to R03 and R01, the GUARD monitoring and detection framework will have a distributed architecture, made of local processing tasks and centralized analysis algorithms. Specific mechanisms are required to protect both data in transit and data at rest. The specific target is to avoid increasing the attack surface by giving the attacker the possibility to capture or alter the security context in order to deceive the detection process. Data protection mechanisms for secure operation translate into the definition of deployment constraints (physical requirements, management requirements), configuration constraints (which ports are open/closed), and encrypted communications (which components can exchange encrypted data). For both data in transit and data at rest, the following security services are expected:
• confidentiality: MANDATORY;
• integrity: OPTIONAL;
• authentication: OPTIONAL.

Requirement ID: R30 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Identity management procedures
Description: Identity management must be performed by a centralized Identity Manager that authenticates users, services, and logical entities belonging to the GUARD architecture, to meet scalability needs from the point of view of both components and users. A token-based authentication procedure guarantees a simplified and effective process: users, services, and any other logical entity implementing specific algorithms and procedures are authenticated by the Identity Manager through an SSO authentication scheme. The Identity Manager releases an authentic token storing the attributes associated with each authenticated entity. The token is also used to perform subsequent authorization procedures.
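The token-based scheme of R30 can be sketched with a signed attribute payload. This is a simplification under explicit assumptions: a plain HMAC stands in for the real SSO mechanism (e.g., signed JWTs issued by the Identity Manager), and the key handling is deliberately naive.

```python
import base64
import hashlib
import hmac
import json

# Sketch of R30: the Identity Manager signs the attributes of an
# authenticated entity; any component can later verify the token and read
# the attributes to drive authorization. HMAC and the key below are
# illustrative stand-ins, not the actual GUARD mechanism.

SECRET = b"identity-manager-key"   # placeholder key, illustrative only

def issue_token(attributes):
    payload = base64.urlsafe_b64encode(json.dumps(attributes).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    return payload + b"." + sig

def verify_token(token):
    payload, sig = token.rsplit(b".", 1)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("invalid token")
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token({"entity": "data-collector-7", "role": "data collector"})
attrs = verify_token(token)
```

A tampered token fails verification, which is what lets the authorization phase trust the attributes without contacting the Identity Manager again.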

Requirement ID: R31 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Access control procedures
Description: An access control scheme must be implemented in the framework to authorize users, services, and other logical entities in the system. The scheme authorizes users, services, and entities in general based on specific attributes:
• entity type (user, service, process, etc.);
• entity role (data collector, data analyzer, notifier, etc.);
• type of action taken (read, write, move, modify, execute, etc.);
• date/time of action;
• type of data (application data, log, text, command, network packet, sensitive information, etc.).
Based on any combination of these attributes, the expected controls are:
• users that can invoke commands/functions;
• users that can read data;
• users that can modify/delete data;
• processes that can read/elaborate data;
• processes that can write data;
• users/processes that can move data from one location to another;
• users/processes that can program the digital services;
• users/processes that can access data at specific days/times;
• services/processes that can execute code.
The authorization phase is decoupled from the authentication phase, and is based on the release of a token obtained at the end of the authentication phase. The token contains the attributes that will then be evaluated during the authorization phase.
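The attribute-based scheme above can be sketched as rules over the attributes carried in the authentication token. The rule syntax and attribute names are illustrative assumptions mirroring the lists in R31, not the policy language GUARD will implement.

```python
# Sketch of attribute-based authorization (R31): each rule allows one
# combination of attributes; a request is authorized if any rule matches.

RULES = [
    {"entity_type": "user",    "entity_role": "data analyzer",  "action": "read"},
    {"entity_type": "process", "entity_role": "data collector", "action": "write"},
]

def is_authorized(token_attributes, action):
    # Combine the attributes released at authentication with the requested action.
    request = dict(token_attributes, action=action)
    return any(all(request.get(k) == v for k, v in rule.items())
               for rule in RULES)

analyst = {"entity_type": "user", "entity_role": "data analyzer"}
```

Because the check only reads the token, authorization stays decoupled from authentication, as the requirement demands.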

Requirement ID: R32 | Type: Design | Priority: HIGH | Target: Service providers
Title: Data accuracy
Description: The GUARD framework must allow for the ex officio update of end-user data, collecting information from the relevant authorities.

Requirement ID: R33 | Type: Design | Priority: HIGH | Target: Cyber-security operators
Title: Data minimization
Description: Data processed by the GUARD system must be adequate, relevant, and limited to what is necessary in relation to the purposes of processing, in order to comply with the "data minimization" principle. Thus, the system should have a predefined list of the types of personal data to be collected, strictly corresponding to the purposes for which they will be processed.

Requirement ID: R34 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Service discovery
Description: The framework must include a module with service discovery capabilities, to understand where the services are placed in the system, what their operation modes are, and how they are linked together. This information is then exploited to create the abstraction that will form the security context.


Requirement ID: R35 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Context programmer
Description: The framework must include a module capable of interfacing with the local agents, to offload tasks locally. It also translates high-level policies into low-level configurations and settings. The level of programmability encompasses the verbosity level of logs, network traffic statistics (such as the average/maximum number of packets for specific flows), the types of system calls to be reported (disk, network, timers, etc.), and small programs for data processing, fusion, and aggregation.

Requirement ID: R36 | Type: Design | Priority: LOW | Target: Cyber-security vendors
Title: Scalable detection and analysis algorithms
Description: GUARD designs a flexible and portable framework that can be used for any digital service. Programmability is expected to adjust the depth of inspection according to the actual needs; hence the amount of information may vary over a wide range, according to the number of virtual functions and the current workload. It is therefore difficult to properly allocate computing resources for analysis and detection. Developers of cyber-security applications are required to provide scalable algorithms, which can easily scale in and out according to the amount of information to be processed, without the need to statically allocate resources.

Requirement ID: R37 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Program repository
Description: Based on R08, there must be a collection of monitoring, inspection, and enforcement programs that can be dynamically loaded at run-time by the security agents deployed in virtual functions. Programs must be accompanied by metadata, to guarantee the compatibility between the programs in the repository and the services, and the matching between the programs and their local execution environment. The repository must ensure the origin, integrity, and trustworthiness of such programs. A program must not be inserted in the repository if its source and integrity cannot be verified.

Requirement ID: R38 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Storage limitation
Description: The storage limitation principle states that personal data should not be kept longer than needed. The GUARD framework must have a retention policy that lists the types of record or information kept by the system, what they will be used for, and how long they should be kept. The system must have the ability to automatically identify and erase the data that is no longer needed.
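The automatic-erasure part of R38 can be sketched as a retention table plus a purge pass. The record types and periods below are illustrative assumptions, not the actual GUARD retention policy.

```python
import time

# Sketch of R38: each record type has a maximum retention period; records
# past their period are identified and erased automatically.
# The types and periods are illustrative only.

RETENTION_SECONDS = {"access_log": 30 * 86400,   # 30 days
                     "alert":      365 * 86400}  # 1 year

def purge(records, now=None):
    """Keep only the records that are still within their retention period."""
    now = time.time() if now is None else now
    return [r for r in records
            if now - r["created"] <= RETENTION_SECONDS[r["type"]]]

records = [
    {"type": "access_log", "created": 0},             # too old: erased
    {"type": "access_log", "created": 100 * 86400},   # fresh: kept
    {"type": "alert",      "created": 0},             # within 1 year: kept
]
kept = purge(records, now=100 * 86400)
```

In practice such a pass would run periodically (e.g., as a scheduled job) against the system's data stores.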


Requirement ID: R39 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Control logic
Description: A control logic must be included in the system that intelligently responds to system events by activating algorithms, sending notifications, and reprogramming the local environment. It acts as an event processor: according to a list of policies defined by the user, and in response to a specific event, it generates another event when specific conditions hold. The main goal of this component is to bring automation to the whole system (see R11).
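The event-condition-action behaviour described in R39 can be sketched as a small policy table. Event types, conditions, and action names are illustrative assumptions.

```python
# Sketch of the event processor of R39: each user-defined policy binds an
# event type and a condition to an action. Names are hypothetical.

policies = [
    {"event": "high_load",
     "condition": lambda e: e["cpu"] > 0.9,
     "action": lambda e: f"scale_out:{e['service']}"},
    {"event": "alert",
     "condition": lambda e: e["severity"] == "critical",
     "action": lambda e: f"notify:{e['service']}"},
]

def process_event(event):
    """Return the actions triggered by an incoming event."""
    return [p["action"](event) for p in policies
            if p["event"] == event["type"] and p["condition"](event)]

actions = process_event({"type": "alert", "service": "web",
                         "severity": "critical"})
```

An emitted action would itself become an event for other components (e.g., the context programmer of R35), which is how the control loop closes.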

Requirement ID: R40 | Type: Design | Priority: HIGH | Target: Cyber-security operators
Title: PKI infrastructure
Description: A public key infrastructure (PKI) must be present in the system, to manage, distribute, and store digital certificates and keys and to guarantee trusted relationships between logical entities. The following functional components must be present:
• a Certification Authority (CA), for issuing and revoking certificates;
• a generator of public and private key pairs;
• certificates in a standardized format;
• a CA database with trusted CAs and their public keys;
• a Certificate Revocation List (CRL) that keeps track of revoked certificates, a timestamp saying when they were revoked, and the reason for revocation (a compromised key, or a superseded certificate);
• a CRL Distribution Point (CDP), where entities can access and download the CRL.
The CA is external to the system. The system assigns a public and private key pair to a logical entity, chooses a CA from the database of trusted CAs, and sends it a Certificate Signing Request (CSR) with the public key of the entity. The external CA generates a certificate in a standardized format (e.g., adopting the X.509 standard). The certificate validity is verified by accessing a CDP and downloading the CRL, which is cached in the system for its whole validity period.
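The CRL handling described above (download from a CDP, cache for the validity period, consult before trusting a certificate) can be sketched as follows. The data layout and the download callable are illustrative assumptions; a real implementation would parse an X.509 CRL.

```python
import time

# Sketch of R40's certificate-validity check: the CRL is fetched from a CDP,
# cached until its next-update time, and consulted for revoked serials.

class CrlCache:
    def __init__(self, fetch_crl):
        # fetch_crl: callable standing in for a CDP download; it returns
        # {"revoked": set of serials, "next_update": expiry timestamp}.
        self.fetch_crl = fetch_crl
        self.crl = None

    def is_revoked(self, serial, now=None):
        now = time.time() if now is None else now
        if self.crl is None or now >= self.crl["next_update"]:
            self.crl = self.fetch_crl()   # refresh only when the cache expires
        return serial in self.crl["revoked"]

cache = CrlCache(lambda: {"revoked": {"1f3a"},
                          "next_update": time.time() + 3600})
```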

Requirement ID: R41 | Type: Design | Priority: HIGH | Target: GUARD developers
Title: Accountability
Description: The system must keep a record of the log files for better management, accountability, and transparency.

Requirement ID: R42 | Type: Design | Priority: MEDIUM | Target: GUARD developers
Title: Data subject rights implementation
Description: The system must allow the data subjects to exercise their rights under the GDPR via the user dashboard, as described in R50. They should be able to automatically request information, access to their data, rectification, erasure, etc. The relevant data subject rights are:
• Article 12: Exercise of the Rights of the Data Subject;
• Articles 13, 14: Right to Be Informed;
• Article 15: Right to Access;
• Article 16: Right to Rectification;
• Article 17: Right to Erasure ("Right to be Forgotten");
• Article 18: Right to Restriction of Processing;
• Article 19: Notification Obligation;
• Article 20: Right to Data Portability;
• Article 21: Right to Object;
• Article 22: Right to Object to Automated Individual Decision-Making;
• Article 7(3): Right to Withdraw Consent.

Requirement ID: R43 | Type: Performance | Priority: LOW | Target: Cyber-security vendors
Title: Detection rate
Description: The detection rate quantifies the number of successfully detected attacks over the total. It is the typical performance index and an important evaluation parameter for intrusion detection systems. The formal definition of this requirement is found in [323]. The detection rate is difficult to predict with precision, because of the lack of concrete data and of full network characteristics in the general case. In the case of known attacks, this rate can be relatively high, but at the cost of having full knowledge about the infected data and IT components, malicious signals, etc., so that the anomalies can be analyzed and compared to the known attack signatures. Studies in this direction have been made in [324], where the rates can be in the range of 60-90% for offline signature-based analysis, and in [325]-[327]. For malware campaign detection using methods like SVM, k-NN, Naïve Bayes, etc., the achieved rates are around 86-98%, according to the studies conducted in [328]. Exemplary results for DDoS, black hole, opportunistic, sinkhole, and wormhole attacks can be found in [329].

Requirement ID: R44 | Type: Performance | Priority: LOW | Target: Cyber-security vendors
Title: Precision of detection
Description: High precision in the detection of attacks must be reached, to avoid false positives overwhelming the system and reducing the level of attention of security operators. This metric is strictly related to requirement R43 (Detection rate). The formal definition of this requirement is found in [323]. Given the strict relationship with R43, this metric is also difficult to estimate precisely, for the same reasons discussed in R43. Furthermore, it strongly depends on the specific scenario under analysis, as shown by comparing the results provided in [325]-[327]. Other efforts to directly quantify this metric can be found in [330], where an average precision in attack detection is estimated with different detection methods, which can be used for known attacks, even if the evaluation can also be extended to unknown attacks. The high percentages obtained in [330] for known attacks are, instead, very difficult to obtain for unknown attacks. Taking into account all the studies mentioned above, a rough estimate of the precision rate of 75-85% can be set as the target value for unknown attacks, and 85-95% for known attacks, with an error of 5%, which is acceptable from the statistical point of view.
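For clarity, the two metrics of R43 and R44 can be expressed from a confusion count: detection rate (recall) is TP / (TP + FN), while precision is TP / (TP + FP), where TP, FP, and FN are the true positives, false positives, and missed attacks.

```python
# Detection rate and precision as used in R43/R44, from confusion counts.

def detection_rate(tp, fn):
    """Fraction of actual attacks that were detected (recall)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of raised alerts that were real attacks."""
    return tp / (tp + fp)

# Example: 90 attacks detected, 10 missed, 15 false alarms.
rate = detection_rate(90, 10)   # 0.9
prec = precision(90, 15)        # 90/105, about 0.857
```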

Requirement ID: R45 | Type: Performance | Priority: LOW | Target: Cyber-security vendors
Title: Average time to detect anomalies
Description: The detection of anomalies in the network traffic is often an early sign of a network-based attack. For network traffic, real-time detection is possible for many threats (e.g., DoS, port scanning), but the detection of advanced persistent threats may require days or weeks. These assumptions are supported by the works [325]-[327], which present a series of tests on anomaly detection in IT networks. According to these studies, an accurate estimate of the average time to detect anomalies is very difficult, especially in the case of unknown attacks, and the time window can range from a few hours to a few days (even if a week is often assumed). In the GUARD context, and based on results found in the existing literature and current technologies, a time window of 8 hours should be set as the target value for DoS and signature-based attack techniques.

Requirement ID: R46 | Type: Performance | Priority: LOW | Target: Cyber-security vendors
Title: Average time to respond to attacks
Description: The detection of an attack must be followed by effective mitigation and response actions. Their effectiveness is usually measured by the timeliness of the action, in order to limit the impact of the attack. The time interval from the detection of the attack to the response action, as described in R03, should be 2 minutes; it does not include the time to send notes on response feasibility and updates, nor the time needed for the attack mitigation to take effect. Note that this value can be recalibrated on the basis of the framework capabilities.

Requirement ID: R47 | Type: Performance | Priority: MEDIUM | Target: Service providers
Title: Performance of services
Description: Many virtual services are designed to be elastic, i.e., to scale according to the actual workload and the evolving security context. This feature is highly desirable for GUARD, since the operation of local security agents might overwhelm the execution of one or more virtual functions. In such cases a low overhead is desirable; nevertheless, the resource overhead is strongly related to the activity degree of the specific service and to the specific resource under analysis. For this reason, the Service Provider should adopt mechanisms for monitoring resource usage (utilized bandwidth, CPU load, I/O performance, etc.), so as to scale the service accordingly or to push the cyber-security framework to reduce its impact.


Requirement ID: R48 | Type: Reliability | Priority: HIGH | Target: GUARD developers
Title: Safe monitoring and inspection
Description: According to R08, the framework should support the definition of monitoring and inspection tasks at run-time, by dynamically creating and loading new processing pipelines. Additional software and frameworks co-located with virtual functions must not introduce vulnerabilities into single functions or into the entire service. The following aspects shall be considered for safe monitoring and inspection:
• the number of local agents (as depicted in Figure 6) should be kept as small as possible, and their implementation as simple as possible;
• the framework must not allow the execution of arbitrary binaries and scripts, but must provide safe and controlled execution environments for running user-defined monitoring and inspection tasks.

Requirement ID: R49 | Type: Usability | Priority: LOW | Target: GUARD developers
Title: Notification of security events
Description: Notification of security events is a set of technical and procedural functions that must bring awareness to humans, while assuring the best understanding of the events by the recipients. Notification classes envisioned by the project include:
• attack detected;
• new threat identified;
• non-compliance with the security context;
• non-compliance with the access control policy;
• compromised software detected;
• change in service topology detected;
• change in service configuration detected;
• sensitive data removed;
• modified position of sensitive data;
• untrusted program detected;
• expired/superseded certificate;
• unencrypted data in transit;
• authentication attempt failed.
The amount of information and the technical depth of the notification must be related to the users receiving it. For example, a notification to the technical staff may be limited to representations of technical aspects (attack type, position, involved entities, source), while uncertainty about the position of private data should trigger a warning about a potential privacy violation to the legal staff, etc.

Requirement ID: R50 | Type: Usability | Priority: MEDIUM | Target: GUARD developers
Title: Dashboard
Description: The GUARD dashboard must provide a visual representation of the service graph and of the main security and monitoring concerns, such as threats, workload, statistics on resource usage, etc.; it must also keep track of sensitive user-related information and their privacy settings, and allow control actions on transfers of personal data. The dashboard must provide different interfaces, each one with its own layout and informative content, tailored to the different user roles. Expected capabilities are listed below:
• depict the current service topology (virtual functions and logical relationships);
• show/edit the local agent configuration for each virtual function;
• show/edit/manage available detection algorithms;
• show/edit/manage inspection programs;
• show/edit/manage security policies;
• show real-time and historical data;
• show data queries;
• show the list of past security events and actions taken;
• show private data tracking;
• show the system workload (usage of bandwidth, CPU and processors, programs in execution, ...);
• show the CTI view.
By clicking on the graph nodes, it should be possible to show the currently monitored information. Alerts from the detection algorithms should be shown prominently in the interface, capturing the attention of the operator. The combination of a general menu and a context menu provides quick access to the main functions, including configuration, management, and response actions such as managing the personal data of end users, enabling or revoking permissions for access to sensitive data, making decisions on who can process data based on the controller's trustworthiness, asking to remove personal data from specific services or domains, and restricting the data to be shared according to specific policies.

Requirement ID: R51 | Type: Usability | Priority: MEDIUM | Target: GUARD developers
Title: Personalized user dashboard
Description: The end user must have the possibility to personalize the type and amount of information visible in the dashboard. The information is displayed in text and/or graphical format, with the possible use of widgets whose number, size, and position are decided by the user. It must be possible to optimize the dashboard for the screen sizes and resolutions of different device types (desktop/laptop PCs, smartphones, tablets, etc.). Through the dashboard, the end user will be able to define policies and receive notifications on his/her personal data, identify the party responsible for data management according to the normative framework (e.g., the GDPR), and allow, deny, remove, and clear access to private and sensitive data for part of or the whole service chain. The expected elements for the end user in the dashboard include:
• security warnings;
• identification of private data location;
• definition of propagation rules for private data;
• removal of private data.


Requirement ID: R52 | Type: Usability | Priority: LOW | Target: GUARD developers
Title: Message exchange
Description: The dashboard must include the possibility to exchange messages between end users and other persons or groups (technical staff, management staff, legal staff, etc.) through different real-time and/or non-real-time tools, such as instant messaging, chats, web-based forms, etc. Messages can be associated with different priority levels, assigned by the sender according to the urgency of the request.

Requirement ID: R53 | Type: Implementation | Priority: HIGH | Target: Cyber-security operators
Title: Selection of detection algorithms
Description: Cyber-security staff is responsible for selecting the set of detection and analysis algorithms that will build situational awareness for each digital service. The decision is taken according to the security features requested by the Service Provider and the outcome of the risk assessment process (R14). The selection should not be static: the set of algorithms may be changed at run-time, according to the evolving context. For instance, some algorithms may run only from the discovery of a new vulnerability up to the mitigation of the threat (e.g., the application of a software patch).

Requirement ID: R54 | Type: Implementation | Priority: MEDIUM | Target: Cyber-security operators
Title: Creation of inspection programs
Description: The GUARD framework will be able to run user-defined inspection programs. Cyber-security operators are responsible for creating such programs, according to their specific inspection needs.

Requirement ID: R55 | Type: Implementation | Priority: MEDIUM | Target: Cyber-security operators
Title: Definition of security policies
Description: The GUARD framework must be able to carry out security-related operations in an autonomous way (see R11). Cyber-security operators are responsible for defining the rules that drive this process, according to the model and language that will be implemented in the GUARD system. The rules should specify how to behave when specific events occur. Events should cover both the security context and notifications from detection algorithms.

Requirement ID: R56 | Type: Implementation | Priority: MEDIUM | Target: Service providers
Title: Unavailability of the GUARD framework
Description: If a service provider relies on the GUARD framework for the secure operation of its digital services, it should stop the service in case the connection is lost due to network failures or attacks on the infrastructure. The waiting time before stopping the service should be set according to the impact of possible attacks on the service.

Requirement ID: R57 | Type: Implementation | Priority: LOW | Target: Cyber-security operators
Title: Trusted and secure infrastructure
Description: The GUARD framework must be installed in a trusted and secure infrastructure, so as to be properly isolated and protected from external threats. It should run in a dedicated infrastructure with a sharp security perimeter and strict access control on network connections and users. The following pre-requisites are necessary to this aim:
• the infrastructure must be owned and operated by the cyber-security operator;
• all computing, networking, and storage equipment must be installed in rooms with locked doors and restricted access;
• the infrastructure should be reserved to run the GUARD framework only;
• a firewall must be installed that only allows incoming connections from active services and authorized operators;
• no software but GUARD should run in the secure infrastructure;
• all GUARD components should be run by unprivileged users (only non-reserved ports should be used for incoming connections);
• no passwords should be stored in any configuration file;
• management operations should only be allowed from a local terminal with a strong authentication method (e.g., public key) or from external connections with two-factor authentication;
• software updates must only use trusted sources and signed packages.

Requirement ID: R58 | Type: Implementation | Priority: HIGH | Target: GUARD developers
Title: Operating system
Description: The GUARD framework will be available for Linux ≥ 2.6 on the x86_64 architecture, with no specific restrictions on the distribution.

Requirement ID: R59 | Type: Implementation | Priority: HIGH | Target: GUARD developers
Title: Concurrency and multitenancy
Description: The GUARD framework must avoid conflicts in accessing and modifying local agents. The following conditions are required:
• the configuration of each agent must be changed by one entity (algorithm, operator) at a time, to avoid race conditions and undefined behaviour;
• an entity (algorithm, operator) cannot override the previous configuration of another entity with looser options (e.g., the reporting frequency cannot be set to 30 s if it was set to 20 s by a different entity);
• non-conflicting configurations from different entities should be merged together;
• a mechanism should be implemented for removing options and parameters that are no longer necessary, to avoid wasting resources when no entity is interested in the data anymore.
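The merging rules above can be sketched as "combine non-conflicting options, and resolve conflicts by keeping the tighter value". The option names and the per-option tightening rule are illustrative assumptions.

```python
# Sketch of R59's merging rule: new options are combined with the current
# configuration; a conflicting option can only be tightened, never loosened.
# Here a shorter reporting period is "tighter", mirroring the 20 s/30 s example.

TIGHTER = {"report_period_s": min}   # per-option conflict-resolution rule

def merge(current, requested):
    merged = dict(current)
    for key, value in requested.items():
        if key in merged and key in TIGHTER:
            merged[key] = TIGHTER[key](merged[key], value)  # keep the tighter value
        else:
            merged[key] = value                             # non-conflicting: combine
    return merged

cfg = merge({"report_period_s": 20},
            {"report_period_s": 30, "capture_dns": True})
```

The request for a 30 s period is rejected in favour of the existing 20 s setting, while the new, non-conflicting option is added.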

Requirement ID: R60 | Type: Implementation | Priority: HIGH | Target: GUARD developers
Title: Secure API
Description: The implementation of the API described in R25 and the management of programmable components described in R63 need protection against misuse and attacks. The use of those interfaces must be subject to the following security mechanisms:
• identification;
• authentication;
• authorization;
• accounting;
• encryption;
• integrity.

Requirement ID: R61 | Type: Implementation | Priority: MEDIUM | Target: Cyber-security operators
Title: Time synchronization
Description: All the information exchanged among the system components and the services must be time-synchronized, to keep track of the exact temporal sequence of events. This requires, in turn, that all the clocks of the system be time-aligned.

Requirement ID: R62 | Type: Implementation | Priority: HIGH | Target: Cyber-security vendors
Title: Supported executable formats for detection algorithms
Description: The system must support the execution of:
• ELF binaries;
• Java bytecode;
• Python programs.
Java and Python environments must be present in the system, to correctly support the related executable formats.


Requirement ID: R63 | Type: Interface | Priority: HIGH | Target: GUARD developers
Title: Manage programmable components
Description: According to R11, R53, and R37, there will be multiple user-defined components (detection algorithms and monitoring, inspection, and enforcement programs) that are loaded into the system at operation time. The GUARD framework must provide an interface for the management of these components:
• list available objects (algorithms, programs, policies), including type, name, version, owner, human-readable description, vendor, certification, and deployment constraints;
• query and filter objects based on their properties;
• load new objects and remove existing objects;
• check whether an object is in use, and where.

Requirement ID: R64 | Type: Functional | Priority: LOW | Target: GUARD developers
Title: Automatic back-up of information
Description: The system must have an automatic back-up function allowing the recovery of information. Note that this back-up should be stored in an EU member state in order to comply with the data protection rules of the EU.

Requirement ID: R65 | Type: Functional | Priority: MEDIUM | Target: GUARD developers
Title: Automatic data breach notification
Description: The GUARD framework must support automatic notification to the competent data protection authorities in case of a data breach. According to Art. 33 of the GDPR, in case of a data breach the controller shall notify the supervisory authority without undue delay. The system should also ensure notification to the national competent authority on the security of network and information systems in case of any incident having a substantial impact on the provision of the GUARD services.

Requirement ID: R66 | Type: Functional | Priority: MEDIUM | Target: Cyber-security operators
Title: Availability of services
Description: The system must implement appropriate technical and organizational measures to guarantee an appropriate level of security and ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems and services, regardless of the user location.


Requirement ID: R67 | Type: Functional | Priority: HIGH | Target: GUARD developers
Title: Record keeping (including log files)
Description: As required by the GDPR, controllers and processors must maintain records of their data processing activities. The system should be tailored to keep data processing records with all the information listed in Art. 30 of the GDPR.

Requirement ID: R68 | Type: Implementation | Priority: LOW | Target: Service/Software developers
Title: Cookies compliance
Description: The system should have a cookie consent form. Consent should be given before any data processing takes place. Cookies must remain inactive until consent has been given (excluding functional cookies).

Requirement ID: R69 | Type: Interface | Priority: MEDIUM | Target: GUARD developers
Title: Standardized data formats
Description: Standardized formats must be used to export/import data to/from the system. Standardized formats are highly recommended as a unified and universally recognized way to describe threats, who/what they are directed at, their consequences, the adopted countermeasures, etc., so as to maximize the interaction with external systems. A common format for exchanging information via REST and message queue interfaces is JSON, which is also used by STIX 2.1.
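As an illustration of R69, a threat description in a STIX 2.1-like JSON shape could look as follows. The object is simplified: a real STIX 2.1 Indicator carries additional mandatory fields (e.g., `created`, `modified`, `valid_from`) and a UUID-based identifier; the values below are placeholders.

```python
import json

# Simplified, STIX 2.1-like indicator (R69). Field values are placeholders;
# a conformant object needs further mandatory properties.

indicator = {
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--00000000-0000-4000-8000-000000000000",  # placeholder UUID
    "name": "Scanning host",
    "pattern": "[ipv4-addr:value = '198.51.100.7']",
    "pattern_type": "stix",
}

payload = json.dumps(indicator)   # serialized form for a REST or queue interface
```

Because the payload is plain JSON, the same object can be pushed over the REST interface or published on a message queue without format conversion.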

Requirement ID: R70 | Type: Interface | Priority: MEDIUM | Target: GUARD developers
Title: Identity and access management interface
Description: An interface with external systems must be implemented in the framework, to manage information on user identities and to perform access control procedures through external Identity and Access Management (IAM) systems. For example, LDAP and RADIUS servers and protocols are commonly used to authenticate and authorize users in external IAM systems.


References

[1] International Data Spaces Association, “IDS Reference Architecture Model Industrial Data Space”, Version 2.0. [Online]. Available: https://www.fraunhofer.de/content/dam/zv/de/Forschungsfelder/industrial-dataspace/IDS_Referenz_Architecture.pdf
[2] ETSI, “GS NFV-MAN 001 V1.1.1: Network Functions Virtualisation (NFV); Management and Orchestration”, Technical report, 2014. Accessed May 2019. [Online]. Available: http://www.etsi.org/deliver/etsi_gs/NFV-MAN/001_099/001/01.01.01_60/gs_NFV-MAN001v010101p.pdf
[3] OASIS, “Topology and Orchestration Specification for Cloud Applications Version 1.0”, 2013.
[4] R. Rapuzzi, M. Repetto, “Building situational awareness for network threats in fog/edge computing: Emerging paradigms beyond the security perimeter model”, Future Generation Computer Systems 85 (2018) 235–249. doi:10.1016/j.future.2018.04.007.
[5] NESSI, “Security and privacy: From the perspective of software, services, cloud and data”, White Paper (March 2016) [cited January 15th, 2019]. URL: http://www.nessi-europe.com/Files/Private/NESSI_Security_Privacy_White_Paper_issue_1.pdf
[6] F. Doelitzscher, C. Reich, M. Knahl, N. Clarke, “An autonomous agent-based incident detection system for cloud environments”, in: Proceedings of the 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Athens, Greece, 2011, pp. 197–204. doi:10.1109/CloudCom.2011.35.
[7] J. Pescatore, “How DDoS detection and mitigation can fight advanced targeted attacks”, SANS Whitepaper (September 2013) [cited January 23rd, 2019]. URL: https://www.sans.org/reading-room/whitepapers/analyst/ddos-detection-mitigation-fight-advanced-targeted-attacks-35000
[8] D. Holmes, “DDoS attack trends”, F5 Whitepaper (November 2016) [cited January 23rd, 2019]. URL: https://f5.com/Portals/1/PDF/security/2016_DDoS_Attack-Trends.pdf
[9] Radware, “European application and network security report”, Whitepaper (2017) [cited January 23rd, 2019]. URL: https://www.radware.com/getattachment/6bdd2d2a-fd3d-48c7-a160-0909dc219113/Radware_ERT_Report_2016-2017.pdf.aspx?disposition=attachment
[10] Symantec, “Internet security threat report”, Whitepaper, Volume 23 (April 2018) [cited January 23rd, 2019]. URL: https://www.symantec.com/content/dam/symantec/docs/reports/istr-23-2018-en.pdf
[11] B. Shen, “What are your key definitive strategies against advanced persistent threat today? A new, modern approach - Blue Coat advanced threat protection”, in: RSA Conference 2014, San Francisco, CA, USA, 2014 [cited December 13th, 2018]. URL: https://www.rsaconference.com/writable/presentations/file_upload/spo-t11-what-are-your-key-definitive-strategies-against-advanced-persistent
[12] J. de Vries, H. Hoogstraaten, J. van den Berg, S. Daskapan, “Systems for detecting advanced persistent threats”, in: 2012 International Conference on Cyber Security, Washington, DC, USA, 2012, pp. 54–61. doi:10.1109/CyberSecurity.2012.14.
[13] P. Bhatt, E. T. Yano, P. M. Gustavsson, “Towards a framework to detect multi-stage advanced persistent threats attacks”, in: Proceedings of the 2014 IEEE 8th International Symposium on Service Oriented System Engineering (SOSE ’14), Oxford, UK, 2014, pp. 390–395. doi:10.1109/SOSE.2014.53.
[14] T. J. Jeslin, “State of the art analysis of defense techniques against advanced persistent threats” (September 2017). doi:10.2313/NET-2017-09-109. URL: https://www.net.in.tum.de/fileadmin/TUM/NET/NET-2017-09-1/NET-2017-09-1_09.pdf
[15] G. Pék, L. Buttyán, B. Bencsáth, “A survey of security issues in hardware virtualization”, ACM Computing Surveys 45 (3) (2013) 40:2–40:34. doi:10.1145/2480741.2480757.
[16] W. Holmes, “VMware NSX micro-segmentation”, VMware Press. URL: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/nsx/vmware-nsx-microsegmentation.pdf?ClickID=calex4zkzawpeze7elpwznfnl7lxl7kkknea


[17] B. Hedlund, “What is a distributed firewall?”, Post on VMware Blogs - Network Virtualization (July 2013). URL: https://blogs.vmware.com/networkvirtualization/2013/07/what-is-a-distributed-firewall.html
[18] H. Kasai, W. Kellerer, M. Kleinsteuber, “Network volume anomaly detection and identification in large-scale networks based on online time-structured traffic tensor tracking”, IEEE Transactions on Network and Service Management 13 (3) (2016) 636–650.
[19] G. Settanni, F. Skopik, Y. Shovgenya, R. Fiedler, M. Carolan, D. Conroy, K. Boettinger, M. Gall, G. Brost, C. Ponchel, M. Haustein, H. Kaufmann, K. Theuerkauf, P. Olli, “A collaborative cyber incident management system for European interconnected critical infrastructures”, Elsevier Journal of Information Security and Applications (JISA) 34(2) (2017) 166–182.
[20] X. Liu, Z. Liu, “Evaluating method of security threat based on attacking-path graph model”, in: 2008 International Conference on Computer Science and Software Engineering, Vol. 3, Hubei, China, 2008, pp. 1127–1132.
[21] P. K. Manadhata, J. M. Wing, “An attack surface metric”, IEEE Transactions on Software Engineering 37(3) (2011) 371–386.
[22] M. Mohsin, Z. Anwar, “Where to kill the cyber kill-chain: An ontology-driven framework for IoT security analytics”, in: 2016 International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, 2016, pp. 23–28.
[23] X. Lin, P. Zavarsky, R. Ruhl, D. Lindskog, “Threat modeling for CSRF attacks”, in: 2009 International Conference on Computational Science and Engineering, Vol. 3, Vancouver, BC, Canada, 2009, pp. 486–491.
[24] W. C. Moody, H. Hu, A. Apon, “Defensive maneuver cyber platform modeling with stochastic Petri nets”, in: 10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing, Miami, FL, USA, 2014, pp. 531–538.
[25] Y. Li, D. E. Quevedo, S. Dey, L. Shi, “A game-theoretic approach to fake-acknowledgment attack on cyber-physical systems”, IEEE Transactions on Signal and Information Processing over Networks 3(1) (2017) 1–11.
[26] Radware, “Global application and network security report 2017-18”, Whitepaper (2018) [cited January 23rd, 2019]. URL: http://global.radware.com/APAC_2018_ERT_Report_EN
[27] IO Visor Project – Advancing in-kernel IO virtualisation by programmable data planes with extensibility, flexibility and high performance. Web site: https://www.iovisor.org
[28] DPDK – Data Plane Development Kit. Web site: http://dpdk.org/
[29] FD.io – The Fast Data Project. Web site: https://fd.io/
[30] R. Enns, M. Bjorklund, J. Schoenwaelder, A. Bierman, “Network Configuration Protocol (NETCONF)”, RFC 6241, June 2011. [Online]. Available: https://tools.ietf.org/html/rfc6241
[31] The P4 Language Consortium. Web site: http://p4.org
[32] A. Bierman, M. Bjorklund, K. Watsen, “RESTCONF Protocol”, RFC 8040, January 2017. [Online]. Available: https://tools.ietf.org/html/rfc8040
[33] NESSI, “Strategic Research and Innovation Agenda”, Position Paper, Version 2.0, April 2013. Available at: http://www.nessi-europe.com/files/NESSI_SRIA_Final.pdf
[34] Intelligent transport systems - Communications access for land mobiles (CALM) – Architecture. ISO 21217:2014, April 2014.
[35] IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE 802.11-2016, Dec. 2016. doi:10.1109/IEEESTD.2016.7786995.
[36] ERGEG, “Conclusions paper on smart grids”, Position Paper, June 2010, Ref: E10-EQS-38-05. [Online]. Available: http://www.energy-regulators.eu/portal/page/portal/EER_HOME/EER_PUBLICATIONS/CEER_ERGEG_PAPERS/Electricity/2010/E10-EQS-38-05_SmartGrids_Conclusions_10-Jun-2010_Corrige.pdf
[37] http://dpdk.org


[38] R. McMillan, “Definition: Threat Intelligence”, Gartner, 2013.
[39] D. Chismon, M. Ruks, “Threat Intelligence: Collecting, Analyzing, Evaluating”, MWR InfoSecurity, 2015.
[40] NIST, “Guide to cyber threat information sharing”, Special Publication 800-150, Tech. rep., 2016.
[41] Structured Threat Information eXpression v2.0. https://oasis-open.github.io/cti-documentation/
[42] K. Scarfone, P. Mell, “An analysis of CVSS version 2 vulnerability scoring”, in: Proceedings of the 2009 3rd International Symposium on Empirical Software Engineering and Measurement, IEEE Computer Society, 2009, pp. 516–525.
[43] https://cve.mitre.org/
[44] https://nvd.nist.gov/
[45] F. Skopik, “Collaborative Cyber Threat Intelligence: Detecting and Responding to Advanced Cyber Attacks at the National Level”, CRC Press, 2017.
[46] H. Dalziel, E. Olson, J. Carnall, “How to Define and Build an Effective Cyber Threat Intelligence Capability”, Syngress Publishing, 2014.
[47] Z. Pokorny, “4 Ways Machine Learning Produces Actionable Threat Intelligence”, Recorded Future, 2018.
[48] S. Appala, N. Cam-Winget, D. McGrew, J. Verma, in: Workshop on Information Sharing and Collaborative Security, 2015, pp. 61–70.
[49] “What is ‘Actionable’ Threat Intelligence?”, 2018. https://equiniti.com/uk/news-and-views/eq-views/what-is-actionable-threat-intelligence/
[50] http://stix.mitre.org/
[51] http://taxii.mitre.org
[52] https://tools.ietf.org/html/rfc5070
[53] https://tools.ietf.org/html/rfc6545
[54] https://caesair.ait.ac.at/
[55] G. Settanni et al., “Correlating cyber incident information to establish situational awareness in critical infrastructures”, in: 2016 14th Annual Conference on Privacy, Security and Trust (PST), IEEE, 2016.
[56] http://www.misp-project.org/features.html
[57] https://github.com/certtools/intelmq
[58] https://github.com/crits/crits#readme
[59] https://www.ncsc.nl/english/Incident+Response/taranis.html
[60] https://www.massivealliance.com/massive-intel/
[61] Application Programming Interface, Wikipedia. Retrieved 19th July 2019. https://en.wikipedia.org/wiki/Application_programming_interface
[62] Representational State Transfer, Wikipedia. Retrieved 19th July 2019. https://en.wikipedia.org/wiki/Representational_state_transfer
[63] R. T. Fielding, “Architectural Styles and the Design of Network-based Software Architectures”, 2000. https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
[64] P. Th. Eugster, P. A. Felber, R. Guerraoui, A.-M. Kermarrec, “The many faces of publish/subscribe”, 2003. http://members.unine.ch/pascal.felber/publications/CS-03.pdf
[65] NGSIv2 Specification. https://fiware.github.io/specifications/ngsiv2/stable/
[66] FIWARE Orion Enabler Documentation: https://fiware-orion.readthedocs.io/en/master/
[67] GraphQL, Wikipedia. Retrieved 19th July 2019. https://en.wikipedia.org/wiki/GraphQL
[68] GraphQL Specifications. https://graphql.github.io/graphql-spec/
[69] W3C Recommendation, “SOAP Version 1.2 Part 1: Messaging Framework (Second Edition)”, April 2007. https://www.w3.org/TR/soap12/
[70] L. Oliva Felipe, “Design and development of a REST-based Web service platform for applications integration”, 2009. https://upcommons.upc.edu/bitstream/handle/2099.1/8553/Master%20thesis%20-%20Luis%20Oliva.pdf
[71] K. Wagh, M. Mitra, Sh. Guru, G. Singhji, “A Comparative Study of SOAP Vs REST Web Services Provisioning”, 2012. https://pdfs.semanticscholar.org/4122/ddd2306f3fcc2d53d7e8abf866d28f4994a1.pdf


[72] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, “RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1”, 1999. https://www.w3.org/Protocols/rfc2616/rfc2616.html
[73] T. Dierks, E. Rescorla, “RFC 5246 - The Transport Layer Security (TLS) Protocol Version 1.2”, August 2008. https://tools.ietf.org/html/rfc5246
[74] D. Hardt, “RFC 6749 - The OAuth 2.0 Authorization Framework”, October 2012. https://tools.ietf.org/html/rfc6749
[75] OAuth 2 Web site. https://oauth.net/2/
[76] Identity-Manager Keyrock documentation: https://fiware-idm.readthedocs.io/en/latest/
[77] OpenID: What’s OpenID: https://openid.net/what-is-openid/
[78] OpenID Authentication: https://openid.net/specs/openid-authentication-2_0.html
[79] B. Parducci, H. Lockhart, E. Rissanen, “eXtensible Access Control Markup Language (XACML) Version 3.0”, August 2010. http://docs.oasis-open.org/xacml/3.0/xacml-3.0-core-spec-cs-01-en.pdf
[80] AuthzForce enabler documentation: https://authzforce-ce-fiware.readthedocs.io/en/latest/
[81] Wilma PEP Proxy enabler documentation: https://fiware-pep-proxy.readthedocs.io/en/latest/
[82] Internet Engineering Task Force, “JSON Web Token (JWT)”, May 2015. https://tools.ietf.org/html/rfc7519
[83] https://jwt.io/introduction/
[84] International Data Spaces Association, “Reference Architecture Model 3.0”, April 2019. https://www.internationaldataspaces.org/wp-content/uploads/2019/03/IDS-Reference-Architecture-Model-3.0.pdf
[85] ETSI GS CIM 009 V1.1.1 (2019-01), Context Information Management (CIM). https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.01.01_60/gs_CIM009v010101p.pdf
[86] W3C Working Group, “Best Practices for Publishing Linked Data”, January 2014. https://www.w3.org/TR/ld-bp/
[87] International Data Spaces Association, “Reference Architecture Model 3.0”, April 2019. https://www.internationaldataspaces.org/wp-content/uploads/2019/03/IDS-Reference-Architecture-Model-3.0.pdf
[88] A. Herzog, N. Shahmehri, C. Duma, “An Ontology of Information Security”, IJISP 1 (2007) 1–23. doi:10.4018/jisp.2007100101.
[89] B. A. Mozzaquatro, C. Agostinho, D. Goncalves, J. Martins, R. Jardim-Goncalves, “An ontology-based cybersecurity framework for the Internet of Things”, 2016. https://www.mdpi.com/1424-8220/18/9/3053/pdf
[90] W3C Ontology: https://www.w3.org/standards/semanticweb/ontology
[91] W3C Working Group, “OWL 2 Web Ontology Language Document Overview (Second Edition)”, Dec. 2012. https://www.w3.org/TR/2012/REC-owl2-overview-20121211/
[92] B. Lang, J. Wang, Y. Liu, “Achieving flexible and self-contained data protection in cloud computing”, IEEE Access, vol. 5, pp. 1510–1523, 2017.
[93] L. Lamport, “Password authentication with insecure communication”, Commun. ACM, vol. 24, no. 11, pp. 770–772, 1981.
[94] E. J. Yoon, K. Y. Yoo, C. Kim, Y.-S. Hong, M. Jo, H. H. Chen, “A secure and efficient SIP authentication scheme for converged VoIP networks”, Comput. Commun., vol. 33, no. 14, pp. 1674–1681, 2010.
[95] R. Arshad, N. Ikram, “Elliptic curve cryptography based mutual authentication scheme for session initiation protocol”, Multimedia Tools Appl., vol. 66, no. 2, pp. 165–178, 2013.
[96] S. H. Islam, G. Biswas, “Design of improved password authentication and update scheme based on elliptic curve cryptography”, Math. Comput. Modelling, vol. 57, no. 11, pp. 2703–2717, 2013.
[97] P. Guo, J. Wang, X. Geng, S. K. Chang, J.-U. Kim, “A variable threshold-value authentication architecture for wireless mesh networks”, J. Internet Technol., vol. 15, no. 6, pp. 929–935, 2014.
[98] J. Shen, H. Tan, J. Wang, J. Wang, S. Lee, “A novel routing protocol providing good transmission reliability in underwater sensor networks”, J. Internet Technol., vol. 16, no. 1, pp. 171–178, 2015.
[99] M.-S. Hwang, L.-H. Li, “A new remote user authentication scheme using smart cards”, IEEE Trans. Consum. Electron., vol. 46, no. 1, pp. 28–30, Feb. 2000.


[100] L.-H. Li, L.-C. Lin, M.-S. Hwang, “A remote password authentication scheme for multiserver architecture using neural networks”, IEEE Trans. Neural Netw., vol. 12, no. 6, pp. 1498–1504, Nov. 2001.
[101] K. Papadamou, S. Zannettou, G. Bianchi, A. Caponi, A. Recupero, S. Gevers, G. Gugulea, S. Teican, B. Chifor, C. Xenakis, M. Sirivianos, “Killing the password and preserving privacy with device-centric and attribute-based authentication”, 2018.
[102] Callahad, “What is an Identity Bridge”, 2013. http://identity.mozilla.com/post/56526022621/what-is-an-identity-bridge; https://www.mozilla.org/en-US/persona/
[103] S. Ji, Z. Gui, T. Zhou, H. Yan, J. Shen, “An Efficient and Certificateless Conditional Privacy-Preserving Authentication Scheme for Wireless Body Area Networks Big Data Services”, IEEE Access, Vol. 6, pp. 69603–69611, 2018.
[104] H. Liu, Y. Wang, J. Liu, J. Yang, Y. Chen, H. Vincent Poor, “Authenticating Users Through Fine-Grained Channel Information”, IEEE Trans. on Mobile Computing, Vol. 17, n. 2, pp. 251–264, Feb. 2018.
[105] D. Wang, H. Cheng, D. He, P. Wang, “On the Challenges in Designing Identity-Based Privacy-Preserving Authentication Schemes for Mobile Devices”, IEEE Systems Journal, Vol. 12, n. 1, pp. 916–925, Mar. 2018.
[106] D. He, N. Kumar, M. K. Khan, L. Wang, J. Shen, “Efficient Privacy-Aware Authentication Scheme for Mobile Cloud Computing Services”, IEEE Systems Journal, vol. 12, no. 2, pp. 1621–1631, June 2018.
[107] A. Castiglione, F. Palmieri, C.-L. Chen, Y.-C. Chang, “A Blind Signature-Based Approach for Cross-Domain Authentication in the Cloud Environment”, Int. J. Data Warehous. Min., 12, 1, pp. 34–48, January 2016.
[108] Y. Yang, M. Hu, S. Kong et al., “Scheme on Cross-Domain Identity Authentication Based on Group Signature for Cloud Computing”, Wuhan Univ. J. Nat. Sci., 24: 134, 2019.
[109] T. D. Nguyen, M. A. Gondree, D. J. Shifflett, J. Khosalim, T. E. Levin, C. E. Irvine, “A cloud-oriented cross-domain security architecture”, MILCOM 2010 Military Communications Conference, San Jose, CA, pp. 441–447, 2010.
[110] K. W. Nafi, T. S. Kar, S. A. Hoque, M. M. A. Hashem, “A Newer User Authentication, File Encryption and Distributed Server Based Cloud Computing Security Architecture”, (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 3, No. 10, 2012.
[111] A. Celesti, F. Tusa, M. Villari, A. Puliafito, “Three-phase cross-cloud federation model: The cloud SSO authentication”, in: IEEE Second International Conference on Advances in Future Internet, pp. 94–101, July 2010.
[112] J. Carretero, G. Izquierdo-Moreno, M. Vasile-Cabezas, J. Garcia-Blas, “Federated Identity Architecture of the European eID System”, IEEE Access, Vol. 6, pp. 75302–75326, 2018.
[113] J. M. Alves, T. G. Rodrigues, D. W. Beserra, J. C. Fonseca, P. T. Endo, J. Kelner, “Multi-Factor Authentication with OpenID in Virtualized Environments”, IEEE Latin America Transactions, Vol. 15, n. 3, pp. 528–533, Mar. 2017.
[114] M. Berman, M. Brinn, “Progress and Challenges in Worldwide Federation of Future Internet and Distributed Cloud Testbeds”, 2014 International Science and Technology Conference (Modern Networking Technologies) (MoNeTeC), Moscow, Russia, 28-29 Oct. 2014.
[115] M. Wang, J. Liu, J. Chen, X. Liu, J. Mao, “PERM-GUARD: Authenticating the Validity of Flow Rules in Software Defined Networking”, 2015 IEEE 2nd International Conference on Cyber Security and Cloud Computing, New York, NY, USA, 3-5 Nov. 2015.
[116] B. Isong, T. Kgogo, F. Lugayizi, B. Kankuzi, “Trust Establishment Framework between SDN Controller and Applications”, 2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Kanazawa, Japan, 26-28 June 2017.
[117] M. A. Ferrag, L. A. Maglaras, H. Janicke, J. Jiang, L. Shu, “Authentication protocols for Internet of Things: A comprehensive survey”, Security and Communication Networks, 2017.
[118] R. Amin, N. Kumar, G. P. Biswas, R. Iqbal, V. Chang, “A light weight authentication protocol for IoT-enabled devices in distributed Cloud Computing environment”, Future Generation Computer Systems, 2016.


[119] U. Chatterjee, V. Govindan, R. Sadhukhan, D. Mukhopadhyay, R. S. Chakraborty, D. Mahata, M. M. Prabhu, “Building PUF Based Authentication and Key Exchange Protocol for IoT Without Explicit CRPs in Verifier Database”, IEEE Trans. on Dependable and Secure Computing, Vol. 16, n. 3, pp. 424–437, May/Jun. 2019.
[120] “An indispensable guide to Common Data Security Architecture”, The Open Group, 2001.
[121] D. F. Ferraiolo, D. R. Kuhn, “Role Based Access Control”, 15th National Computer Security Conference, October 1992.
[122] A. H. Karp, “Authorization Based Access Control for the Services Oriented Architecture”, Proc. 4th Int. Conf. on Creating, Connecting and Collaborating through Computing, Berkeley, CA, IEEE Press, January 2006.
[123] A. H. Karp, H. Haury, M. H. Davis, “From ABAC to ZBAC: the evolution of access control models”, Journal of Information Warfare, 9(2), 38–46, 2010.
[124] J. Park, R. Sandhu, “Towards usage control models: beyond traditional access control”, Proc. Seventh ACM Symp. Access Control Models and Technologies (SACMAT ’02), ACM Press, New York, USA, 2002.
[125] V. Hu, D. Ferraiolo, R. Kuhn, A. Schnitzer, K. Sandlin, R. Miller, K. Scarfone, “Guide to Attribute Based Access Control (ABAC) Definition and Considerations”, NIST Special Publication 800-162, Jan. 2014.
[126] A. Pimlott, O. Kiselyov, “Soutei, a Logic-Based Trust-Management System”, Springer Lecture Notes in Computer Science 3945/2006, pp. 130–145, 2006.
[127] D. Baier, V. Bertocci, K. Brown, M. Woloski, A. Pac, “A Guide to Claims-Based Identity and Access Control: Patterns & Practices”, Microsoft Press, 2010.
[128] D. Zou, Y. Lu, B. Yuan, H. Chen, H. Jin, “A Fine-Grained Multi-Tenant Permission Management Framework for SDN and NFV”, IEEE Access, Vol. 6, pp. 25562–25572, May 2018.
[129] D. Rosendo, P. Takako Endo, D. Sadok, J. Kelner, “An Autonomic and Policy-based Authorization Framework for OpenFlow Networks”, 2017 13th International Conference on Network and Service Management (CNSM), Tokyo, Japan, 26-30 Nov. 2017.
[130] Y. Shi, F. Dai, Z. Ye, “An Enhanced Security Framework of Software Defined Network Based on Attribute-based Encryption”, 2017 4th International Conference on Systems and Informatics (ICSAI 2017), Hangzhou, China, 11-13 Nov. 2017.
[131] I. Indu, P. M. Rubesh Anand, V. Bhaskar, “Identity and access management in cloud environment: Mechanisms and challenges”, Engineering Science and Technology, an International Journal, Volume 21, Issue 4, pp. 574–588, ISSN 2215-0986, 2018.
[132] Y. Zhu, D. Huang, C. J. Hu, X. Wang, “From RBAC to ABAC: Constructing Flexible Data Access Control for Cloud Storage Services”, IEEE Transactions on Services Computing, vol. 8, no. 4, pp. 601–616, Jul. 2015.
[133] S. Ullah, Z. Xuefeng, Z. Feng, “TCloud: A Dynamic Framework and Policies for Access Control across Multiple Domains in Cloud Computing”, vol. 62, no. 2, January 2013.
[134] C. Uikey, D. S. Bhilare, “TrustRBAC: Trust role-based access control model in multi-domain cloud environments”, Int. Conf. on Information, Communication, Instrumentation and Control (ICICIC), Indore, 2017.
[135] G. Lin, Y. Bie, M. Lei, “Trust Based Access Control Policy in Multi-domain of Cloud Computing”, Journal of Computers, Vol. 8, No. 5, pp. 1357–1365, 2013.
[136] Li Na, Dong Yun-Wei, Che Tian-Wei et al., “Cross-Domain Authorization Management Model for Multi-Levels Hybrid Cloud Computing”, International Journal of Security and Its Applications, Vol. 9, No. 12, pp. 357–366, 2015.
[137] Y. Wu, V. Suhendra, H. Guo, “A Gateway-based Access Control Scheme for Collaborative Clouds”, The Seventh International Conference on Internet Monitoring and Protection, Stuttgart, Germany, 2012.
[138] K. Punithasurya, S. Jeba Priya, “Analysis of Different Access Control Mechanism in Cloud”, International Journal of Applied Information Systems (IJAIS), Volume 4, No. 2, Foundation of Computer Science, September 2012.


[139] A. Birgisson, J. Gibbs Politz, Ú. Erlingsson, M. Lentczner, “Macaroons: cookies with contextual caveats for decentralized authorization in the cloud”, Proceedings of the Network and Distributed System Security Symposium (NDSS), 2014.
[140] X. Yin, X. Chen, L. Chen, G. Shao, H. Li, S. Tao, “Research of Security as a Service for VMs in IaaS Platform”, IEEE Access, Vol. 6, pp. 29158–29172, May 2018.
[141] H. M. Anitha, P. Jayarekha, “SDN Based Secure Virtual Machine Migration in Cloud Environment”, 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Bangalore, India, 19-22 Sept. 2018.
[142] S. Sciancalepore, G. Piro, D. Caldarola, G. Boggia, G. Bianchi, “On the Design of a Decentralized and Multiauthority Access Control Scheme in Federated and Cloud-Assisted Cyber-Physical Systems”, IEEE Internet of Things Journal, vol. 5, no. 6, pp. 5190–5204, Dec. 2018.
[143] D. Ramesh, R. Priya, “Multi-authority scheme-based CP-ABE with attribute revocation for cloud data storage”, in: 2016 International Conference on Microelectronics, Computing and Communications (MicroCom), pp. 1–4, Jan. 2016.
[144] C. Hennebert et al., “IoT governance, privacy and security issues”, Technical report, European Research Cluster on the Internet of Things, January 2015.
[145] A. Ouaddah, H. Mousannif, A. U. Elkalam, A. Ait Ouahman, “Access control in the Internet of Things: Big challenges and new opportunities”, Computer Networks, 112, 2016. doi:10.1016/j.comnet.2016.11.007.
[146] A. Lohachab, Karambir, “Next Generation Computing: Enabling Multilevel Centralized Access Control using UCON and CapBAC Model for securing IoT Networks”, 2018 International Conference on Communication, Computing and Internet of Things (IC3IoT), Chennai, India, 15-17 Feb. 2018.
[147] M. Al-Shaboti, I. Welch, A. Chen, M. Adeel Mahmood, “Towards Secure Smart Home IoT: Manufacturer and User Network Access Control Framework”, 2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), Krakow, Poland, 16-18 May 2018.
[148] General Data Protection Regulation (EU) 2016/679 (“GDPR”).
[149] J. Wei, W. Liu, X. Hu, “Secure and efficient attribute-based access control for multiauthority cloud storage”, IEEE Systems Journal, vol. 12, no. 2, pp. 1731–1742, Jun. 2018.
[150] R. Li, C. Shen, H. He, X. Gu, Z. Xu, C. Xu, “A lightweight secure data sharing scheme for mobile cloud computing”, IEEE Transactions on Cloud Computing, vol. 6, no. 2, pp. 344–357, Apr. 2018.
[151] K. Xue, W. Chen, W. Li, J. Hong, P. Hong, “Combining data owner-side and cloud-side access control for encrypted cloud storage”, IEEE Transactions on Information Forensics and Security, vol. 13, no. 8, pp. 2062–2074, Aug. 2018.
[152] K. Yang, Z. Liu, X. Jia, X. S. Shen, “Time-domain attribute-based access control for cloud-based video content sharing: A cryptographic approach”, IEEE Transactions on Multimedia, vol. 18, no. 5, pp. 940–950, May 2016.
[153] K. Yang, X. Jia, K. Ren, B. Zhang, “DAC-MACS: Effective data access control for multi-authority cloud storage systems”, in: Proceedings IEEE INFOCOM, pp. 2895–2903, Apr. 2013.
[154] Y. Zhang, J. Li, H. Yan, “Constant Size Ciphertext Distributed CP-ABE Scheme With Privacy Protection and Fully Hiding Access Structure”, IEEE Access, Vol. 7, pp. 47982–47990, Apr. 2019.
[155] P. Schaar, “Privacy by design”, Identity in the Information Society, vol. 3, no. 2, pp. 267–274, Aug. 2010.
[156] A. Cavoukian, “Privacy by design [leading edge]”, IEEE Technology and Society Magazine, vol. 31, no. 4, pp. 18–19, 2012.
[157] L. Y. Yeh, P. Y. Chiang, Y. L. Tsai, J. L. Huang, “Cloud-based Fine-grained Health Information Access Control Framework for Lightweight IoT Devices with Dynamic Auditing and Attribute Revocation”, IEEE Transactions on Cloud Computing, vol. PP, no. 99, pp. 1–1, Oct. 2015.
[158] J. Chen, H. Ma, “Privacy-Preserving Decentralized Access Control for Cloud Storage Systems”, in: Int. Conf. on Cloud Comput., Jun. 2014, pp. 506–513.
[159] S. Han, V. Liu, Q. Pu, S. Peter, T. Anderson, A. Krishnamurthy, D. Wetherall, “Expressive privacy control with pseudonyms”, Proceedings of the ACM SIGCOMM Conference, 2013.


[160] J. Maheswaran, D. I. Wolinsky, B. Ford, “Crypto-book: an architecture for privacy preserving online identities”, Proceedings of the Twelfth ACM Workshop on Hot Topics in Networks, ACM, 2013.
[161] Y. M. Ji, J. Tan, H. Liu, Y. P. Sun, J. B. Kang, Z. Kuang, C. Zhao, “A Privacy Protection Method Based on CP-ABE and KP-ABE for Cloud Computing”, JSW, 9(6), 1367–1375, 2014.
[162] S. Cirani, M. Picone, P. Gonizzi, L. Veltri, G. Ferrari, “IoT-OAS: An OAuth-Based Authorization Service Architecture for Secure Services in IoT Scenarios”, IEEE Sensors J., vol. 15, no. 2, pp. 1224–1234, Feb. 2015.
[163] C. Pisa, A. Caponi, T. Dargahi, G. Bianchi, N. Blefari-Melazzi, “WI-FAB: Attribute-based WLAN Access Control, Without Pre-shared Keys and Backend Infrastructures”, in: Proc. of the ACM Int. Worksh. on Hot Topics in Planet-scale Mobile Computing and Online Social Networking, 2016, pp. 31–36.
[164] S. Sciancalepore, G. Piro, D. Caldarola, G. Boggia, G. Bianchi, “OAuth-IoT: An access control framework for the Internet of Things based on open standards”, in: 2017 IEEE Symposium on Computers and Communications (ISCC), pp. 676–681, Jul. 2017.
[165] A. Tassanaviboon, G. Gong, “OAuth and ABE Based Authorization in Semi-trusted Cloud Computing: Aauth”, in: Proceedings of the Second International Workshop on Data Intensive Computing in the Clouds, 2011, pp. 41–50.
[166] L. Seitz, G. Selander, E. Wahlstroem, S. Erdtman, H. Tschofenig, “Authentication and authorization for the Internet of Things in constrained environments using the OAuth 2.0 Framework (ACE-OAuth)”, Internet draft, IETF, 2019.
[167] B. Jasiul, J. Sliwa, R. Piotrowski, R. Goniacz, M. Amanowicz, “Authentication and Authorization of Users and Services in Federated SOA Environments – Challenges and Opportunities”, Military Communications Institute, Zegrze, Poland, 2010.
[168] N. J. Edwards, J. Rouault, “Multi-domain authorization and authentication”, U.S. Patent No. 7,444,666, 28 Oct. 2008.
[169] J. Lin, W. Yu, N. Zhang, X. Yang, H. Zhang, W. Zhao, “A Survey on Internet of Things: Architecture, Enabling Technologies, Security and Privacy, and Applications”, IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1125–1142, Oct. 2017.
[170] V. E. Urias, W. M. S. Stout, C. Loverro, B. Van Leeuwen, “Get Your Head Out of the Clouds: The Illusion of Confidentiality & Privacy”, 2018 IEEE 8th International Symposium on Cloud and Service Computing (SC2), Paris, France, 18-21 Nov. 2018.
[171] A. Almohaimeed, A. Asaduzzaman, “A Novel Moving Target Defense Technique to Secure Communication Links in Software-Defined Networks”, 2019 Fifth Conference on Mobile and Secure Services (MobiSecServ), Miami Beach, FL, USA, 2-3 Mar. 2019.
[172] Q. Xu, C. Tan, Z. Fan, W. Zhu, Y. Xiao, F. Cheng, “Secure Multi-Authority Data Access Control Scheme in Cloud Storage System Based on Attribute-Based Signcryption”, IEEE Access, Vol. 6, pp. 34051–34074, Jun. 2018.
[173] Y. Li, Z. Dong, K. Sha, C. Jiang, J. Wan, Y. Wang, “TMO: Time Domain Outsourcing Attribute-Based Encryption Scheme for Data Acquisition in Edge Computing”, IEEE Access, Vol. 7, pp. 40240–40257, Mar. 2019.
[174] http://www.compose-project.eu
[175] http://www.openiot.eu
[176] ADECS: Advanced models for the design and evaluation of modern cryptographic systems. http://www.certsign.ro/certsign/resurse/documentatie/Proiecte-cercetare-dezvoltare/ADECS
[177] Secure & Privacy-Aware eGovernment Services (SPAGOS), Greek National Funding Project (2013-2015).
[178] BIO-IDENTITY, “Secure and revocable biometric identification for use in disparate intelligence environments”, Greek National Funding Project (2011-2014). http://biotaytotita.encodegroup.com/
[179] Assure UK, “Identity & attribute verification and exchange service”. http://www.gsma.com/personaldata/wp-content/uploads/2013/12/AssureUK-December-Meeting.pdf


[180] Achieving the Trust Paradigm Shift. http://www.trustindigitallife.eu/attps/trust-paradigm-shift.html
[181] Attribute-based credentials for trust (ABC4Trust), 2010. https://abc4trust.eu/index.php/home/fact-sheet, https://abc4trust.eu/
[182] https://www.astrid-project.eu/
[183] G. Bianchi et al., “From Real-world Identities to Privacy-preserving and Attribute-based CREDentials for Device-centric Access Control”, Deliverable D5.
[184] S. Sciancalepore, M. Pilc, S. Schröder, G. Bianchi, G. Boggia, M. Pawłowski, G. Piro, M. Płóciennik, H. Weisgrab, “Attribute-Based Access Control scheme in federated IoT platforms”, in: Proc. of 2nd Workshop on Interoperability and Open-Source Solutions for the Internet of Things, ser. LNCS, Stuttgart, Germany: Springer, Nov. 2016.
[185] FIDO Alliance. https://fidoalliance.org/
[186] OpenID, “The Internet Identity Layer”. http://openid.net/
[187] OAuth, “Secure authorization in a simple and standard method”. http://oauth.net
[188] D. Hardt, “The OAuth 2.0 authorization framework”, RFC 6749, IETF, October 2012.
[189] https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml
[190] J. Camenisch, E. V. Herreweghen, “Design and implementation of the idemix anonymous credential system”, in: Proc. of the 9th ACM Conference on Computer and Communications Security (CCS '02), 2002.
[191] https://www.opendaylight.org/
[192] https://www.keycloak.org/
[193] https://wso2.com/library/articles/2017/08/what-is-wso2-identity-server/
[194] https://www.tremolosecurity.com/openunison/
[195] 1Password. https://agilebits.com/onepassword
[196] Where Are You From. https://www.wayf.dk/en
[197] https://www.avatier.com/
[198] https://www.bitium.com/
[199] https://www.ca.com/us/products/identity-and-access-management.html?intcmp=headernav
[200] https://www.fischerinternational.com/
[201] https://www.forgerock.com/
[202] https://www.idaptive.com/
[203] https://www.microfocus.com/it-it/products/netiq/overview
[204] https://www.okta.com/
[205] https://www.onelogin.com/
[206] https://optimalidm.com/solutions/identity-access-management/
[207] https://www.pingidentity.com/en.html
[208] https://www.secureauth.com/
[209] https://www.simeiosolutions.com/
[210] https://www.ubisecure.com/
[211] https://www.tools4ever.com/
[212] https://www.crossmatch.com/
[213] https://www.rsa.com/en-us/products/rsa-securid-suite
[214] https://www.identityautomation.com/iam-platform/
[215] https://www.omada.net/en-us/solutions/solution-overview/omada-enterprise
[216] https://www.oneidentity.com
[217] https://www.oracle.com/middleware/identity-management/governance/resources.html
[218] https://www.radiantlogic.com/
[219] https://duo.com
[220] https://fusionauth.io/
[221] T. Kemp, “Despite privacy concerns, it's time to kill the password”, White House National Strategy for Trusted Identities in Cyberspace (NSTIC). http://www.forbes.com/sites/frontline/2014/07/18/despite-privacy-concerns-its-time-to-kill-the-password/

© GUARD 2019 Page 123 of 128

D2.1 Vision, State of the Art and Requirements Analysis V1.0

[222] https://www.microsoft.com/en-us/research/project/u-prove/
[223] https://docs.aws.amazon.com/cognito/index.html
[224] https://www.axiomatics.com/
[225] Vacca, J. R.: Managing information security. Elsevier, 2013.
[226] Pawar M., Anuradha J.: "Network Security and Types of Attacks in Network", in Proc. of the International Conference on Computer, Communication and Convergence (ICCC 2015), Procedia Computer Science, vol. 48, pp. 503-506, 2015.
[227] Ruf L., Thorn A., Christen T. (Zurich Financial Services AG), Gruber B. (Credit Suisse AG), Portmann R.: "Threat Modeling in Security Architecture - The Nature of Threats", ISSS Working Group on Security Architectures, http://www.isss.ch/fileadmin/publ/agsa/ISSS-AG-Security-Architecture_Threat-Modeling_Lukas-Ruf.pdf
[228] Geric S., Hutinski Z.: "Information system security threats classifications", Journal of Information and Organizational Sciences, pp. 31-51, 2007.
[229] Alhabeeb M., Almuhaideb A., Le P., Srinivasan B.: "Information Security Threats Classification Pyramid", in Proc. of the 24th IEEE International Conference on Advanced Information Networking and Applications, pp. 208-213, 2010.
[230] Hutchins E. M., Cloppert M. J., and Amin R. M.: "Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains", Leading Issues in Information Warfare & Security Research, vol. 1(1), p. 80, 2011.
[231] Meier J., Mackman A., Vasireddy S., Dunner M., Escamilla R., Murukan A.: "Improving Web Application Security: Threats and Countermeasures", Satyam Computer Services, Microsoft Corporation, 2003.
[232] ISO: Information Processing Systems - Open Systems Interconnection - Basic Reference Model, Part 2: Security Architecture, ISO 7498-2, 1989.
[233] Liao H.-J., Lin C.-H. R., Lin Y.-C., Tung K.-Y.: "Intrusion detection system: A comprehensive review", Journal of Network and Computer Applications, vol. 36(1), pp. 19-24, 2013.
[234] Uddin M., Rehman A. A., Uddin N., Memon J., Alsaqour R., Kazi S.: "Signature-based multi-layer distributed intrusion detection system using mobile agents", International Journal of Network Security, vol. 15, pp. 97-105, 2016.
[235] Szynkiewicz P., Kozakiewicz A.: "Design and Evaluation of a System for Network Threat Signatures Generation", Journal of Computational Science, vol. 22, pp. 187-197, 2017.
[236] Garcia-Teodoro P., Diaz-Verdejo J., Macia-Fernandez G., Vazquez E.: "Anomaly-based network intrusion detection: Techniques, systems and challenges", Computers & Security, vol. 28(1), pp. 18-28, 2009.
[237] Chandola V., Banerjee A., Kumar V.: "Anomaly detection: A survey", ACM Computing Surveys (CSUR), vol. 41(3), p. 15, 2009.
[238] Skopik F.: "Collaborative Cyber Threat Intelligence: Detecting and Responding to Advanced Cyber Attacks at the National Level", 416 p., 1st edition, ISBN-10: 1138031828, ISBN-13: 978-1138031821, Taylor & Francis, CRC Press, 2017.
[239] Wurzenberger M., et al.: "AECID: A Self-learning Anomaly Detection Approach based on Light-weight Log Parser Models", ICISSP, 2018.
[240] Cannady J.: "Artificial neural networks for misuse detection", in Proc. of the National Information Systems Security Conference, pp. 368-381, 1998.
[241] Heckerman D., et al.: "A tutorial on learning with Bayesian networks", NATO ASI Series D: Behavioural and Social Sciences, vol. 89, pp. 301-354, 1998.
[242] Safavian S. R., Landgrebe D.: "A survey of decision tree classifier methodology", IEEE Transactions on Systems, Man, and Cybernetics, vol. 21(3), pp. 660-674, 1991.
[243] Baum L. E., Eagon J. A.: "An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology", Bulletin of the American Mathematical Society, vol. 73(3), pp. 360-363, 1967.
[244] Kalkan K., Gür G., Alagöz F.: "Filtering-based defense mechanisms against DDoS attacks: A survey", IEEE Systems Journal, vol. 11(4), pp. 2761-2773, Dec. 2017.

[245] Schaeffer S. E.: "Graph clustering", Computer Science Review, pp. 27-64, 2010.
[246] Steinwart I., Christmann A.: "Support Vector Machines", Springer Science & Business Media, 2009.
[247] Goldstein M., Uchida S.: "A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data", PLoS ONE, vol. 11(4), 2016.
[248] Kolodziej J., Xhafa F.: "Supporting situated computing with intelligent multi-agent systems", IJSSC, vol. 1(1), pp. 30-42, 2011.
[249] Kruczkowski M., Niewiadomska-Szynkiewicz E., Kozakiewicz A.: "Cross-Layer Analysis of Malware Datasets for Malicious Campaigns Identification", in Proc. of the International Conference on Military Communications and Information Systems (ICMCIS), 2015.
[250] Wurzenberger M., Skopik F., Settanni G.: "Big Data for Cyber Security", in Encyclopedia of Big Data Technologies, Sakr, Sherif and Zomaya, Albert (Eds.), Springer International Publishing, 2019.
[251] Moallem A.: Human-Computer Interaction and Cybersecurity Handbook. CRC Press, October 24, 2018.
[252] nDPI. Open and Extensible LGPLv3 Deep Packet Inspection Library. https://www.ntop.org/products/deep-packet-inspection/ndpi/
[253] Oppliger R.: "Internet security: firewalls and beyond", Communications of the ACM, vol. 40(5), pp. 92-102, May 1997. https://dl.acm.org/citation.cfm?id=253802&dl=ACM&coll=DL
[254] Open vSwitch. https://www.openvswitch.org/
[255] AlienVault Open Source Security Information Management (OSSIM). https://www.alienvault.com/products/ossim
[256] Hermanowski D.: "Open Source Security Information Management System Supporting IT Security Audit", C4I Systems Department, Military Communication Institute, IEEE, 2015. https://www.wil.waw.pl/art_prac/2015/pub_cybconfCybersec15_DH-OSSIM-ieee_REVIEW_RC05_ver_PID3720933.pdf
[257] Kozakiewicz A.: "Świadomość sytuacyjna cyberzagrożeń" ("Situational awareness of cyber threats"), Przegląd Telekomunikacyjny, August 2018.
[258] Sundararajan A., Khan T., Aburub H., Sarwat A. I., Rahman S.: "A tri-modular human-on-the-loop framework for intelligent smart grid cyber-attack visualization", in SoutheastCon 2018, pp. 1-8, IEEE, April 2018.
[259] Elfar M., Zhu H., Raghunathan A., Tay Y. Y., Wubbenhorst J., Cummings M. L., Pajic M.: "WiP Abstract: Platform for Security-Aware Design of Human-on-the-Loop Cyber-Physical Systems", ACM/IEEE 8th International Conference on Cyber-Physical Systems (ICCPS), pp. 93-94, 2017.
[260] Adams Ch. N., Snider D. H.: "Effective Data Visualization in Cybersecurity", IEEE, 2018.
[261] McKenna S., Staheli D., Meyer M.: "Unlocking user-centered design methods for building cyber security visualizations", in 2015 IEEE Symposium on Visualization for Cyber Security (VizSec), pp. 1-8, October 2015.
[262] Arendt D. L., Burtner R., Best D. M., Bos N. D., Gersh J. R., Piatko C. D., Paul C. L.: "Ocelot: user-centered design of a decision support visualization for network quarantine", 2015 IEEE Symposium on Visualization for Cyber Security (VizSec), 2015.
[263] Schufrin M., Ulmer A., Sessler D., Kohlhammer J.: "Towards Bridging the Gap Between Visual Cybersecurity Analytics and Non-Experts by Means of User Experience Design" (poster).
[264] Minami M., Suzaki K., Okumura T.: "Security considered harmful: A case study of the tradeoff between security and usability", pp. 523-524, 2011, doi: 10.1109/CCNC.2011.5766529.
[265] Evans M., Maglaras L. A., He Y., Janicke H.: "Human behavior as an aspect of cybersecurity assurance", CoRR, vol. abs/1601.03921, 2016. [Online] Available: http://arxiv.org/abs/1601.03921
[266] Nwokedi U. O., Onyimbo B. A., Rad B. B.: "Usability and Security in User Interface Design: A Systematic Literature Review", I.J. Information Technology and Computer Science, no. 5, pp. 72-80, May 2016.
[267] Legg P. A.: "Enhancing Cyber Situation Awareness for Non-Expert Users using Visual Analytics", in Cyber Situational Awareness, Data Analytics and Assessments (Cyber SA), pp. 1-8, IEEE, 2016.
[268] Brown K. (2018, October). Blog post on a redesign of Email Protection. https://www.proofpoint.com/us/corporate-blog/post/designing-people-proofpoints-ux-team-reveals-process-behind-product-refresh

[269] Malek Z. & Trivedi B. (2017). GUI-Based User Behaviour Intrusion Detection, IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI) 2017.
[270] Anderson J. P. (1980). Computer Security Threat Monitoring and Surveillance.
[271] Denning D. E. (1987). An Intrusion-Detection Model, IEEE Transactions on Software Engineering, vol. 13, no. 2, pp. 222-232.
[272] Yampolskiy R. V. & Govindaraju V. (2008). Behavioural biometrics: a survey and classification, Int. J. Biometrics, vol. 1, no. 1, pp. 81-113.
[273] Hazard C. J., Singh M. P. (2016). Privacy Risks in Intelligent User Interfaces, IEEE Internet Computing, vol. 20, no. 6, pp. 57-61.
[274] Stavrou E. (2017). A situation-aware user interface to assess users' ability to construct strong passwords, International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA).
[275] Yang J. & Jeong J. (2018). An Automata-based Security Policy Translation for Network Security Functions, 2018 International Conference on Information and Communication Technology Convergence (ICTC).
[276] Li Q., Feng X., Li Z., Wang H. & Sun L. (2016). GUIDE: Graphical User Interface Fingerprints Physical Devices, IEEE 24th International Conference on Network Protocols (ICNP), poster paper.
[277] Kothari, V., Blythe, J., Smith, S., & Koppel, R. (2014, May). Agent-based modeling of user circumvention of security. In Proceedings of the 1st International Workshop on Agents and CyberSecurity (p. 5). ACM.
[278] Alshboul, Y., & Streff, K. (2017, December). Beyond Cybersecurity Awareness: Antecedents and Satisfaction. In Proceedings of the 2017 International Conference on Software and e-Business (pp. 85-91). ACM.
[279] Frik, A., Malkin, N., Harbach, M., Peer, E., & Egelman, S. (2019, April). A Promise Is A Promise: The Effect of Commitment Devices on Computer Security Intentions. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (p. 604). ACM.
[280] Davison, W. P. (1983). The Third-Person Effect in Communication. The Public Opinion Quarterly, 47(1), 1-15.
[281] Noureddine, M., Keefe, K., Sanders, W. H., & Bashir, M. N. (2015, April). Quantitative security metrics with human in the loop. In Symposium and Bootcamp on the Science of Security, HotSoS 2015 (p. 2746215). ACM.
[282] Heartfield, R., & Loukas, G. (2018). Detecting semantic social engineering attacks with the weakest link: Implementation and empirical evaluation of a human-as-a-security-sensor framework. Computers & Security, 76, 101-127.
[283] Fujs, D., Vrhovec, S., & Mihelič, A. (2018, November). What drives the motivation to self-protect on social networks? The role of privacy concerns and perceived threats. In Proceedings of the Central European Cybersecurity Conference 2018 (p. 11). ACM.
[284] Ndibwile, J. D., Luhanga, E. T., Fall, D., Miyamoto, D., & Kadobayashi, Y. (2018, December). A comparative study of smartphone-user security perception and preference towards redesigned security notifications. In Proceedings of the Second African Conference for Human Computer Interaction: Thriving Communities (p. 17). ACM.
[285] Ndibwile, J. D., Luhanga, E. T., Fall, D., Miyamoto, D., & Kadobayashi, Y. (2018, September). Smart4Gap: Factors that Influence Smartphone Security Decisions in Developing and Developed Countries. In Proceedings of the 2018 10th International Conference on Information Management and Engineering (pp. 5-15). ACM.
[286] Baluta, T., Ramapantulu, L., Teo, Y. M., & Chang, E. C. (2017, December). Modeling the effects of insider threats on cybersecurity of complex systems. In 2017 Winter Simulation Conference (WSC) (pp. 4360-4371). IEEE.
[287] Reddy, D. (2017, June). Exploring Factors Influencing Self-Efficacy in Information Security: An Empirical Analysis by Integrating Multiple Theoretical Perspectives in the Context of Using Protective Information Technologies. In Proceedings of the 2017 ACM SIGMIS Conference on Computers and People Research (pp. 207-208). ACM.

[288] Chen, T., Hammer, J., & Dabbish, L. (2019, April). Self-Efficacy-Based Game Design to Encourage Security Behavior Online. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (p. LBW1610). ACM.
[289] Wen, Z. A., Lin, Z., Chen, R., & Andersen, E. (2019, April). What.Hack: Engaging Anti-Phishing Training Through a Role-playing Phishing Simulation Game. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (p. 108). ACM.
[290] Herr, C., & Allen, D. (2015, June). Video games as a training tool to prepare the next generation of cyber warriors. In Proceedings of the 2015 ACM SIGMIS Conference on Computers and People Research (pp. 23-29). ACM.
[291] Awojana, T., Chou, T. S., & Hempenius, N. (2018, September). Review of the Existing Game Based Learning System in Cybersecurity. In Proceedings of the 19th Annual SIG Conference on Information Technology Education (pp. 144-144). International World Wide Web Conferences Steering Committee.
[292] Jin, G., Tu, M., Kim, T. H., Heffron, J., & White, J. (2018, February). Game based cybersecurity training for high school students. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (pp. 68-73). ACM.
[293] Dixon, M., Gamagedara Arachchilage, N. A., & Nicholson, J. (2019, April). Engaging Users with Educational Games: The Case of Phishing. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (p. LBW0265). ACM.
[294] Shillair, R. (2016, September). Talking about online safety: A qualitative study exploring the cybersecurity learning process of online labor market workers. In Proceedings of the 34th ACM International Conference on the Design of Communication (p. 21). ACM.
[295] Chattopadhyay, A., Christian, D., Ulman, A., & Petty, S. (2018, September). Towards A Novel Visual Privacy Themed Educational Tool for Cybersecurity Awareness and K-12 Outreach. In Proceedings of the 19th Annual SIG Conference on Information Technology Education (pp. 159-159). International World Wide Web Conferences Steering Committee.
[296] Weiss, R., Mache, J., Locasto, M. E., & Nestler, V. (2014, March). Hands-on cybersecurity exercises in the EDURange framework. In Proceedings of the 45th ACM Technical Symposium on Computer Science Education (pp. 746-746). ACM.
[297] Das, S., Lo, J., Dabbish, L., & Hong, J. I. (2018, April). Breaking! A Typology of Security and Privacy News and How It's Shared. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (p. 1). ACM.
[298] Mentis, H. M., Madjaroff, G., & Massey, A. K. (2019, April). Upside and Downside Risk in Online Security for Older Adults with Mild Cognitive Impairment. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (p. 343). ACM.
[299] Kankane, S., DiRusso, C., & Buckley, C. (2018, April). Can we nudge users toward better password management? An initial study. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems (p. LBW593). ACM.
[300] Nicholson, J., Coventry, L., & Briggs, P. (2019, April). If It's Important It Will Be A Headline: Cybersecurity Information Seeking in Older Adults. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (p. 349). ACM.
[301] Maoneke, P. B., Shava, F. B., Gamundani, A. M., Bere-Chitauro, M., & Nhamu, I. (2018, December). ICTs use and cyberspace risks faced by adolescents in Namibia. In Proceedings of the Second African Conference for Human Computer Interaction: Thriving Communities (p. 11). ACM.
[302] Abawajy, J. (2014, March). User preference of cyber security awareness delivery methods. Behaviour & Information Technology, 33(3), 237-248.
[303] Khan, B., Alghathbar, K. S., Nabi, S. I., & Khan, M. K. (2011). Effectiveness of information security awareness methods based on psychological theories. African Journal of Business Management, 5(26), 10862-10868.
[304] Bada, M., Sasse, A. M., & Nurse, J. R. (2019, January). Cyber security awareness campaigns: Why do they fail to change behaviour? arXiv preprint arXiv:1901.02672.

[305] Coventry, D. L., Briggs, P., Blythe, J., Tran, M. (2014). Using behavioural insights to improve the public's use of cyber security best practices. Government Office for Science, London, UK. https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/309652/14-835-cybersecurity-behavioural-insights.pdf
[306] NTT Com Security Inc. (2014). Risk: Value Research Report. https://www.nttcomsecurity.com/us/landingpages/risk-value-research/
[307] Gordieiev, O., Kharchenko, V. S., & Vereshchak, K. (2017, May). Usable Security Versus Secure Usability: An Assessment of Attributes Interaction. In ICTERI (pp. 727-740).
[308] ISO/IEC 25010 (2011). Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models, ISO/IEC JTC1/SC7/WG6.
[309] Fogg, B. J. (2002). Persuasive Technology: Using Computers to Change What We Think and Do. Morgan Kaufmann.
[310] Wallston, K. A. (2001). Control beliefs. In N. J. Smelser & P. B. Baltes (Eds.), International encyclopaedia of the social and behavioral sciences. Oxford, UK: Elsevier Science.
[311] Dennis, A. R., & Minas, R. K. (2018). Security on Autopilot: Why Current Security Theories Hijack Our Thinking and Lead Us Astray. ACM SIGMIS Database: The DATABASE for Advances in Information Systems, 49(SI), 15-38.
[312] Grady, R. B. (1992). Practical Software Metrics for Project Management and Process Improvement. Prentice-Hall, Inc., Upper Saddle River.
[313] https://www.elastic.co/guide/en/logstash/current/index.html
[314] https://www.json.org
[315] https://www.w3.org/TR/xml/
[316] https://kubernetes.io/docs/reference/
[317] https://jaas.ai/docs/reference
[318] https://openbaton.github.io/cases.html
[319] http://cloudiator.org/docs/cloud.html
[320] https://osm.etsi.org/
[321] https://docs.openstack.org/heat/latest/template_guide/hot_spec.html
[322] https://nodered.org/docs/
[323] Y. Yang, K. Zheng, C. Wu, Y. Yang: "Improving the Classification Effectiveness of Intrusion Detection by Using Improved Conditional Variational AutoEncoder and Deep Neural Network", Sensors (Basel), vol. 19(11), 2019.
[324] P. Szynkiewicz, A. Kozakiewicz: "Design and Evaluation of a System for Network Threat Signatures Generation", Journal of Computational Science, vol. 22, pp. 187-197, 2017.
[325] T. Andrysiak, L. Saganowski: "Incoherent Dictionary Learning for Sparse Representation in Network Anomaly Detection", Schedae Informaticae, vol. 24, pp. 63-71, Publisher of the Jagiellonian University, 2015.
[326] T. Andrysiak, L. Saganowski, M. Choraś, R. Kozik: "Network Traffic Prediction and Anomaly Detection Based on ARFIMA Model", in Proc. of the 8th International Conference Computational Intelligence in Security for Information Systems (CISIS'14), Bilbao, Spain, June 25-27, 2014.
[327] T. Andrysiak, L. Saganowski, M. Choraś: "Greedy Algorithms for Network Anomaly Detection", in Proc. of the 5th International Conference Computational Intelligence in Security for Information Systems (CISIS'12), Ostrava, Czech Republic, September 05-07, 2012.
[328] M. Kruczkowski, E. Niewiadomska-Szynkiewicz: "Comparative Study of Supervised Learning Methods for Malware Analysis", Journal of Telecommunication and Information Technology, vol. 4, pp. 24-33, 2014.
[329] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6539759/
[330] F. Cakir, K. He, X. Xia, B. Kulis, S. Sclaroff: "Deep Metric Learning to Rank", in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1861-1870, 2019.
