Enabling Secure Third-Party Security Analytics Using Serverless Computing
Total Page:16
File Type:pdf, Size:1020Kb
Session 6: Security in Emerging System and Networks SACMAT ’21, June 16–18, 2021, Virtual Event, Spain SCIFFS: Enabling Secure Third-Party Security Analytics using Serverless Computing Isaac Polinsky Pubali Datta North Carolina State University University of Illinois at Urbana-Champaign Raleigh, NC, USA Champaign, IL, USA [email protected] [email protected] Adam Bates William Enck University of Illinois at Urbana-Champaign North Carolina State University Champaign, IL, USA Raleigh, NC, USA [email protected] [email protected] Abstract Keywords Third-party security analytics allow companies to outsource threat Serverless Computing; Security Analytics; Decentralized Informa- monitoring tasks to teams of experts and avoid the costs of in- tion Flow Control house security operations centers. By analyzing telemetry data ACM Reference Format: from many clients these services are able to ofer enhanced insights, Isaac Polinsky, Pubali Datta, Adam Bates, and William Enck. 2021. SCIFFS: identifying global trends and spotting threats before they reach Enabling Secure Third-Party Security Analytics using Serverless Computing. most customers. Unfortunately, the aggregation that drives these In Proceedings of the 26th ACM Symposium on Access Control Models and insights simultaneously risks exposing sensitive client data if it is Technologies (SACMAT ’21), June 16–18, 2021, Virtual Event, Spain. ACM, not properly sanitized and tracked. New York, NY, USA, 12 pages. https://doi.org/10.1145/3450569.3463567 In this work, we present SCIFFS, an automated information fow monitoring framework for preventing sensitive data exposure in 1 Introduction third-party security analytics platforms. SCIFFS performs decen- A key component of maintaining a good security posture is contin- tralized information fow control over customer data in a serverless uously monitoring your environment for security-related events. setting, leveraging the innate polyinstantiated nature of serverless To detect threats in security event logs, it is necessary to frst have functions to assure precise and lightweight tracking of data fows. threat intelligence (e.g., signatures) that can be used to generate Evaluating SCIFFS against a proof-of-concept security analytics alerts. Due to the high cost of in-house security operation cen- framework on the widely-used OpenFaaS platform, we demonstrate ters and the complexity of security monitoring, companies are that our solution supports common analyst workfows (data inges- out-sourcing monitoring and threat intelligence gathering to third- tion, custom dashboards, threat hunting) while imposing just 3.87% party security analytics services. Not only do these service providers runtime overhead on event ingestion and the overhead on aggre- make security monitoring accessible to companies of all sizes, but gation queries grows linearly with the number of records in the more importantly, these providers have greater insights to global database (e.g., 18.75% for 50,000 records and 104.27% for 500,000 security trends than an in-house security operations center. For records) as compared to an insecure baseline. Thus, SCIFFS not only example, by aggregating data over all of their customers, a third- establishes a privacy-respecting model for third-party security ana- party security analytics provider can identify more threats and lytics, but also highlights the opportunities for security-sensitive proactively apply responses to new threats found in one customer’s applications in the serverless computing model. environment to their entire customer base. Unfortunately, the benefts of third-party services come with risk. Security event logs are sensitive; if data is not properly sani- CCS Concepts tized and tracked, performing aggregation over data from multiple • Security and privacy ! Information fow control; Access sources can expose sensitive customer data. For example, in 2018 control; Distributed systems security. Facebook accidentally sent developer analytics to app testers [9]. In general, accidental data leaks are a growing concern of IT security professionals [1]. To illustrate this risk, consider a report created by an analyst who inadvertently includes sensitive data from a cus- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed tomer. If this report is shared with a competitor or the public, the for proft or commercial advantage and that copies bear this notice and the full citation analyst has unintentionally exposed the customer’s data, potentially on the frst page. Copyrights for components of this work owned by others than the violating the customer’s privacy or Service Level Agreement (SLA). author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specifc permission This risk is exacerbated when data is shared with analysts who are and/or a fee. Request permissions from [email protected]. not familiar with a company’s internal procedures, an important SACMAT ’21, June 16–18, 2021, Virtual Event, Spain issue as there is a push for greater collaboration between security © 2021 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 978-1-4503-8365-3/21/06...$15.00 companies in the wake of the SolarWinds hack [16]. To mitigate https://doi.org/10.1145/3450569.3463567 this risk, analysts must track data from each source client to all 175 Session 6: Security in Emerging System and Networks SACMAT ’21, June 16–18, 2021, Virtual Event, Spain derived reports (i.e., sinks). Manually tracking data is not only dif- Data Ingest cult and error prone, but also slows analyst workfows. Therefore, New Events and Store Events Enrichment we need automated mechanisms to (1) track data and derived data Process Customers and (2) prevent sharing of data without explicitly acknowledging Access A Analyze Access Customer A All the sources of the data and how the data was handled. Customer Events We envision a security analytics platform that allows third- Analyst Interface Home parties to automatically track security-related event telemetry and sci!s.securityanalytics.com/johndoe?catagory=security# Legend Data Database View Widgets Query Data Summary derived results in order to prevent accidental data exposure. While Analyze Process Workflow paths: All Find Events WF1: Data Ingestion WF2: Customer Analysis decentralized information fow control (DIFC) [25, 32, 38, 54] is well- Analysts With Specific Features WF3: Global Trend Analysis suited to perform this tracking and access control, incorporating WF4: Hunting New Threats DIFC into real systems often results in , high overhead, label creep Store Exploratory Derived and signifcant restrictions on functionality. Consider a canonical Analysis Results example DIFC logic where data sensitivities are represented as tags Store Results and data and processes are assigned labels, which are sets of tags. Derived Results Over its lifetime, a process will read information from fles and inter- act with other processes that have read information from other fles. Figure 1: Common Security Analytics Workfows. During this execution, the DIFC system will propagate tags from one label to another to indicate the fow of information. However, this tracking is often imprecise (e.g., conducted at the OS process layer), which causes a phenomenon commonly referred to as label Function-as-a-Service platform. We demonstrate how to ef- creep or label explosion. Label creep is particularly problematic in fciently and securely incorporate DIFC tracking within a server environments where long-running processes receive data common serverless computing runtime without dependen- from many clients with diferent labels. In such scenarios, the result- cies on specifc application languages. ing over-approximated label is often meaningless and unnecessarily • We demonstrate the practicality of SCIFFS. We implement a restricts functionality. Applied naïvely, DIFC may artifcially limit proof-of-concept security analytics serverless application the utility of third-party security analytics. and measure the performance overhead imposed by SCIFFS In this work, we propose SCIFFS: Securing Client Information compared to an unsecured baseline. Our evaluation shows a Flows using Function-as-a-Service. SCIFFS is an automated de- 3.87% overhead on ingesting events and a linear overhead on centralized information fow control framework for preventing analyzing data starting at 18.75% for a database with 50,000 accidental sensitive data exposure in third-party security analytics records to 104.24% for 500,000 records. We further show that platforms. Our key insight for overcoming label creep is the use of SCIFFS supports four common analytics workfows. serverless computing, an emerging cloud computing runtime where functionality is separated into small functions that are reentrant The remainder of this paper proceeds as follows. Section 2 pro- and ephemeral. This polyinstantiated nature of serverless comput- vides background and motivation. Section 3 overviews SCIFFS. ing signifcantly reduces the potential for label creep. However, Sections 4 and 5 describe the label model and system design. Sec- simply tracking DIFC labels within the serverless runtime is not in- tion 6 describes a proof-of-concept security analytics application. and-of-itself sufcient for a third-party security analytics platform.