SENSOR MONITOR FOR ANDROID

A Project

Presented to the faculty of the Department of Computer Science

California State University, Sacramento

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

Computer Science

by

Aniruddha Shekhar Rajguru

SPRING 2019

© 2019

Aniruddha Shekhar Rajguru

ALL RIGHTS RESERVED


SENSOR MONITOR FOR ANDROID

A Project

by

Aniruddha Shekhar Rajguru

Approved by:

______, Committee Chair Ahmed Salem, Ph.D.

______, Second Reader Yuan Cheng, Ph.D.

______Date


Student: Aniruddha Shekhar Rajguru

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the project.

______, Graduate Coordinator Jinsong Ouyang, Ph.D.

______Date

Department of Computer Science


Abstract

of

SENSOR MONITOR FOR ANDROID

by

Aniruddha Shekhar Rajguru

Over the years, data privacy has been a major concern amongst consumers. Applications such as Facebook, Uber, and Instagram collect a huge amount of data from users in return for free service. Some of this data collection is necessary for the service to work. However, the data being collected is often not essential for functionality and is instead used for targeted advertising or user analytics. As the data collection takes place in the background, most consumers are unaware of it. Consumers also lack the technical expertise to identify such data collection. Not just third-party applications, but even the Android operating system itself sometimes violates users’ privacy. There are various ways of collecting user data, one of which is using device sensors, such as microphones, cameras, GPS, Wi-Fi, and accelerometers, to precisely monitor users’ activity.

The goal of this project is to create a sensor monitor that allows users to view and accurately capture what happens to their data on a day-to-day basis. The sensor monitor also shows users which applications are accessing which sensors and at what time.


To achieve this functionality, the sensor monitor targets three parts of the Android stack: the Linux kernel’s PROC file system, Android’s SensorManager utility, and the sensor.h header file. Combining these metrics along with a flag status allows the sensor monitor to form historical insights and send real-time alerts.

The sensor monitor is designed to be modular for better maintainability and extensibility. All sensor monitor insights are stored in JSON and can be easily exported for further analysis. Thus, the sensor monitor will benefit a regular user as well as form a base for future projects in the Android domain.

______, Committee Chair Dr. Ahmed Salem

______Date


ACKNOWLEDGEMENTS

I would like to thank Dr. Ahmed Salem for his guidance throughout this project, which drew on his immense experience in software engineering.

I would also like to thank Dr. Yuan Cheng for bringing his expertise to the review of my project report and for providing valuable suggestions and feedback.

Finally, I am very grateful to the Department of Computer Science, CSU Sacramento for providing me with access to the necessary resources and tools for completing this project successfully.


TABLE OF CONTENTS

Acknowledgments
List of Tables
List of Figures

Chapters

1. INTRODUCTION
    1.1 Background
    1.2 Problem
    1.3 Purpose and Objective
    1.4 Current Tools and Research
2. PROPOSAL AND GOAL
    2.1 Proposed Solution
    2.2 Use Cases
    2.3 Project Environment and Scope
    2.4 Challenges
3. PROJECT DESIGN
    3.1 Design Overview
    3.2 Flagging Procedure
    3.3 Design Details
    3.4 Project Components
    3.5 Module Synchronization
    3.6 Technology Stack
4. IMPLEMENTATION
    4.1 Requirements and Installation Process
        4.1.1 Requirements
        4.1.2 Steps
    4.2 Source Code
    4.3 Screenshots
5. SIMULATION
    5.1 Test Bench
    5.2 Test Scenario - 1 (Region: India)
    5.3 Test Scenario - 2 (Region: France)
6. ANALYSIS
    6.1 Results from Test Scenario - 1
    6.2 Results from Test Scenario - 2
7. PERFORMANCE
    7.1 Project Performance
    7.2 Stability
    7.3 Project Compatibility
    7.4 Limitations
8. FURTHER STUDY AND FUTURE WORK
9. CONCLUSION

References


LIST OF TABLES

1. Performance Overhead with FBE Enabled
2. Sensor Insights
3. Test Device
4. Simulation Result - 1 (3 hours, Android Version 6)
5. Simulation Result - 2 (5 hours, Android Version 6)
6. Simulation Result - 3 (1 hour, Android Version 7)
7. Simulation Result - 4 (2 hours, Android Version 6)
8. Simulation Result - 5 (5 hours, Android Version 7)
9. Simulation Result - 6 (5 hours, Android Version 7)


LIST OF FIGURES

1. Operating System Market Share
2. Android Version Distribution
3. Project Structure
4. Flagging Procedure
5. Storing Insights
6. Encryption
7. Home Screen (Pane - 2)
8. Exporting Insights
9. Sensor List (Pane - 1)
10. Historical Insights (Pane - 3)
11. Simulation Analysis
12. Project Performance
13. Compatibility Checker


CHAPTER 1

INTRODUCTION

1.1 Background

Hardware improvements in smartphones and the extensive availability of the Internet have led to innovative products like Google Maps, Facebook, Uber, and Instagram. These applications use a different business model. Instead of charging an upfront fee, they display advertisements for monetization. Unlike advertisements on TV, advertisements in the technology industry are heavily tailored to each user, thus requiring a deep understanding of the user’s personal and private information. Things like the user’s location, browsing history, applications they use, and the messages/calls they make are used to serve advertisements.

Various data collection techniques are used to understand users better. Click-Stream analysis is one such technique. As users move from one website to another, Click-Stream analysis tracks the behavior in the background. The algorithm forms a precise sequence and generates user profiles, mostly for advertising. Click-Stream analysis might seem harmless and straightforward, but it’s just the tip of the iceberg when it comes to data collection and online advertising.


1.2 Problem

Apart from Click-Stream analysis, data collection algorithms also target smartphone sensors. Smartphone sensors are much more critical, since they collect sensitive data and can be misused by applications easily. Data collection through sensors is also increasing every day, as the capability of sensors is growing. This data contains the user's real-time location, photo/video contents, messages/phone history, and even microphone recordings, and can be used to spy on users. Advertisers also sell off the data to analytics agencies and government bodies all without the user's knowledge or consent.

Controlling data collection is hard since the data collection algorithms work in the background without the user’s knowledge. Typically, users lack the technical expertise to understand the capability and value of their data, and hence do not protect it. Further, the governing bodies in developing countries are unable to act as gatekeepers and safeguard user data, since they lack the necessary technology to do so. Privacy laws are almost non-existent in countries like India, Brazil, and Vietnam, where the majority of smartphone users live. Thus, companies like Facebook often target these regions and collect data ruthlessly without the fear of facing any fines or lawsuits.


1.3 Purpose and Objective

The purpose of this project is to identify applications that collect user data in the background and present it to the user in a simple format. The sensor monitor would help users understand which applications are misbehaving, and would help take necessary measures to protect data privacy. The sensor monitor would change the current state of Android, which lacks the mechanism to aggregate sensor usage.

The sensor monitor would bring clarity to the way sensors work in the background. The following is a summary of objectives for the sensor monitor project:

• Create a flexible and modular framework that would observe hardware sensor usage.

• Present insights to the user in a simple dashboard.

• Secure the insights and data gathered by the sensor monitor using on-device encryption.

• Notify users with suggestions and tips to improve data privacy.

• Maintain a simple installation process and support the maximum number of hardware and software combinations.

1.4 Current Tools and Research

Researchers have been studying user privacy, with projects exploring how data sampling takes place on Android. In particular, “Android Sensor Data Collection Framework [1]” shows different ways in which sensor data can be captured on mobile devices and then utilized further. The paper demonstrates how a dataset of sensor data is constructed and how it can be used to create clusters and patterns. Similarly, “TaintDroid [5]” is a sensor monitoring system that tracks whether applications misuse the user’s private information. The tool monitors Android’s virtual execution environment and tracks data flow between applications. Even though this methodology is supposed to be efficient, the tool still results in a 14% performance overhead. The system also does not send instant alerts of any sort.

Apart from such frameworks, there are tools to detect malware. For instance, AVG, Avast, and Bitdefender offer tools to detect unauthorized file access, but rarely deal with device sensors. These tools also lack any exportable historical statistics regarding sensors. The same applies to popular monitoring utilities, such as Simple System Monitor, which do not monitor sensors at all. Since none of these utilities uses the Linux kernel, the statistics are not detailed enough.

There are mechanisms within Android that are supposed to provide data and privacy protection. In particular, there is a section in Android Settings where one can see which permissions, including access to files and folders, each application has been granted. A user can go further and revoke some of these permissions. But this menu is not apparent to a typical user, since reaching it involves multiple steps. The permissions are also limited and do not cover all sensors. Device-specific sensors are not included, since this menu is a part of the Android operating system and is not tailored to the device hardware.

Application developers and algorithms find clever ways of bypassing these permissions as well. An excellent example of this is how Instagram forces users to grant access to the microphone. Whenever users try to upload a photo to Instagram, the application first asks for microphone access and offers no other alternative. Posting a picture does not require microphone access, but Instagram still asks users to grant the permission. Such tactics are used by many applications and even by the Android operating system itself, thus defeating the purpose of having a permission stack.


CHAPTER 2

PROPOSAL AND GOAL

2.1 Proposed Solution

A centralized information panel like a sensor monitor would overcome some of these privacy issues. The sensor monitor has three panes across which statistics and insights are displayed and updated in real time.

The first pane would display the sensors that are present on the user’s device. The pane updates in real time, showing whether each sensor is active or not in use.

The second pane would provide more in-depth information for every sensor and will have a corresponding flag level (more about the flag level in Chapter 3). The following information is present on this screen:

1. Name of Application

2. Flag Status (0-5)

3. Recommendation

The suggestions are relatively basic right now and would highlight which applications should have their permissions revoked or modified. If an application consistently has a high flag level, the suggestion would change to “Uninstall App.”

The third pane would provide historical statistics regarding application usage. The statistics would be sorted on a per-application basis, listing all the sensors that the applications have been accessing. The historical data would include:

1. Application name, source of installation.

2. Permissions granted


3. Total network connections made

4. Total data footprint

5. Sensor name (for each sensor)

5.1 Total access in 1 week, average duration

5.2 Background / Foreground

The historical statistics would be updated every hour, and all the findings would be stored locally on the device. The historical statistics are useful if someone wants to export them to a different device or use them for further analysis.

2.2 Use Cases

Apart from its intended use case, which is to provide sensor statistics to typical Android users, this project has some additional use cases. Since the sensor monitor lays the foundation for forming statistics, it can act as a source for several machine learning and predictive analysis projects. The open JSON format is used to store the insights obtained through the sensor monitor, allowing users to export them and use them for further analysis. With the help of these collected insights, a user can also be warned before installing an application, thus creating a proactive model.

The sensor monitor is useful for application developers as well, since no built-in functionality within the Android SDK provides these insights. Good application developers always want to reduce any overhead and want to minimize the number of background tasks that the application performs.

2.3 Project Environment and Scope

The privacy issue extends to every computing platform and every type of device currently used. But it is a lot more pronounced on Google’s Android operating system due to its structure and business model. Android is open source and allows device manufacturers to modify the Android operating system as per their requirements. No one monitors what modifications device manufacturers make to the operating system. The operating system lacks focus on privacy and does not offer some of the underlying privacy protection mechanisms found on other platforms (such as access to location only when the application is in the foreground). The permission stack designed to protect user data has not changed in the last four years and lacks several basic functionalities [4]. Applications (built-in and third-party) can access user data and device sensors with minimal resistance from the operating system. Sensor access is also much more explicit on other platforms, where users are explicitly informed when sensors like location are accessed. Android also has the highest market share amongst all mobile operating systems, as shown in Figure 1, making it the target platform for malware. Regular security updates are also infrequent on Android devices, which helps data collection algorithms to use exposed vulnerabilities for data collection. All of this makes Android the ideal platform for our project.

Figure 1: Operating System Market Share

Since there are various versions of Android in use, we had to make a choice regarding the compatibility of this project. After looking at the current market share of Android devices as shown in Figure 2, it is clear that Android versions 6 and 7 are still the most popular versions. Version compatibility is not typically a problem, but since the sensor monitor would be accessing low-level kernel resources, compatibility becomes more complicated and needs more attention. Having said that, there shouldn’t be a problem using this code on the newer versions of Android, partly because the Linux kernel on all Android versions since Android 7 has remained the same, i.e., kernel 4.4. The Android operating system has a built-in compatibility mode as well, and hence allows applications not tested on the current version of the operating system to run smoothly without any issues.

Figure 2: Android Version Distribution

The sensor monitor is compiled into a standard Android Package (APK) without requiring any external dependencies. Hence, the installation is handled by the Android operating system by merely side-loading the APK onto the user’s device. The installation process is described in more detail in Section 4.1.2.

2.4 Challenges

The Android operating system and the Linux kernel pose a few problems. The loadable kernel module running in the kernel space significantly increases the chances of a kernel panic, which would cause the entire operating system to crash. Since the Android APIs do not provide all the information we need, the sensor monitor needs to access kernel resources. Regularly accessing kernel resources also consumes a lot of power and results in battery drain. The challenge would be to have a minimum impact on system resources as the sensor monitor updates its metrics. The sensor monitor would also need to understand if there are any device-specific sensors. In some cases, manufacturers use private or unpublished APIs along with proprietary sensors, which are challenging to identify and track.

For testing the project, we would have to simulate different hardware combinations, different geographical regions (different regions have varying privacy laws and application behavior changes accordingly), and usage patterns. Test cases would have to be detailed, covering the necessary breadth and depth of devices. There are hundreds of different sensor configurations available today, and hence considerably more effort has to be made to filter out the most popular and critical configurations.


CHAPTER 3

PROJECT DESIGN

3.1 Design Overview

The overall structure of the sensor monitor is shown in Figure 3. ProcessTracker, SensorTracker, InsightsHelper, StorageHelper, and UsageHelper are the five primary components, which run on separate threads at different frequencies. Once the sensor monitor starts, all of these components start one after the other and are kept active until the sensor monitor quits. The functionality of each component is explained in more detail in Section 3.4.

Temporary hash tables are created as ProcessTracker starts computing information from the Linux kernel. The data in the hash tables is not permanent and gets overwritten to save space. In parallel, InsightsHelper starts pulling information from the hash tables and pushes it to the appropriate screen through the flagging procedure. If any application receives a flag status of five, InsightsHelper sends an alert in the form of a push notification. The insights are stored in a JSON file named Insights.JSON, which gets encrypted/decrypted through StorageHelper.


[Figure 3 diagram: the Linux kernel feeds ProcessTracker (PID, process name, ...), and the Sensor Manager and Usage Manager feed SensorTracker (sensor name, ...). Both trackers write to temporary hash tables. InsightsHelper forms the insights and sends the flagging procedure output to the display. StorageHelper maintains the insights in Insights.JSON.]

Figure 3: Project Structure


3.2 Flagging Procedure

The flagging procedure shown in Figure 4 decides the severity of the sensor access. It also helps reduce false positives as the procedure checks several parameters while increasing or reducing the flag status of each application. The flagging procedure receives the current sensor and process statistics that are stored in the hash table. First, it checks whether the application is in the foreground or in the background.

If the application is in the foreground, the flagging procedure reduces the flag status and resets the access count. The access count records how many times the sensor was accessed during that particular session. It acts like a counter and is an important metric for deciding the flag status.

If the application is in the background and is still accessing the sensor, the procedure starts checking different parameters such as source of application, SDK used by the application, network activity and the current access count. Depending upon the result for each check, the procedure increases the flag status (with a maximum value of 5). This process repeats for each application and each sensor, which allows the sensor monitor to send appropriate alerts to the user.


[Figure 4 flowchart: process and sensor statistics from the hash table enter a process state check. On the foreground branch, the flag is reduced (Flag = Flag - 1). On the background branch, successive checks cover the app source and access count, the use of deprecated APIs and third-party SDKs, and high network activity with read/write; depending on the outcome of each check, the flag is raised by 1 or 2 while Flag < 5. The access count is then updated and the procedure returns.]

Figure 4: Flagging Procedure
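The flagging procedure described in this section can be sketched in Java as follows. This is a minimal illustration rather than the project's source code: the class and field names are hypothetical, the threshold of 10 accesses is read from Figure 4, and the sketch simplifies the figure by raising the flag by one for every failed background check.

```java
/** Hypothetical snapshot of one application's access to one sensor. */
final class AccessSnapshot {
    final boolean foreground;            // process state at the time of the check
    final boolean trustedSource;         // assumed meaning: installed from an official store
    final boolean deprecatedApisOrSdk;   // deprecated APIs or a third-party SDK in use
    final boolean highNetworkAndIo;      // high network activity together with read/write

    AccessSnapshot(boolean foreground, boolean trustedSource,
                   boolean deprecatedApisOrSdk, boolean highNetworkAndIo) {
        this.foreground = foreground;
        this.trustedSource = trustedSource;
        this.deprecatedApisOrSdk = deprecatedApisOrSdk;
        this.highNetworkAndIo = highNetworkAndIo;
    }
}

/** Mutable per-application, per-sensor state kept between checks. */
final class FlagState {
    int flag;         // always within 0..5
    int accessCount;  // background accesses in the current session
}

final class FlaggingProcedure {
    static final int MAX_FLAG = 5;
    static final int ACCESS_COUNT_LIMIT = 10; // threshold taken from Figure 4

    /** Applies one round of the flagging procedure to the given state. */
    static void update(FlagState state, AccessSnapshot access) {
        if (access.foreground) {
            // Foreground access: lower the flag and reset the counter.
            state.flag = Math.max(0, state.flag - 1);
            state.accessCount = 0;
            return;
        }
        // Background access: count it, then run each check in turn.
        state.accessCount++;
        int raise = 0;
        if (!access.trustedSource || state.accessCount > ACCESS_COUNT_LIMIT) raise++;
        if (access.deprecatedApisOrSdk) raise++;
        if (access.highNetworkAndIo) raise++;
        state.flag = Math.min(MAX_FLAG, state.flag + raise);
    }
}
```

Keeping the flag and the access count in a small state object makes it straightforward to hold one instance per application and sensor pair, which matches the way the procedure repeats for each application and each sensor.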


3.3 Design Details

The project is divided into five components (explained in Section 3.4) with each component running on a separate thread. This allows us to stop and start each individual component as the sensor monitor performs various tasks. Dividing the entire codebase into such separate sections allows us to better optimize functionality of our project as well, reducing the chances of crashing the entire application and allowing us to add or remove a particular component depending upon the target device (More on this in Section 3.5).

The project is designed using Android Studio 3.2.1, with the standard Android Software Development Kit and Google Play Services. The required API level is 23 or above. For encrypting the collected insights, Android FBE (File-Based Encryption) is enabled by default. FBE is only available on Android version 7 and above. Hence, devices running Android version 6 will use AES-256 encryption, which is provided by the Pycrypto library.

3.4 Project Components

a. ProcessTracker: The sensor monitor needs to understand which applications are running on the operating system in real time. Not just applications running in the foreground, but applications in the background as well. To achieve this, we would have to monitor each individual process running on the kernel. Every application runs as a process (single or multiple sub-processes) on the kernel. Monitoring the kernel is the most precise way to find the currently running applications. Along with the third-party applications, we also need to monitor system processes and first-party applications used by the operating system. The Linux kernel contains a filesystem called PROC (short for Process) that contains a variety of real-time information. The filesystem is present on every version of the Linux kernel and has largely remained unchanged. Inside PROC is a list of every single process that is active and in memory, running either in the background or the foreground. Every process has a unique Process ID (PID), and corresponding to every PID, there is a vast amount of information. We would be collecting four parameters from every PID:

1. Most recent read/write: This is available through /proc/PID/fd. This allows us to keep track of the files that the process/application has been updating. A process constantly accessing device sensors would have a higher corresponding read/write since it needs to store the raw data from the sensors. Hence, we need to monitor this parameter. (Note: Due to Android’s sandboxing, we cannot find what information is read or written by an application/process.)

2. Active network connections: Data tracking algorithms make frequent connections to transfer the collected data to the destination server. Using /proc/PID/net/tcp and /proc/PID/net/udp, we can find the active network connections made by a process, the destination server address, and the number of active sockets used by the process.

3. Process and memory: This shows the power consumption and memory utilization for every process, allowing us to understand how much processing is carried out by an application in the background.

4. Owner/MISC: The owner of the running process, which helps us distinguish between an operating system process and a third-party application. This helps us form a complete profile of the application usage.

All the above parameters are updated in real time by the Linux kernel, and hence must be constantly monitored. This component is separated from the rest of the codebase because of the risk involved in accessing the PROC filesystem. Since we are accessing kernel resources, there is a possibility of reading invalid/garbage values. To reduce such issues, this component runs in its own thread, separating it from the remaining codebase. We did still encounter a few issues, which are explained in Section 7.3.
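To illustrate the kind of parsing ProcessTracker has to perform, the following sketch decodes two pieces of PROC data: an IPv4 endpoint as printed in the tcp and udp tables, and the owner UID from a process status file. The class and method names are hypothetical and are not taken from the project's source code. The sketch assumes a little-endian device, where the kernel prints the address bytes in reverse order, and it relies on Android reserving UIDs below 10000 for system components.

```java
import java.util.Locale;

final class ProcParsers {
    /** Android assigns UIDs from 10000 upward to installed applications. */
    static final int FIRST_APPLICATION_UID = 10000;

    /**
     * Decodes an IPv4 endpoint such as "0100007F:1F90" from a tcp/udp table.
     * Assumes a little-endian host, so the lowest byte is the first octet.
     */
    static String decodeEndpoint(String hex) {
        String[] parts = hex.split(":");
        long address = Long.parseLong(parts[0], 16);
        int port = Integer.parseInt(parts[1], 16);
        return String.format(Locale.ROOT, "%d.%d.%d.%d:%d",
                address & 0xFF, (address >> 8) & 0xFF,
                (address >> 16) & 0xFF, (address >> 24) & 0xFF, port);
    }

    /** Returns the real UID from the text of a status file, or -1 if absent. */
    static int realUid(String statusText) {
        for (String line : statusText.split("\n")) {
            if (line.startsWith("Uid:")) {
                // The line lists the real, effective, saved, and filesystem UIDs.
                String[] fields = line.substring(4).trim().split("\\s+");
                return Integer.parseInt(fields[0]);
            }
        }
        return -1;
    }

    /**
     * True for UIDs in the application range. Preinstalled applications also
     * fall in this range, so this only separates apps from system processes.
     */
    static boolean isApplicationUid(int uid) {
        return uid >= FIRST_APPLICATION_UID;
    }
}
```

For example, `ProcParsers.decodeEndpoint("0100007F:1F90")` yields `127.0.0.1:8080`. Working on strings rather than opening files directly keeps the parsing logic testable off-device and lets the caller decide how to handle unreadable or garbage values.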

b. SensorTracker: After monitoring each individual process, we need to track sensor usage to find which sensors are being accessed. SensorManager is a utility that allows applications to know the status of a particular sensor. A sensor can only be used by one process at a time, and without a way to check its status, applications could end up waiting indefinitely for the sensor to free up. Hence, to help processes understand the current status of the sensor, SensorManager is part of the Android framework. This metric would provide useful information for us since we already know which processes are currently running. We can map the status of the sensor to the processes running in the background. Along with the status, SensorManager also provides information regarding the sensor itself. This would allow us to understand the capabilities of the sensor, the vendor of the sensor, etc.


In total, we would be collecting the following parameters from SensorManager:

1. Sensor Information: Hardware Vendor, Range, Type, Power Consumption

2. Sensor Status: In Use / Idle

3. Sensor Type: Hardware / Software

c. StorageHelper: There are two ways in which the data would be stored. First is to form a local cache that the sensor monitor can use during its functioning; and second is storing insights in a JSON format for forming a historical statistic. A combination of an array and hash tables is used to represent this cache and would contain values obtained from ProcessTracker and SensorTracker.

d. InsightsHelper: For storing the insights, we would be using the JSON format, as it is widely used and compatible with most ongoing projects. The new insights gathered would be appended to the existing file and stored locally on the device in /data/data/Sensor_Monitor/insights.JSON. All the insights stored on the device are encrypted using File-Based Encryption (FBE). The latency with FBE is negligible, as shown in Table 1.

Table 1: Performance Overhead with FBE enabled

File Size    With FBE       Without FBE    Difference
1.5 MB       0.0278 secs    0.027 secs     ~3%
14 MB        1.349 secs     1.310 secs     ~2%

This is how insights will be stored (as shown in Table 2):

• Application Name: com.Facebook


• Current Flag Status: 3

• Permissions granted: mic, camera, location, contacts

• Total network connections made: 48 (TCP + UDP)

• Total data footprint: 101MB

Table 2: Sensor Insights

Sensor            Access Type    AvgDuration    Timestamp (MM.DD.YYYY.HH.MM.SS)
Microphone - 1    Continuous     0.721 secs     12.09.2018.08.00.00
Gyroscope         Continuous     3.211 secs     12.09.2018.08.00.00
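As an illustration, the sample insight above could be serialized to JSON as in the following sketch. The key names (applicationName, flagStatus, and so on) are hypothetical, since the actual schema of the insights file is not reproduced in this report, and the small serializer is written by hand only to keep the example free of external libraries.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class InsightJson {
    /** Quotes a string, escaping backslashes and quotes (enough for these values). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes maps, lists, numbers, booleans, and strings to compact JSON. */
    static String toJson(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                joiner.add(quote(String.valueOf(e.getKey())) + ":" + toJson(e.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof List) {
            StringJoiner joiner = new StringJoiner(",", "[", "]");
            for (Object item : (List<?>) value) {
                joiner.add(toJson(item));
            }
            return joiner.toString();
        }
        if (value instanceof Number || value instanceof Boolean) {
            return String.valueOf(value);
        }
        return quote(String.valueOf(value));
    }

    /** Builds the sample record from this section; key names are illustrative. */
    static Map<String, Object> sampleRecord() {
        Map<String, Object> sensor = new LinkedHashMap<>();
        sensor.put("sensor", "Microphone - 1");
        sensor.put("accessType", "Continuous");
        sensor.put("avgDurationSecs", 0.721);
        sensor.put("timestamp", "12.09.2018.08.00.00");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("applicationName", "com.Facebook");
        record.put("flagStatus", 3);
        record.put("permissions", Arrays.asList("mic", "camera", "location", "contacts"));
        record.put("networkConnections", 48);
        record.put("dataFootprintMB", 101);
        record.put("sensors", Collections.singletonList(sensor));
        return record;
    }
}
```

Calling `InsightJson.toJson(InsightJson.sampleRecord())` produces a single-line JSON object that can be appended to the insights file. A LinkedHashMap is used so that the keys are written in a stable order, which makes exported files easier to compare.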

e. UsageHelper: This component can be thought of as the “CPU” of the sensor monitor. After we have the necessary data through ProcessTracker and SensorTracker, we need to process it and derive insights. There are four primary tasks for UsageHelper:

1. Access Data: The data monitored using SensorTracker and ProcessTracker is utilized here. Along with this, UsageHelper will also collect information like the source of installation for each application, the APIs used by the application, and the SDK used. This information is not monitored constantly because it is static in nature.

2. Process: Once we have all the data necessary, we start to link the data from SensorTracker and ProcessTracker to each other to form insights for every application and the corresponding sensor.

3. Flag: Our goal is to flag applications that access sensors in the background. But we cannot simply flag and notify every single application accessing a sensor in the background. Sometimes applications do actually need to access sensors in the background for proper functioning. Hence, we will perform a sequence of checks on the application profile (as shown in Figure 4) and increase or reduce the flag status. The flag status ranges from 0 to 5, with 0 being the default flag and 5 being the highest (or the worst).

4. Display: All the gathered data and the resulting insights are displayed on the appropriate pane of the sensor monitor. The frequency of this update will depend on the update frequency of the remaining components.

3.5 Module Synchronization

Each of the above five components runs at a different frequency. ProcessTracker and SensorTracker run almost constantly in the background (Note: we had to create one more daemon thread for stability issues, discussed in Section 7.2). Since processes constantly change their states, it is important to have the most accurate information available regarding the process. The information gathered through both these components is also of higher priority for the flagging procedure, which is affected by a larger margin if incorrect values are used here. After testing with different refresh intervals, we found optimal values of 4 and 7 seconds, depending upon the configuration of the user device. If the device has only 2 processor cores, the update interval will be 7 seconds; otherwise, it will be 4 seconds. StorageHelper would run at the same frequency as ProcessTracker, updating the hash table and the arrays.


UsageHelper would run every 2 seconds, updating the contents of the user interface. Lastly, InsightsHelper would update every 60 minutes. Read/write operations are processor intensive, and having insights for every hour is sufficient for future use. Because the insights are appended to the insights.JSON file and encrypted/decrypted on every update, this component has to run less frequently.
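The update intervals described in this section could be wired together as in the sketch below. The names are hypothetical, and the sketch makes two assumptions that go beyond the text: devices with fewer than two cores are treated like two-core devices, and a shared scheduled thread pool stands in for the separate per-component threads used by the project.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class MonitorScheduler {
    /** 7 seconds on devices with 2 cores (or fewer, by assumption), else 4. */
    static int trackerIntervalSeconds(int cores) {
        return cores <= 2 ? 7 : 4;
    }

    /**
     * Schedules the three groups of work at the intervals given in this section.
     * The caller is responsible for shutting the returned pool down.
     */
    static ScheduledExecutorService start(Runnable trackers, Runnable usage, Runnable insights) {
        int cores = Runtime.getRuntime().availableProcessors();
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(3);
        // ProcessTracker, SensorTracker, and StorageHelper share one interval.
        pool.scheduleAtFixedRate(trackers, 0, trackerIntervalSeconds(cores), TimeUnit.SECONDS);
        // UsageHelper refreshes the user interface every 2 seconds.
        pool.scheduleAtFixedRate(usage, 0, 2, TimeUnit.SECONDS);
        // InsightsHelper appends to the insights file once per hour.
        pool.scheduleAtFixedRate(insights, 60, 60, TimeUnit.MINUTES);
        return pool;
    }
}
```

Separating the interval choice into its own method keeps the device-dependent decision easy to test and to tune if other refresh intervals turn out to work better on a given device.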

3.6 Technology Stack

• Languages: Java and Python. Java is used for all the core components such as SensorTracker, InsightsHelper, and UsageHelper. Python is used for AES encryption. The user interface is designed using XML.

• SDK and Editor: Standard Android SDK with Gradle version 4.6 using Android Studio version 3.2.0.

• Chaquopy: This plugin allows Android Studio to integrate Python code into the existing Java files.

• Pycrypto-2.6: If FBE is disabled or not supported, the sensor monitor encrypts the hash tables and insights using the Pycrypto library. Specifically, the sensor monitor uses AES-256, which provides a good level of security with little performance degradation.


CHAPTER 4

IMPLEMENTATION

4.1 Requirements and Installation Process

The sensor monitor is supported on all Android devices that satisfy the requirements listed in Section 4.1.1. The user may then follow the step-by-step instructions provided in Section 4.1.2. The sensor monitor uses the standard Android installer. The user must also grant permissions to the device sensors once the installation process is complete.

4.1.1 Requirements

1. Android Operating System: Version 6 or above

2. RAM: 2GB or higher

3. Google Play Services

4. Root Access

4.1.2 Steps

1. Download the project APK from the following link: https://github.com/anuraj17/SensorMonitorTestImage.git

2. Open Settings -> Security. On this page, enable the “Unknown sources” option.

3. Open the file manager and select the project APK. The APK will run the standard installation process.

4. Once the installation is complete, the sensor monitor launcher icon will appear on the phone.

5. If there is any error, verify that the minimum requirements are met.

If the installation is successful, return to the page from step 2 and disable the “Unknown sources” option. This is very important to maintain the security of the device.

4.2 Source Code

As shown in Figure 5, InsightsHelper pushes information to the user interface. The function maintains a list of the sensor statistics, which is refreshed on every update. However, the corresponding UIComponents are updated only when a value changes.

Figure 5: Storing Insights
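The update-on-change behaviour described for the insights helper can be illustrated with a minimal Python sketch. The class name `SensorStatsCache` and its method are hypothetical and do not come from the project's source. The sketch only shows the idea of comparing each new reading with the cached one so that unchanged values trigger no user interface work.

```python
_MISSING = object()  # sentinel for sensors that have no cached value yet


class SensorStatsCache:
    """Keeps the last known value per sensor and reports only changes."""

    def __init__(self):
        self._last = {}

    def update(self, readings):
        """Store new readings and return the subset whose value changed."""
        changed = {}
        for sensor, value in readings.items():
            if self._last.get(sensor, _MISSING) != value:
                changed[sensor] = value
                self._last[sensor] = value
        return changed


if __name__ == "__main__":
    cache = SensorStatsCache()
    print(cache.update({"location": 3, "microphone": 0}))
    print(cache.update({"location": 3, "microphone": 1}))
```

Only the dictionary returned by `update` would need to be pushed to the interface; an empty result means nothing has to be redrawn.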


Figure 6 shows how StorageHelper encrypts the data and returns the encrypted list. Each result is appended to the existing JSON file instead of overwriting it.

Figure 6: Encryption


4.3 Screenshots

The screenshot in Figure 7 shows the home screen of the sensor monitor, which is presented by default when the application is launched. The current status of each application is displayed in an individual card.

Figure 7: Home Screen (Pane - 2)


As the insights are displayed on the home screen, they are also stored on the user’s device, ready to be exported. Whenever the user wishes to export them, they can select the option from the drop-down menu, as shown in Figure 8.

Figure 8: Exporting Insights


The left screen, shown in Figure 9, displays the status of each sensor in real time. Along with the status, the screen also shows the source of the device driver and the API level used by the sensor.

Figure 9: Sensor List (Pane - 1)


The screen on the right, shown in Figure 10, offers much more detailed statistics for every application. The card lists information about the application itself, such as the source of the APK, the permissions currently granted, the total network connections made, and the total data written. Below, the card lists information related to each sensor that the application has access to. For instance, Facebook has access to the microphone, which was accessed continuously for an average duration of 0.7 seconds. Lastly, the card maintains a timestamp for each session and sensor. The same pattern is used for every sensor and every application.

Figure 10: Historical Insights (Pane - 3)


CHAPTER 5

SIMULATION

5.1 Test Bench

Table 3 represents the hardware configuration used for testing the sensor monitor.

Table 3: Test Device

Model / Manufacturer      Nexus 6 / Motorola
Processor                 2.7 GHz, 4 cores
Memory                    3 Gigabytes
GPU                       Adreno 420
Modem                     X7
Operating System          Android 6.0 and 7.0
Security Patch Level      Dec 2017
Android Kernel Version    3.18 / 4.4
Root Access               Yes
Boot Loader               Unlocked
Developer Option          Enabled


5.2 Test Scenario - 1 (Region: India)

Each table lists the application name, the sensor, the most common type of access during the test session, the flag level at the end of the test session, and the timestamp of the sensor access. Each test case specifies a total duration, start time, end time, and the version of Android that was used. Tables 4, 5, and 7 list the results for Android Version 6, while Tables 6, 8, and 9 show the results for Android Version 7.
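For readers who want to post-process the results, the timestamp strings used in the result tables can be parsed with the Python standard library. This is a sketch and not part of the project: the helper name is hypothetical, and the day-first field order is an inference from values such as 24.10.2018, which cannot be read month-first.

```python
from datetime import datetime

# Day-first order is an assumption inferred from the sample values.
TIMESTAMP_FORMAT = "%d.%m.%Y.%H.%M.%S"


def parse_table_timestamp(text):
    """Convert a table timestamp such as '24.10.2018.15.30.00' to a datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


if __name__ == "__main__":
    start = parse_table_timestamp("24.10.2018.15.30.00")
    print(start.isoformat())
```

Once parsed, the values can be sorted or subtracted like any other `datetime` objects, for example to order the test sessions chronologically.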

Table 4: Simulation Result - 1 (3 hours, Android Version 6)

Application  Sensor                   Access Type  Flag Level  Timestamp (DD.MM.YYYY.HH.MM.SS)
Messenger    Location                 Burst        0           12.09.2018.08.00.00
Facebook     Location, Mic            Burst        0           12.09.2018.08.00.00
Uber         Location                 Continuous   0           12.09.2018.08.00.00
Calendar     Accelerometer            Burst        0           12.09.2018.08.00.00
GMaps        Location, Accelerometer  Burst        0           12.09.2018.08.00.00
Dialer       Accelerometer            Burst        0           12.09.2018.08.00.00


Table 5: Simulation Result - 2 (5 hours, Android Version 6)

Application  Sensor                   Access Type  Flag Level  Timestamp (DD.MM.YYYY.HH.MM.SS)
Messenger    Mic                      Burst        1           14.10.2018.10.50.00
Facebook     Location                 Continuous   3           14.10.2018.10.50.00
Uber         --                       --           0           14.10.2018.10.50.00
Calendar     Accelerometer            Burst        0           14.10.2018.10.50.00
GMaps        Location, Accelerometer  Continuous   0           14.10.2018.10.50.00
Dialer       Accelerometer            Burst        0           14.10.2018.10.50.00


Table 6: Simulation Result - 3 (1 hour, Android Version 7)

Application  Sensor                   Access Type  Flag Level  Timestamp (DD.MM.YYYY.HH.MM.SS)
Messenger    Location                 Burst        2           24.10.2018.15.30.00
Facebook     Location, Mic            Continuous   3           24.10.2018.15.30.00
Uber         Location                 Continuous   1           24.10.2018.15.30.00
Calendar     Location                 Burst        1           24.10.2018.15.30.00
GMaps        Location, Accelerometer  Burst        1           24.10.2018.15.30.00
Dialer       --                       --           0           24.10.2018.15.30.00


Table 7: Simulation Result - 4 (2 hours, Android Version 6)

Application  Sensor         Access Type  Flag Level  Timestamp (DD.MM.YYYY.HH.MM.SS)
Messenger    Location       Burst        2           29.10.2018.08.00.00
Facebook     Camera, Mic    Burst        5           29.10.2018.08.00.00
Uber         Location       Continuous   2           29.10.2018.08.00.00
Calendar     --             --           0           29.10.2018.08.00.00
GMaps        Location       Continuous   1           29.10.2018.08.00.00
Dialer       Accelerometer  Burst        0           29.10.2018.08.00.00


Table 8: Simulation Result - 5 (5 hours, Android Version 7)

Application  Sensor         Access Type  Flag Level  Timestamp (DD.MM.YYYY.HH.MM.SS)
Messenger    --             --           1           03.11.2018.21.40.00
Facebook     Location       Continuous   5           03.11.2018.21.40.00
Uber         Location       Burst        2           03.11.2018.21.40.00
Calendar     Accelerometer  Burst        0           03.11.2018.21.40.00
GMaps        Location       Burst        1           03.11.2018.21.40.00
Dialer       --             --           0           03.11.2018.21.40.00


5.3 Test Scenario - 2 (Region: France)

Table 9: Simulation Result - 6 (5 hours, Android Version 7)

Application  Sensor    Access Type  Flag Level  Time Stamp
Messenger    Location  Burst        1           05.11.2018.08.00.00
Facebook     --        --           0           05.11.2018.08.00.00
Uber         --        --           0           05.11.2018.08.00.00
Calendar     --        --           0           05.11.2018.08.00.00
GMaps        Location  Burst        1           05.11.2018.08.00.00
Dialer       --        --           0           05.11.2018.08.00.00


CHAPTER 6

ANALYSIS

6.1 Results from Test Scenario - 1

The simulation results (as shown in Figure 11) highlight that location was by far the most used sensor. It is not surprising that specific applications require location data in the background, but the usage is still much higher than one would expect. The accelerometer was the next most used sensor, though this usage does not look suspicious. The accelerometer also does not collect any sensitive user data, and hence the flag level for the application does not rise by much. The microphone was used far less than location but still more than the camera. It is also important to distinguish between requests made to the primary and the secondary microphones.

Most smartphones today have more than one microphone. After testing, it is not clear whether applications targeted the secondary microphone more often; the requests seemed to be evenly divided between both microphones. This may be because applications are not required to target a specific microphone when making requests. There is also no separate permission required to access the different microphones: Android treats all microphones as a single sensor from a permissions point of view.

Lastly, like the microphone, there are multiple camera sensors as well. But unlike the location or microphone sensors, the camera was detected being accessed in the background only once. After testing with multiple configurations, the results were much the same. The likely reason for this behavior is the amount of memory and processing the camera module requires. The processing power the camera needs is significantly higher than that of any other sensor, and hence the operating system significantly throttles or stops other applications when the camera is being accessed. It is also worth noting that the information gathered using a camera may not be as useful as one might imagine, since location and microphone data already provide plenty of information for advertisers to work with. There is also no difference between the front and the rear camera sensors as far as permissions are concerned.

[Bar chart. Series: Location, Microphone, Accelerometer, Camera. Horizontal axis: Messenger, Facebook, Uber, Calendar, GMaps, Dialer. Vertical axis: 0 to 6.]

Figure 11: Simulation Analysis


6.2 Results from Test Scenario - 2

This test run kept everything the same as Test Scenario - 1, except that the region was changed to Europe, specifically France. Due to the General Data Protection Regulation (GDPR), the privacy laws in this region are probably the strongest right now. The first test bench had the region set to India, where privacy laws are almost non-existent. Testing in both India and France therefore covers both ends of the spectrum as far as privacy laws are concerned. We also used a VPN application (Opera VPN) so that Google Play services would register the change in region. The purpose of this test was to find out whether application behavior changes across regions, since rules and regulations differ significantly. The total number of background accesses was lower than in Test Scenario - 1, with similar workloads and sensor configuration. Even the application packages (APKs) of Facebook and Messenger changed, disabling certain features and altering their behavior. The operating system behavior remained the same as far as the APIs and the permission stack were concerned.


CHAPTER 7

PERFORMANCE

7.1 Project Performance

In total, the codebase was tested for more than 100 hours with different workloads (Figure 12 shows a random snapshot of 2 hours of testing). We used the built-in Android Profiler utility that allows us to monitor the performance.

[Line chart. Series: Project, Facebook, Uber. Horizontal axis: time, 0 to 120 minutes. Vertical axis: CPU usage, 0 to 10%.]

Figure 12: Project Performance

These measurements assume that the project was always running and never went to sleep to save power. As shown in Figure 12, the CPU usage never went above 10%, with an average of 2.5%. Segmenting the project into different components helps reduce CPU utilization, as not all threads run at the same time. Having said that, there is definitely room for improvement as far as performance is concerned.

7.2 Stability

During the initial test runs, there were instances of kernel issues that caused the sensor monitor to reboot without warning and, in a few cases, to freeze. The issue lay with the constant kernel access, specifically to the /PROC/PID/MEMINFO file. The sensor monitor accesses this file to determine the state of the process with a specific process ID and then monitors the behavior of that process. This is not an issue while the process is in memory, but it causes a problem when the operating system suspends the process. In that case, the sensor monitor keeps looking for process information using an invalid process ID and at some point reads garbage content, causing the application to freeze. A simple change to the hashing procedure solved this issue. A separate hash table maintains a list of all the active process IDs. Since the list needs to be updated constantly, it runs on a separate daemon thread. This thread is separate from ProcessTracker and SensorTracker, giving a total of 3 daemon threads. The rest of the project components pull process IDs from this hash table before proceeding. Because the hash table is updated in real time, the components no longer use an invalid process ID by accident.

The only downside is a slight increase in the resources required for the code to run, but the tradeoff is worth it for the stability, since there were almost no crashes after this change.
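The fix described above, a table of active process IDs refreshed on its own daemon thread and consulted before any process file is read, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the project's Java code: the class and function names are hypothetical, the one-second refresh interval is arbitrary, and `/proc/<pid>/status` is used as a stand-in for whichever per-process file the project reads.

```python
import os
import threading


def list_proc_pids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as a set of ints."""
    try:
        names = os.listdir(proc_root)
    except OSError:
        return set()
    return {int(name) for name in names if name.isdigit()}


class ActivePidRegistry:
    """Thread-safe set of live process IDs, refreshed by a daemon thread."""

    def __init__(self, lister=list_proc_pids, interval_s=1.0):
        self._lister = lister          # injectable so it can be faked in tests
        self._interval_s = interval_s  # hypothetical refresh interval
        self._lock = threading.Lock()
        self._pids = set()
        self._stop = threading.Event()

    def refresh(self):
        """Replace the stored PIDs with a fresh snapshot."""
        snapshot = set(self._lister())
        with self._lock:
            self._pids = snapshot

    def is_active(self, pid):
        with self._lock:
            return pid in self._pids

    def start(self):
        """Run refresh() periodically on a daemon thread."""
        def loop():
            while not self._stop.is_set():
                self.refresh()
                self._stop.wait(self._interval_s)

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()


def read_process_status(pid, registry, proc_root="/proc"):
    """Read /proc/<pid>/status only if the PID is still registered.

    Returns None when the process is gone instead of reading stale data.
    """
    if not registry.is_active(pid):
        return None
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as handle:
            return handle.read()
    except OSError:  # the process exited between the check and the read
        return None
```

Even with the registry in place, a process can exit between the lookup and the read, which is why the sketch also catches `OSError` rather than relying on the table alone.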


7.3 Project Compatibility

There are a few constraints with the way the sensor monitor currently works. The first constraint is the requirement for root permission. Enabling root permissions on Android devices is discouraged, as it exposes the device to several vulnerabilities. A user may also unknowingly run incorrect instructions, either losing on-device data or causing the device to crash or freeze permanently. Hence, root permissions are typically enabled only by professionals who want to test their applications for issues. Since the sensor monitor needs to access the PROC filesystem of the Android kernel, the code works fully only on a device that has root permissions enabled. The sensor monitor checks whether the user has root permissions, and the result is displayed through a prompt as shown in Figure 13. If root permissions are available, the sensor monitor works as intended without any further action. If not, the prompt asks the user either to enable root permissions or to skip and keep using the sensor monitor without them.


The prompt would look something like this:

Figure 13: Compatibility Checker

Internally, the flagging procedure works differently without root permissions. It relies only on metrics provided by SensorManager and UsageManager. Without root permission, PROC cannot be accessed, and information about the process state is not available. The sensor monitor displays only the first and second panes, with a prompt to enable root permissions on the third pane. The capability of the project is somewhat limited in this mode, but it still works and provides useful insights. The depth of the information is limited, showing only the last application that used a particular sensor. The sensor monitor cannot build historical data due to the lack of access to PROC and cannot list network connections either. This can be thought of as a compatibility mode that simply allows users to keep using the sensor monitor without enabling root permissions.
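The split between full mode and compatibility mode can be summarised in a small Python sketch. It is a sketch under stated assumptions rather than the project's implementation: the function names are hypothetical, the pane names are taken from the figure captions, and checking for an `su` executable is only one rough heuristic for detecting root.

```python
import shutil

# Pane names follow the figure captions (Pane 1, Pane 2, Pane 3).
ALL_PANES = ("Sensor List", "Home Screen", "Historical Insights")


def has_su_binary():
    """Rough root heuristic (an assumption): is an `su` executable on PATH?"""
    return shutil.which("su") is not None


def select_mode(root_available):
    """Return the panes and data sources available with or without root."""
    if root_available:
        return {
            "mode": "full",
            "panes": list(ALL_PANES),
            "sources": ["SensorManager", "UsageManager", "PROC"],
        }
    return {
        "mode": "compatibility",
        "panes": list(ALL_PANES[:2]),  # the third pane shows a root prompt
        "sources": ["SensorManager", "UsageManager"],
    }


if __name__ == "__main__":
    print(select_mode(has_su_binary()))
```

Deriving both the visible panes and the data sources from one decision keeps the interface and the flagging procedure consistent with each other.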

The other constraint concerns the insights that the code gathers and stores locally on the device. Although the sandboxed nature of Android applications means that other applications cannot simply read these local insights, there is always a chance of someone stealing them. Android's File-Based Encryption helps secure the gathered insights: it works on-device, handles large files with very little overhead, and is used to encrypt the insights. The stored insights also do not contain any device- or user-specific data, such as the device ID, location, or device font details. This reduces the chances of creating a “device fingerprint” [3].


7.4 Limitations

There are other places along the operating system and the application stack where data collection takes place. Some applications track user behavior within the application itself, with no assistance from any hardware or software sensors. For instance, Facebook and Instagram observe how users react to particular articles based on the way they scroll the news feed [2], using various tracking algorithms. They use parameters such as the amount of time spent looking at a particular post and the movement between sections of the application. The sensor monitor is not of significant use in such a scenario. Such data collection is typical in all multimedia applications, and the only way to avoid it is to stop using the app itself.

Lastly, the sensor monitor cannot determine what happens to the data after it is collected from a sensor. It can only tell whether a sensor was accessed; it does not track the destination of the collected data.


CHAPTER 8

FURTHER STUDY AND FUTURE WORK

The flagging procedure of the sensor monitor can be enhanced further. Specifically, tracing the raw data flow from the sensors to the application layer would add another dimension. Currently, the sensor monitor tracks network connections through the PROC filesystem to determine the flow of data out of the device. Identifying the pattern of data flow would further enhance the flagging procedure and could substantially reduce false positives. There has been some work in this domain, most of it carried out on the open-source version of Android (AOSP). Tools such as DroidSafe [6] have been developed to track information flow in Android applications.
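As an illustration of how network connections can be read from the PROC filesystem, the following Python sketch parses text in the format of the Linux `/proc/net/tcp` table. The report does not state which PROC files the sensor monitor reads, so the choice of this file is an assumption, as are the function names. The address decoding assumes IPv4 on a little-endian device.

```python
import socket
import struct

TCP_ESTABLISHED = "01"  # state code the Linux kernel prints for open connections


def decode_address(field):
    """Decode 'HEXIP:HEXPORT' (IPv4, little-endian host order) to (ip, port)."""
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_tcp_table(text):
    """Return (local, remote, uid) for each established IPv4 connection."""
    connections = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Column 3 is the state and column 7 the owning user ID.
        if len(fields) < 8 or fields[3] != TCP_ESTABLISHED:
            continue
        connections.append(
            (decode_address(fields[1]), decode_address(fields[2]), int(fields[7]))
        )
    return connections
```

Because Android assigns each installed application its own user ID, the `uid` column is what would allow a connection to be attributed to a specific application.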

Another improvement is to refine instant alerts. If an application uses a particular sensor in the background, the user would get a notification about this behavior. Even though this is possible with the current codebase, the implementation needs further refinement. Instant alerts would also incur considerably more performance overhead and hence would require substantial modifications to the current codebase.


CHAPTER 9

CONCLUSION

Data collection is not new in smartphones and will only increase as sensors become more powerful and efficient. The sensor monitor provides users with the information needed to identify bad actors and suggests useful steps to protect their privacy. The flagging procedure provides a unique way of separating the bad actors from the good ones while reducing false positives.

File-Based Encryption provided by Android encrypts the insights formed by the sensor monitor. The sensor monitor is compartmentalized into different modules, which makes it flexible. As different modules can be disabled at runtime, the sensor monitor can work on a range of hardware combinations. The interface is deliberately bare-bones and straightforward to avoid user confusion. At the same time, the insights are detailed and granular enough to appeal to researchers. Lastly, the installation process is short and does not require any modifications to the device kernel or software stack. All of this reduces the friction that users typically face while using Android tools.

The threaded nature of the sensor monitor allows it to use minimal system resources, so it can run on devices with limited memory and few CPU cores. The sensor monitor achieved good performance (as shown in Figure 12) and met all the compatibility and stability requirements expected from an Android tool.


REFERENCES

[1] M. Atzmueller and K. Hilgenberg, "Towards capturing social interactions with SDCF: An extensible framework for mobile sensing and ubiquitous data collection," in Proceedings of the 4th International Workshop on Modeling Social Media, p. 6. ACM, 2013.

[2] A. Besbes, "How to mine newsfeed data and extract interactive insights in python," [Online] Available: https://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html [Accessed October 2018]

[3] F. Lanze, A. Panchenko, and T. Engel, "A Formalization of Fingerprinting Techniques," in 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, pp. 818-825. IEEE, 2015.

[4] V. Taylor and I. Martinovic, "A longitudinal study of app permission usage across the Google Play Store," CoRR, abs/1606.01708 (2016).

[5] W. Enck, P. Gilbert, S. Han, V. Tendulkar, B. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth, "TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones," ACM Transactions on Computer Systems (TOCS) 32, no. 2 (2014): 5.

[6] M. I. Gordon, D. Kim, J. H. Perkins, L. Gilham, N. Nguyen, and M. C. Rinard, "Information flow analysis of Android applications in DroidSafe," NDSS, vol. 15, p. 110. 2015.