Personal Data Management in the Internet of Things
Matharu, Ray Manpreet Singh
MetadataShow full item record
Due to a sharp decrease in hardware costs and shrinking form factors, networked sensors have become ubiquitous. Today, a variety of sensors are embedded into smartphones, tablets, and personal wearable devices, and are commonly installed in homes and buildings. Sensors are used to collect data about people in their proximity, referred to as users. The collection of such networked sensors is commonly referred to as the Internet of Things. Although sensor data enables a wide range of applications from security, to efficiency, to healthcare, this data can be used to reveal unwarranted private information about users. Thus it is imperative to preserve data privacy while providing users with a wide variety of applications to process their personal data. Unfortunately, most existing systems do not meet these goals. Users are either forced to release their data to third parties, such as application developers, thus giving up data privacy in exchange for using data-driven applications, or are limited to using a fixed set of applications, such as those provided by the sensor manufacturer. To avoid this trade-off, users may chose to host their data and applications on their personal devices, but this requires them to maintain data backups and ensure application performance. What is needed, therefore, is a system that gives users flexibility in their choice of data-driven applications while preserving their data privacy, without burdening users with the need to backup their data and providing computational resources for their applications. We propose a software architecture that leverages a user's personal virtual execution environment (VEE) to host data-driven applications. This dissertation describes key software techniques and mechanisms that are necessary to enable this architecture. First, we provide a proof-of-concept implementation of our proposed architecture and demonstrate a privacy-preserving ecosystem of applications that process users' energy data as a case study. Second, we present a data management system (called Bolt) that provides applications with efficient storage and retrieval of time-series data, and guarantees the confidentiality and integrity of stored data. We then present a methodology to provision large numbers of personal VEEs on a single physical machine, and demonstrate its use with LinuX Containers (LXC). We conclude by outlining the design of an abstract framework to allow users to balance data privacy and application utility.