Liu, Chuyi
2018-09-21
2018-09-21
2018-09-21
2018-09-13
http://hdl.handle.net/10012/13870
Machine learning algorithms over big data have been widely used over the years to improve low-priced services, but they raise privacy as a major public concern. The European Union recently made the General Data Protection Regulation (GDPR) enforceable; the GDPR mainly focuses on giving citizens and residents more control over their personal data. On the other hand, with personal and collective data from users, companies can provide a better experience for customers, such as customized news feeds and real-time transportation systems. To resolve this dilemma, many privacy-preserving schemes have been proposed, such as homomorphic encryption and machine learning over encrypted data. However, many of them are not yet practical because of their high computational complexity. In 2017, Bonawitz et al. proposed a practical scheme for secure data aggregation for privacy-preserving machine learning, which has affordable computation and communication complexity and accounts for users dropping out in practice. However, the communication cost of the scheme is still high, because a mobile user needs to communicate with every other member of the network to establish a secure pairwise key with each of them.
In this thesis, by combining the Harn-Gong key establishment protocol with the mobile data aggregation scheme, we propose an efficient privacy-preserving mobile data aggregation protocol. It introduces a non-interactive key establishment protocol that reduces the communication complexity of pairwise key establishment among n users from O(n^2) to a constant. We correct the security proof of the Harn-Gong key establishment protocol and provide a secure threshold for the degree of the polynomial according to the Byzantine Problem.
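The idea of masking private data with non-interactively derived pairwise keys can be sketched as follows. This is only an illustration under stated assumptions, not the protocol of the thesis: it uses a generic symmetric bivariate polynomial (Blundo-style key predistribution) as a stand-in for the Harn-Gong primitives, which the abstract does not specify, and it omits the drop-out handling of Bonawitz et al. The moduli P and R, the SHA-256 mask derivation, and all function names are hypothetical choices made for the example.

```python
import hashlib
import secrets

P = 2**61 - 1  # prime field for the toy key polynomial (assumed parameter)
R = 2**32      # modulus for masked values (assumed parameter)


def kdc_setup(t):
    """KDC side: sample a symmetric coefficient matrix a[i][j] = a[j][i]
    defining f(x, y) = sum a[i][j] * x^i * y^j over GF(P), degree t."""
    a = [[0] * (t + 1) for _ in range(t + 1)]
    for i in range(t + 1):
        for j in range(i, t + 1):
            a[i][j] = a[j][i] = secrets.randbelow(P)
    return a


def kdc_share(a, uid):
    """KDC side: the share for user uid is the univariate polynomial
    g(y) = f(uid, y), returned as its coefficient list."""
    t = len(a) - 1
    return [sum(a[i][j] * pow(uid, i, P) for i in range(t + 1)) % P
            for j in range(t + 1)]


def pairwise_key(share, other_id):
    """User side: evaluate the own share at the peer's id. Because f is
    symmetric, both peers obtain f(u, v) without exchanging messages."""
    return sum(c * pow(other_id, j, P) for j, c in enumerate(share)) % P


def mask_from_key(key):
    """Expand a pairwise key into a mask in [0, R) (toy derivation)."""
    digest = hashlib.sha256(key.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % R


def masked_input(uid, value, share, all_ids):
    """Add the mask for peers with a larger id and subtract it for peers
    with a smaller id, so every pairwise mask cancels in the sum."""
    y = value % R
    for v in all_ids:
        if v == uid:
            continue
        m = mask_from_key(pairwise_key(share, v))
        y = (y + m) % R if uid < v else (y - m) % R
    return y
```

In this sketch each user receives one share from the KDC and then derives every pairwise key locally, which is the sense in which the per-user communication for key establishment no longer grows with n. The server only sees the masked values, and their sum modulo R equals the sum of the private inputs.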
We implement the KDC-side Harn-Gong key establishment primitives and prepare a proof-of-concept Android mobile application to test our protocol's running time for masking private data. The results show that private data masking in our protocol is 1.5 to 3 times faster than in the original scheme.
en
Privacy
Machine Learning
Protocol
An Application of Secure Data Aggregation for Privacy-Preserving Machine Learning on Mobile Devices
Master Thesis