Geographically Distributed Database Management at the Cloud's Edge
Request latency resulting from the geographic separation between clients and remote application servers is a challenge for cloud-hosted web and mobile applications. Numerous studies have shown the importance of low latency to the end-user experience. Response time increases of only a few hundred milliseconds translate directly into reduced user satisfaction and lost revenue, and these effects persist even after a low-latency environment is restored. One way to address this challenge in geo-distributed settings is to push all or part of the application, along with the data it requires, to the edge of the cloud, closer to application clients. This thesis explores the idea of taking advantage of clients' proximity to the edge of the network in order to reduce request latencies. SpearDB is a prototype replicated, distributed database system that operates in a star network topology, with a core site and a large number of edge sites that are close to clients. Clients access the nearest edge, which holds replicas of locally relevant portions of the database. SpearDB's edge sites coordinate through the core to provide a global transactional consistency guarantee (parallel snapshot isolation, or PSI), while handling as much work locally as possible. SpearDB provides full general-purpose transactional semantics with ACID guarantees. Experiments show that SpearDB is effective at reducing workload latencies for applications whose access patterns are geographically localizable. Many applications fit this criterion: bulletin boards (e.g., Craigslist, Kijiji), local commerce or services (e.g., Groupon, Uber), booking and ticketing (e.g., OpenTable, StubHub), location-based services (mapping, directions, augmented reality), local news outlets, and client-centric services (e-mail, RSS feeds, gaming). SpearDB introduces protocols for executing application transactions in a geo-distributed setting under strong consistency guarantees.
These protocols automatically hide the complexity, as well as much of the latency, introduced by geo-distribution from applications. The effectiveness of SpearDB depends on the placement of primary and secondary replicas at core and edge sites. The secondary replica placement problem is shown to be NP-hard. Several algorithms for automatic data partitioning and replication are presented to provide approximate solutions. These algorithms work in a geo-distributed core-edge setting under partial replication. Their goal is to bring data closer to clients in order to lower request latencies. Experimental comparisons of the latency impact of the resulting placements show good results. Surprisingly, however, the placements produced by the simplest of the proposed algorithms are comparable in quality to those produced by more complex approaches.
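To make the placement goal concrete, the following is a minimal sketch of a frequency-based greedy heuristic for choosing which partitions to replicate at each edge. It is not taken from the thesis and should not be read as any of its algorithms. The function names, the per-edge capacity limit, and the fixed 10 ms edge and 100 ms core latencies are all illustrative assumptions, and the toy model is deliberately simpler than the problem the thesis shows to be NP-hard.

```python
from collections import defaultdict


def greedy_placement(access_counts, capacity):
    """Pick, for each edge, the partitions its nearby clients request most.

    access_counts: {(edge, partition): request count}   (hypothetical input)
    capacity:      {edge: max number of replicas held}  (hypothetical limit)
    """
    per_edge = defaultdict(list)
    for (edge, partition), count in access_counts.items():
        per_edge[edge].append((count, partition))

    placement = {}
    for edge, items in per_edge.items():
        # Most-requested partitions first; ties broken by name for determinism.
        items.sort(key=lambda cp: (-cp[0], cp[1]))
        placement[edge] = {p for _, p in items[:capacity.get(edge, 0)]}
    return placement


def mean_latency_ms(access_counts, placement, edge_ms=10.0, core_ms=100.0):
    """Average request latency: a local replica costs edge_ms, otherwise core_ms."""
    total = sum(access_counts.values())
    if total == 0:
        return 0.0
    cost = sum(
        count * (edge_ms if partition in placement.get(edge, set()) else core_ms)
        for (edge, partition), count in access_counts.items()
    )
    return cost / total


# Hypothetical workload: each city mostly reads its own listings.
access = {
    ("toronto", "listings_toronto"): 90,
    ("toronto", "listings_vancouver"): 10,
    ("vancouver", "listings_vancouver"): 80,
    ("vancouver", "listings_toronto"): 20,
}
placement = greedy_placement(access, {"toronto": 1, "vancouver": 1})
print(placement)
print(mean_latency_ms(access, placement))  # 23.5 under the assumed latencies
print(mean_latency_ms(access, {}))         # 100.0 when every request goes to the core
```

In this toy workload, one replica per edge serves 170 of the 200 requests locally, which illustrates why geographically localizable access patterns benefit from edge placement.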
Cite this work
Catalin Avram (2017). Geographically Distributed Database Management at the Cloud's Edge. UWSpace. http://hdl.handle.net/10012/12200