Doctoral research project number: 3321

Description

Submission date: 1 January 1900
Title: ADAPTIVE CONSISTENCY FOR LARGE-SCALE DATA REPLICATION
Thesis supervisor: Sébastien MONNET (LISTIC)
Thesis supervisor: Marc SHAPIRO (LIP6)
Scientific domain: Information and communication sciences and technologies
CNRS theme: Not defined

Abstract: In large-scale systems, such as clouds, content distribution networks (CDNs), enterprise networks, or peer-to-peer systems, data is distributed and replicated over the network. Placing data near its users and providing redundant copies improves response time and availability for read-mostly workloads. For instance, a CDN places multiple copies of a popular web page across the Internet, so that distributed users can access the page quickly despite WAN latencies. As another example, a replicated database makes multiple copies of each data item to ensure that it survives crashes or network disconnections. When shared data is mutable, however, such large-scale data distribution and replication raise the issue of consistency. In order to maintain application invariants, updating an object must be coordinated across its replicas and with related objects (transactional updates). The general solution is to force a serial order of operations (using a consensus protocol) that all replicas will follow [1]. Although obviously correct, this approach does not scale: consensus constitutes a serialisation bottleneck, and as the number of replicas increases, so does the load on all processors. However, many applications do not require strong consistency. In fact, there is a whole spectrum of trade-offs between performance, correctness, and consistency. Some techniques that are known to work are:
- Partitioning the data. Operations that affect independent data items need not be mutually ordered.
- Applying commutativity. Operations that commute do not need to be mutually ordered. Commutative replicated data types (CRDTs) require no foreground synchronisation at all.
- Optimistic approaches. Consensus moves into the background, off the critical path. Replicas may progress in parallel; they may diverge temporarily, as long as they converge eventually. This assumes the application can tolerate observing inconsistent, tentative state.
- Reduced isolation levels. These guarantee a weaker class of invariants than serialisability, but support more parallelism.
- Partial replication, i.e., a replica executes only the operations of interest to the local user. This reduces the load, but requires more complex protocols.
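To illustrate the commutativity point above, here is a minimal sketch (in Python, with hypothetical names) of a grow-only counter, one of the simplest CRDTs: each replica increments only its own slot, and merging takes the element-wise maximum, so merges commute, associate, and are idempotent, and replicas converge without any foreground synchronisation.

```python
# Sketch of a grow-only counter (G-Counter) CRDT; names are illustrative.
class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.counts = [0] * n_replicas  # one slot per replica

    def increment(self):
        # Local update: no coordination with other replicas is needed.
        self.counts[self.replica_id] += 1

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent,
        # so state exchanges can happen in any order, any number of times.
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

    def value(self):
        return sum(self.counts)

# Two replicas progress in parallel, temporarily diverging...
a = GCounter(0, 2)
b = GCounter(1, 2)
a.increment(); a.increment()   # replica 0 applies two updates
b.increment()                  # replica 1 applies one update

# ...then exchange state in either order: both converge to 3.
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 3
```

The same idea extends to sets, maps, and sequences, at the cost of designing operations whose effects commute by construction.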

Doctoral candidate: Cincilla Pierpaolo