Related Projects

Full title

Advantages of diversity for off-the-shelf SQL database servers: some empirical results


Diversity, fault-tolerance, performance, SQL servers


This work, building upon previous research in the DOTS project, explored the possible gains in dependability and performance from using off-the-shelf database servers.
We have shown empirical evidence that current data replication solutions are insufficient to protect against the range of faults documented for database servers [1]. In particular, we refuted the common assumption that fail-safe failures (i.e. crashes) are the main problem to be solved by database replication [2]. This conclusion may cast serious doubt on the adequacy of the current state of the art in database replication. Diverse redundancy is the only known technique that avoids this limitation, and we argued strongly in favour of using it for database replication. We have outlined possible fault-tolerant architectures using diverse servers and discussed the design problems involved.

We have also demonstrated empirically the potential for performance improvement through diverse redundancy. Diverse SQL servers exhibit systematic differences in their processing of SQL statements: server A executes some types of statements faster than server B, while B executes other types faster than A. Deploying diverse SQL servers in parallel thus allows a performance boost (e.g. when the fastest response is returned to the client) that is impossible to achieve with multiple replicas of the same SQL server. The performance gains achievable with diverse redundancy vary with the application profile and can be significant for read-intensive applications.
Finally, we argued, and have demonstrated empirically, that intelligent trade-offs between the dependability and performance of a fault-tolerant SQL server built with diverse SQL servers can be struck in the form of service level agreements (SLAs); such trade-offs are impossible without diversity.

Use of SQL server diversity for dependability improvements

Fault tolerance is often the only viable way of obtaining the required system dependability from systems built out of “off-the-shelf” (OTS) products. We studied a sample of bug reports from four off-the-shelf SQL database servers to estimate the possible advantages of software fault tolerance, in the form of diverse redundancy, in complex off-the-shelf software. We checked whether these bugs would cause coincident failures in more than one of the servers. We found that very few bugs affected two of the four servers, and none caused failures, on the same demand, in more than two. We also found that only four of these bugs would cause identical, undetectable failures in two servers. Since these results concerned only a certain snapshot in the evolution of these servers, we then repeated the study with new bugs reported for later releases of two of these servers, both open-source (a paper is in preparation; for updates, check City’s CSR diversity page). Again we found that very few bugs cause coincident failures. In both studies, very few bugs caused identical, undetectable failures in two servers.
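
The classification used in such a study can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the studies: the Outcome record, the classify function, the category labels and the server names A to D are all hypothetical. It assumes that each bug report has been run as the same demand on every server and that, for each server, we know whether it failed, whether the failure was self-evident (e.g. a crash or an error message), and what result it returned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Outcome:
    """Observed behaviour of one server on one demand (hypothetical record)."""
    server: str
    failed: bool                  # did the server deviate from correct behaviour?
    self_evident: bool            # crash, error message, etc.
    result: Optional[str] = None  # normalised result returned to the client


def classify(outcomes: List[Outcome]) -> str:
    """Classify one demand by how many servers fail on it and how."""
    failures = [o for o in outcomes if o.failed]
    if not failures:
        return "no failure"
    if len(failures) == 1:
        return "single-server failure"
    # Two or more servers fail on the same demand: a coincident failure.
    # If two of them silently return the same wrong result, comparing
    # the results of that pair cannot reveal the failure.
    silent = [o.result for o in failures if not o.self_evident]
    if len(silent) != len(set(silent)):
        return "coincident, identical and undetectable"
    return "coincident, detectable"


# Example demand: A and B silently return the same wrong result.
demand = [
    Outcome("A", failed=True, self_evident=False, result="3 rows"),
    Outcome("B", failed=True, self_evident=False, result="3 rows"),
    Outcome("C", failed=False, self_evident=False, result="4 rows"),
    Outcome("D", failed=False, self_evident=False, result="4 rows"),
]
print(classify(demand))
```

Counting the demands that fall into each category gives figures of the kind reported above, such as how many bugs cause coincident failures and how many of those would go undetected by a diverse pair.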

We also studied a sample of bugs reported for later releases of these servers. We checked whether these bugs cause failures in the earlier releases and observed that a significant number of them do not. These results suggest that a limited degree of fault tolerance can be obtained by using different releases of the same server type.

In addition, we studied the possible gains in fault tolerance from exploiting data diversity, i.e. rephrasing an SQL statement into a logically equivalent statement or sequence of statements. We have defined a number of generic rephrasing rules that we propose to use in a server-diverse setting for diagnosing the faulty server and for state recovery.
Overall, a fault-tolerant server built with diverse off-the-shelf servers seems to have a good chance of delivering improvements in availability and failure rates compared with the individual off-the-shelf servers or their replicated, non-diverse configurations.
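
The use of rephrasing for diagnosis, mentioned above, can be illustrated with a small Python sketch. It is a sketch under stated assumptions and does not reproduce the rephrasing rules defined in the project: the rule shown (replacing BETWEEN with a pair of comparisons), the item table, the make_server and suspects helpers, and the simulated bug are all hypothetical. Two in-memory SQLite databases stand in for the diverse servers, and one of them is made to drop a row from BETWEEN queries. A server whose answers to the original and the rephrased statement differ is flagged as the suspect.

```python
import sqlite3
from collections import Counter

ORIGINAL = "SELECT id FROM item WHERE price BETWEEN 10 AND 20"
# Hypothetical rephrasing rule: BETWEEN a AND b  ->  >= a AND <= b
REPHRASED = "SELECT id FROM item WHERE price >= 10 AND price <= 20"


def make_server(faulty=False):
    """Return a callable 'server' backed by an in-memory SQLite database.

    With faulty=True the server simulates a bug (an assumption made for
    illustration only): it drops one row from any BETWEEN query.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE item(id INTEGER, price INTEGER);"
        "INSERT INTO item VALUES (1, 5), (2, 10), (3, 15), (4, 20);"
    )

    def execute(sql):
        rows = conn.execute(sql).fetchall()
        if faulty and "BETWEEN" in sql.upper():
            rows = rows[:-1]  # simulated wrong result
        return Counter(rows)  # compare results as multisets of rows

    return execute


def suspects(servers, original, rephrased):
    """Names of servers that answer the two equivalent statements differently."""
    return [name for name, execute in servers.items()
            if execute(original) != execute(rephrased)]


servers = {"A": make_server(), "B": make_server(faulty=True)}
print(suspects(servers, ORIGINAL, REPHRASED))
```

Results are compared as multisets because SQL does not guarantee row order without ORDER BY. A check of this kind only flags a server whose bug affects one phrasing and not the other; a bug that affects both phrasings in the same way would pass unnoticed.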

Use of SQL server diversity for performance improvements

We studied the performance effects of using diverse servers, with an implementation of the industry-standard TPC-C benchmark as the client. We developed prototype middleware for database replication with diverse off-the-shelf SQL servers and conducted a series of systematic measurements under three regimes: i) pessimistic, under which the diverse servers vote on the outcome of each individual statement [3], i.e. we wait for both servers' responses before reporting the result to the client; ii) optimistic, under which the fastest response to a statement is reported to the client, assuming that only crash failures are possible; iii) optimistic on the individual statements, but voting on all statements before the transaction is committed. The preliminary results suggest that, for the optimistic regimes of operation, diverse redundancy can yield performance gains in comparison with non-diverse replication. In these cases the performance penalty due to diverse replication is small: the diverse replicated server always performs better than the slower non-replicated server and almost as well as the faster non-replicated server.
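
The three regimes can be sketched in Python as follows. This is a minimal illustration and not the prototype middleware itself: the simulated servers, their delays, and the names make_server, pessimistic, optimistic, hybrid_transaction and Mismatch are assumptions introduced for the example. Each "server" is a function that sleeps for a delay depending on the statement type and then returns a canned result.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class Mismatch(Exception):
    """Raised when the diverse servers disagree on a statement."""


def make_server(delays, wrong=()):
    """Simulated server: per-statement delay, canned (possibly wrong) result."""
    def execute(stmt):
        time.sleep(delays[stmt])
        return f"wrong({stmt})" if stmt in wrong else f"result({stmt})"
    return execute


def pessimistic(pool, servers, stmt):
    """Regime i: wait for every server and vote on each statement."""
    results = [f.result() for f in [pool.submit(s, stmt) for s in servers]]
    if len(set(results)) != 1:
        raise Mismatch(stmt)
    return results[0]


def optimistic(pool, servers, stmt):
    """Regime ii: report the fastest response (assumes crash failures only)."""
    futures = [pool.submit(s, stmt) for s in servers]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    return next(iter(done)).result()


def hybrid_transaction(pool, servers, stmts):
    """Regime iii: fastest response per statement, vote on all before commit."""
    replies, pending = [], []
    for stmt in stmts:
        futures = [pool.submit(s, stmt) for s in servers]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        replies.append(next(iter(done)).result())
        pending.append((stmt, futures))
    for stmt, futures in pending:  # the commit-time vote
        if len({f.result() for f in futures}) != 1:
            raise Mismatch(stmt)
    return replies


# Server A is faster on reads, server B on writes (made-up delays).
server_a = make_server({"select": 0.01, "update": 0.05})
server_b = make_server({"select": 0.05, "update": 0.01})
servers = [server_a, server_b]

with ThreadPoolExecutor(max_workers=8) as pool:
    print(pessimistic(pool, servers, "select"))
    print(optimistic(pool, servers, "update"))
    print(hybrid_transaction(pool, servers, ["select", "update"]))
```

In this toy setting the pessimistic regime pays the delay of the slower server on every statement, while the optimistic regimes pay roughly the delay of the faster one, which is the effect described above. The sketch submits each statement to all servers at once; a real middleware would also have to keep the servers' states consistent, which is not modelled here.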




[1] Gashi I., Popov P., Stankovic V., Strigini L., "On Designing Dependable Services with Diverse Off-The-Shelf SQL Servers", in "Architecting Dependable Systems II", Lecture Notes in Computer Science (R. de Lemos, C. Gacek and A. Romanovsky, Eds.), vol. 3069, pp. 191-214, Springer-Verlag, 2004.

[2] Gashi I., Popov P., Strigini L., "Fault diversity among off-the-shelf SQL database servers", Proc. DSN 2004, International Conference on Dependable Systems and Networks, Florence, Italy, IEEE Computer Society Press, pp. 389-398, 2004.

[3] Popov P., Strigini L., Kostov A., Mollov V., Selensky D., "Software Fault-Tolerance with Off-the-Shelf SQL Servers", Proc. 3rd International Conference on COTS-Based Software Systems (ICCBSS'04), 2-4 Feb. 2004, Redondo Beach, CA, U.S.A., pp. 117-126, Springer, 2004.



Peter Popov, Vladimir Stankovic, Ilir Gashi (City)


Last Modified: 11 August, 2005