PRINS: Optimizing performance of reliable internet storages

Document Type

Conference Proceeding

Date of Original Version



Distributed storage systems employ replicas or erasure code to ensure high reliability and availability of data. Such replicas create great amount of network traffic that negatively impacts storage performance, particularly for distributed storage systems that are geographically dispersed over a wide area network (WAN). This paper presents a performance study of our new data replication methodology that minimizes network traffic for data replications. The idea is to replicate the parity of a data block upon each write operation instead of the data block itself. The data block will be recomputed back at the replica storage site upon receiving the parity. We name the new methodology PRINS (Parity Replication in IP-Network Storages). PRINS trades off high-speed computation for communication that is costly and more likely to be the performance bottleneck for distributed storages. By leveraging the parity computation that exists in common storage systems (RAID), our PRINS does not introduce additional overhead but dramatically reduces network traffic. We have implemented PRINS using iSCSI protocol over a TCP/IP network interconnecting a cluster of PCs as storage nodes. We carried out performance measurements on Oracle database, Postgres database, MySQL database, and Ext2 file system using TPC-C, TPC-W, and Micro benchmarks. Performance measurements show up to 2 orders of magnitudes bandwidth savings of PRINS compared to traditional replicas. A queueing network model is developed to further study network performance for large networks. It is shown that PRINS reduces response time of the distributed storage systems dramatically. © 2006 IEEE.

Publication Title

Proceedings - International Conference on Distributed Computing Systems