A case for continuous data protection at block level in disk array storages
Document Type
Article
Date of Original Version
4-21-2009
Abstract
This paper presents a study of data storage for continuous data protection (CDP). After analyzing existing data protection technologies, we propose a new disk array architecture that provides Timely Recovery to Any Point-in-time, referred to as TRAP. TRAP stores not only the data stripe upon a write to the array but also the time-stamped exclusive ORs (XORs) of successive writes to each data block. By leveraging the XOR operations that are performed upon each block write in today's RAID4/5 controllers, TRAP does not incur noticeable performance overhead. More importantly, TRAP is able to recover data very quickly to any point-in-time upon data damage by tracing back the sequence and history of XORs resulting from writes. The TRAP architecture is also very space efficient. We have implemented a prototype of the new TRAP architecture in software at the block level and carried out extensive performance measurements using TPC-C benchmarks running on Oracle and Postgres databases, TPC-W running on a MySQL database, and file system benchmarks running on Linux and Windows systems. Our experiments demonstrated that TRAP not only recovers data to any point-in-time very quickly upon a failure but also uses less storage space than traditional daily incremental backup/snapshot. Compared to state-of-the-art CDP technologies, TRAP saves disk storage space by one to two orders of magnitude with a simple and fast encoding algorithm. In addition, TRAP can provide two-way data recovery with only one reference image available, in contrast to the one-way recovery of snapshot and incremental backup technologies. © 2009 IEEE.
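The recovery idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TrapBlock`, the helpers `xor_bytes` and `recover_forward`, and the integer timestamps are all hypothetical names chosen for illustration, and the sketch omits the encoding step the abstract credits for the space savings. It assumes fixed-size blocks and strictly increasing timestamps, and it shows only that a log of time-stamped XOR deltas allows recovery in both directions from a single reference image.

```python
from dataclasses import dataclass, field


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class TrapBlock:
    """One data block plus its time-stamped XOR log (hypothetical layout)."""
    current: bytes
    log: list = field(default_factory=list)  # entries: (timestamp, old XOR new)

    def write(self, timestamp: int, new_data: bytes) -> None:
        # Keep the XOR of the old and new contents, tagged with the write time.
        # Assumption: timestamps passed in are strictly increasing.
        self.log.append((timestamp, xor_bytes(self.current, new_data)))
        self.current = new_data

    def recover_backward(self, target_time: int) -> bytes:
        """Undo every write made after target_time, starting from the newest image."""
        image = self.current
        for timestamp, delta in reversed(self.log):
            if timestamp <= target_time:
                break
            image = xor_bytes(image, delta)  # XOR is its own inverse
        return image


def recover_forward(original: bytes, log: list, target_time: int) -> bytes:
    """Replay writes up to target_time, starting from the oldest image."""
    image = original
    for timestamp, delta in log:
        if timestamp > target_time:
            break
        image = xor_bytes(image, delta)
    return image
```

Because XOR is its own inverse, the same log serves both directions: applying deltas newest-first to the current image walks backward in time, and applying them oldest-first to the original image walks forward.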
Publication Title
IEEE Transactions on Parallel and Distributed Systems
Volume
20
Issue
6
Citation/Publisher Attribution
Xiao, Weijun, Jin Ren, and Qing Yang. "A case for continuous data protection at block level in disk array storages." IEEE Transactions on Parallel and Distributed Systems 20, 6 (2009): 898-911. doi: 10.1109/TPDS.2008.154.