A parallel and pipelined architecture for accelerating fingerprint computation in high throughput data storages

Document Type

Conference Proceeding

Date of Original Version



Rabin fingerprints are short tags for large objects that can be used in a wide range of applications, such as data deduplication, web querying, packet routing, and caching. We present a pipelined hardware architecture for computing Rabin fingerprints on data being transferred on a high-throughput bus. The design performs real-time fingerprinting with low latency, and can be tuned for an optimized clock rate with the 'split fresh' technique. Pipelined sampling logic selects fingerprints based on the Minwise theory and adds only a few clock cycles of latency before returning the final results. The design can be replicated to work in parallel for higher-throughput data traffic. This architecture is implemented on a Xilinx Virtex-6 FPGA and is tested on a storage prototyping platform. The implementation shows that the design can achieve clock rates above 300 MHz with an order of magnitude improvement in latency over prior software implementations, while consuming few hardware resources. The scheme is extensible to other types of fingerprints and CRC computations, and is readily applicable to primary storage and caches in hybrid storage systems.
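
For readers unfamiliar with the two operations the abstract names, the following is a minimal software sketch, not the hardware design described in the paper. It assumes a Rabin fingerprint is the remainder of the data, read as a polynomial over GF(2), modulo a fixed irreducible polynomial, and that min-wise sampling keeps the k smallest fingerprints of a sliding window. The polynomial (degree 8), the window size, the value of k, and all function names are hypothetical choices made for illustration; a practical system would likely use a much higher-degree polynomial.

```python
import heapq

# Assumption: a small irreducible polynomial over GF(2), chosen only for
# illustration: x^8 + x^4 + x^3 + x + 1. The paper's polynomial is not given here.
POLY = 0x11B


def rabin_fingerprint(data: bytes, poly: int = POLY) -> int:
    """Return the remainder of `data` (bits read MSB first) modulo `poly` over GF(2)."""
    degree = poly.bit_length() - 1
    fp = 0
    for byte in data:
        for bit in range(7, -1, -1):
            # Shift in the next message bit.
            fp = (fp << 1) | ((byte >> bit) & 1)
            # Reduce whenever the degree of the running value reaches that of `poly`.
            if fp >> degree:
                fp ^= poly
    return fp


def windowed_fingerprints(data: bytes, window: int = 4) -> list[int]:
    """Fingerprint every `window`-byte substring (hypothetical window size)."""
    return [
        rabin_fingerprint(data[i:i + window])
        for i in range(len(data) - window + 1)
    ]


def minwise_sample(fingerprints: list[int], k: int = 4) -> list[int]:
    """Keep the k smallest distinct fingerprints, in ascending order."""
    return heapq.nsmallest(k, set(fingerprints))


if __name__ == "__main__":
    block = b"pipelined fingerprint computation"
    print(minwise_sample(windowed_fingerprints(block)))
```

This sketch recomputes each window from scratch for clarity. A streaming implementation would typically update the fingerprint incrementally as bytes arrive, which is the kind of work a pipelined hardware design is suited to.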

Publication Title

Proceedings - 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2015