Date of Award


Degree Type


Degree Name

Doctor of Philosophy in Computer Engineering


Electrical, Computer, and Biomedical Engineering

First Advisor

Qing Yang


Flash-memory-based solid-state drives (SSDs) have received a lot of attention recently. An SSD is a semiconductor device that offers high-speed random reads, low power consumption, compact size, and shock resistance. Traditional storage systems and algorithms are designed for hard disk drives (HDDs); they do not work well on SSDs because of the SSD's asymmetric read/write performance and unavoidable internal activities, such as garbage collection (GC). There is a great need to optimize current storage systems and algorithms to accelerate data access on SSDs. This dissertation presents four methods that improve storage system performance by exploiting the characteristics of SSDs.

GC is one of the critical overheads of any flash-memory-based SSD. It slows down I/O performance and reduces SSD endurance. This dissertation introduces two methods to minimize the negative impact of GC: “WARCIP: Write Amplification Reduction by Clustering I/O Pages” and “Thermo-GC: Reducing Write Amplification by Tagging Migrated Pages during Garbage Collection”. WARCIP uses a clustering algorithm to minimize the rewrite interval variance of the pages in a flash block. As a result, pages in a flash block tend to have similar lifetimes, which minimizes valid page migrations during GC. The idea of Thermo-GC is to identify the hotness of data during GC operations and to group data with similar lifetimes into the same block. By clustering valid pages based on their hotness, Thermo-GC minimizes valid page movements and reduces GC cost. Experimental results show that both WARCIP and Thermo-GC improve SSD performance and reduce data movements during GC, implying extended SSD lifetimes.
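The effect described above, that blocks holding pages of similar lifetime need fewer valid page migrations, can be illustrated with a toy model. This is a minimal sketch and not the WARCIP or Thermo-GC algorithm: the function names `pack` and `gc_migrations`, the block size of 4, and the rule that GC reclaims every block containing at least one invalid page are all assumptions made for illustration.

```python
from typing import List, Set


def pack(pages: List[str], block_size: int) -> List[List[str]]:
    """Split a write sequence into consecutive flash blocks (hypothetical helper)."""
    return [pages[i:i + block_size] for i in range(0, len(pages), block_size)]


def gc_migrations(blocks: List[List[str]], invalidated: Set[str]) -> int:
    """Count valid pages that must be copied out before erasing.

    Assumption: GC reclaims every block that holds at least one invalid page.
    """
    migrated = 0
    for block in blocks:
        invalid = [p for p in block if p in invalidated]
        if invalid:
            migrated += len(block) - len(invalid)
    return migrated


hot = [f"h{i}" for i in range(8)]   # pages that are rewritten soon
cold = [f"c{i}" for i in range(8)]  # pages that stay valid

# Mixed layout: hot and cold pages interleaved in every block.
mixed = pack([p for pair in zip(hot, cold) for p in pair], block_size=4)
# Grouped layout: pages with similar lifetimes share a block.
grouped = pack(hot + cold, block_size=4)

rewritten = set(hot)  # all hot pages have been overwritten, so their old copies are invalid
mixed_cost = gc_migrations(mixed, rewritten)
grouped_cost = gc_migrations(grouped, rewritten)
print(mixed_cost, grouped_cost)
```

In this toy setup every mixed block still holds two valid cold pages that must be copied before the block can be erased, while the grouped hot blocks are entirely invalid and can be erased without any copying.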

An SSD fits naturally as a cache between system RAM and the hard disk drive because of its performance/cost characteristics. However, traditional cache replacement policies are designed for hard disk drives and do not work well on SSDs because of the SSD's asymmetric read/write performance and wear issues. This dissertation presents a new cache management algorithm. The idea is not to cache data in the SSD upon the first access. Instead, data are cached only once they are determined to be hot enough to warrant caching in the SSD. Data cached in the SSD are managed using an asymmetrical replacement policy for reads and writes by means of conservative promotion upon hits.
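The admission idea in the paragraph above (do not cache on first access, cache only once data look hot) can be sketched as follows. This is a hedged illustration, not the dissertation's algorithm: the class name `LazyAdmissionCache`, the `admit_threshold` parameter, and the use of an access count as the hotness test are assumptions, and plain LRU eviction stands in for the asymmetrical read/write replacement policy, whose details the abstract does not give.

```python
from collections import Counter, OrderedDict


class LazyAdmissionCache:
    """Toy SSD cache that admits a block only after repeated accesses."""

    def __init__(self, capacity: int, admit_threshold: int = 2):
        self.capacity = capacity
        self.admit_threshold = admit_threshold  # hypothetical hotness threshold
        self.access_counts = Counter()          # accesses seen for uncached blocks (unbounded in this sketch)
        self.cache = OrderedDict()              # cached block ids, least recently used first

    def access(self, block_id: str) -> bool:
        """Record one access and return True on a cache hit."""
        if block_id in self.cache:
            self.cache.move_to_end(block_id)    # LRU stand-in, not the source's policy
            return True
        self.access_counts[block_id] += 1
        if self.access_counts[block_id] >= self.admit_threshold:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the least recently used block
            self.cache[block_id] = True         # block is now considered hot: admit it
            del self.access_counts[block_id]
        return False


cache = LazyAdmissionCache(capacity=2, admit_threshold=2)
results = [cache.access(b) for b in ["a", "b", "a", "a", "c"]]
print(results, list(cache.cache))
```

Blocks touched only once ("b" and "c" here) never reach the SSD, which avoids spending flash writes on data that is unlikely to be reused.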

The nonvolatile nature of SSDs allows cached data to persist even after power failures or system crashes, so the system can benefit from a hot restart. Current research on SSD caches has mainly focused on cache architectures or management algorithms that optimize performance under normal working conditions. Few studies have exploited the durability of SSD caches across system crashes or power failures. To enhance storage system performance with a hot restart, this dissertation proposes a new metadata update technique that maximizes usable data upon a system restart. The new metadata update method runs two-fold faster than Facebook's Flashcache after recovering from a system crash or power failure.

Overall, this dissertation describes two methods that reduce the valid page migrations caused by the SSD's internal activities, and two cache management algorithms that improve SSD cache performance. These four methods overcome current shortcomings of SSD usage in industry and improve storage system performance efficiently.