Date of Award
Doctor of Philosophy in Computer Engineering
Electrical, Computer, and Biomedical Engineering
Solid-state drives (SSDs) offer compelling advantages, such as high performance and low cost, over traditional hard disks. As a result, SSDs have been increasingly deployed over the past few years in systems ranging from datacenters to personal computers. However, the advent of the big data era and ever-increasing workloads place a heavy burden on both I/O buses and SSDs' internal operations, which hinders the further adoption of SSDs in the market. There is therefore a great need to eliminate performance bottlenecks in SSDs and SSD-based storage systems so that SSDs can keep pace with high-performance computing. The studies introduced in this dissertation tackle these challenges by fully exploiting SSDs' internal characteristics.
The first study addresses the I/O bottleneck in SSD-based storage systems. The idea is to offload compute-intensive tasks to where the data reside. A dedicated engine for regular expression (regex) search is designed and integrated into the SSD. The regex search engine processes the data while they are transferred from NAND flash to the host. A library of supporting functions is created to make the hardware engine usable by applications on the host side. For evaluation, the hardware engine is implemented in an FPGA, and the overall design is evaluated in an NVMe-based SSD using real-world regex cases. Experimental results show that I/O bandwidth utilization is reduced by up to 97% and CPU utilization by as much as 82% for large data sets.
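The bandwidth saving from searching where the data reside can be illustrated with a minimal software sketch. This is not the dissertation's hardware engine or its host library: the function name `in_storage_search`, the 4096-byte page size, and the sample pages are all assumptions made for illustration, and matches that span a page boundary are ignored for simplicity.

```python
import re

PAGE_SIZE = 4096  # assumed flash page size in bytes (illustrative only)


def in_storage_search(pages, pattern):
    """Scan pages one at a time and keep only the matching bytes.

    Hypothetical stand-in for a device-side regex engine: the host
    receives the matches instead of every page that was scanned.
    """
    regex = re.compile(pattern)
    matches = []
    scanned = 0
    for page in pages:
        scanned += len(page)
        matches.extend(m.group(0) for m in regex.finditer(page))
    returned = sum(len(m) for m in matches)
    return matches, scanned, returned


# Three made-up pages; only two contain a record of interest.
pages = [
    b"a" * (PAGE_SIZE - 6) + b"ERR 42",
    b"b" * PAGE_SIZE,
    b"c" * (PAGE_SIZE - 7) + b"ERR 907",
]

matches, scanned, returned = in_storage_search(pages, rb"ERR \d+")
print(matches, scanned, returned)
```

In this toy case the device scans 12288 bytes but only 13 bytes of matches would need to cross the I/O bus, which is the effect the offloading design aims for.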
The second study focuses on SSD performance isolation in multitenant environments. It presents a method to ensure quality of service (QoS), especially when thousands of tenants share the SSD. The idea is to resolve I/O conflicts in flash memory by placing data properly. Moreover, performance metrics (i.e., throughput/IOPS and latency) are directly incorporated as criteria when scheduling I/O transactions inside the SSD. The overall design is implemented in an SSD simulator and verified using a large set of real-world I/O traces that stand in for thousands of tenants. The experiments show significant reductions in tail latencies, which indicates that the design can greatly reduce performance violations.
The third study aims at ensuring steady I/O performance in SSDs. According to the study, the primary cause of performance fluctuations is garbage collection (GC). GC is a background process in which the SSD reclaims free space, and it is necessary for SSDs to function well. However, GC not only induces performance overhead but also shortens the SSD's lifespan because of flash memory erasure. To minimize GC-induced performance fluctuations, an adaptive GC strategy that distributes GC operations evenly over time is proposed. The GC strategy is derived by formulating a stochastic optimization model that characterizes the nature of GC processes. Several factors are considered to make the algorithm practical. A thorough evaluation shows that the proposed method can significantly reduce performance fluctuations and increase the SSD's lifespan because fewer GC operations are triggered.
The fourth study reduces write amplification and GC count by separating hot and cold pages. When data in the same block have similar lifespans, they tend to be invalidated together, so less data needs to be migrated during GC. The key observation that motivates the study is that valid data in a block tend to have similar lifespans as GC goes on. Thus, the GC process itself can be exploited by using blocks' lifespans to identify the hotness of data. The proposed method is evaluated in terms of write amplification, erasure count, I/O performance, and additional resources required. Results show that this approach decreases GC count, increases endurance, and reduces average I/O response time because less data needs to be migrated during GC.
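The link between hot/cold separation and write amplification can be shown with a small worked example. It is a sketch under simplifying assumptions, not the dissertation's method: hotness is given up front rather than inferred from block lifespans, every hot page is overwritten exactly once, and the names `migration_cost` and `write_amplification` are hypothetical.

```python
def migration_cost(block, hot_pages):
    """Pages GC must copy out before erasing `block`.

    Assumes every hot page has been overwritten (so its old copy is
    invalid) and every cold page is still valid.
    """
    return sum(1 for page in block if page not in hot_pages)


def write_amplification(host_writes, migrated):
    """Flash writes divided by host writes."""
    return (host_writes + migrated) / host_writes


hot = {"h0", "h1", "h2", "h3"}

# Two placements of the same eight pages into two 4-page blocks.
mixed = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]
separated = [["h0", "h1", "h2", "h3"], ["c0", "c1", "c2", "c3"]]

# GC picks the block that is cheapest to reclaim.
mixed_cost = min(migration_cost(b, hot) for b in mixed)
separated_cost = min(migration_cost(b, hot) for b in separated)

# The host rewrote the four hot pages.
wa_mixed = write_amplification(4, mixed_cost)
wa_separated = write_amplification(4, separated_cost)
print(mixed_cost, separated_cost, wa_mixed, wa_separated)
```

With mixed placement the cheapest victim block still holds two valid cold pages, giving a write amplification of 1.5; with separated placement the hot block is entirely invalid and can be erased with no migration, giving 1.0.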
Pei, Shuyi, "REMOVING PERFORMANCE BOTTLENECKS ON SSDS AND SSD-BASED STORAGE SYSTEMS" (2021). Open Access Dissertations. Paper 1235.
Available for download on Saturday, April 01, 2023