Performance of garbage collection algorithms for flash-based solid state drives with hot/cold data
Faculty of Sciences. Mathematics and Computer Science
Performance evaluation. - Amsterdam, p. 692-703
University of Antwerp
To avoid poor random write performance, flash-based solid state drives typically rely on an internal log-structure. This log-structure reduces the write amplification and thereby improves the write throughput and extends the drive's lifespan. In this paper, we analyze the performance of the log-structure combined with the d-choices garbage collection algorithm, which repeatedly selects the block with the fewest valid pages out of a set of d randomly chosen blocks, and we consider non-uniform random write workloads. Using a mean field model, we show that the write amplification worsens as the hot data gets hotter. Next, we introduce the double log-structure, which uses separate logs for internal and external write requests. Although the double log-structure performs identically to the single log-structure under uniform random writes, we show that it drastically reduces the write amplification of the d-choices algorithm in the presence of hot data. In other words, the double log-structure yields an automatic form of data separation. Further, with the double log-structure there exists an optimal value for d (typically around 10), meaning that the greedy garbage collection algorithm is no longer optimal. Finally, both mean field models introduced in this paper are validated using simulation experiments. (C) 2013 Elsevier B.V. All rights reserved.
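The victim-selection rule of the d-choices algorithm described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `d_choices_victim`, the list-based representation of per-block valid-page counts, and the choice to sample the d candidate blocks without replacement are all assumptions made for illustration.

```python
import random


def d_choices_victim(valid_pages, d, rng=random):
    """Pick the block to erase under a d-choices rule (illustrative sketch).

    valid_pages: list where valid_pages[i] is the number of valid pages
                 currently stored in block i (hypothetical representation).
    d:           number of candidate blocks to inspect.
    rng:         the random module or a random.Random instance.
    """
    if not 1 <= d <= len(valid_pages):
        raise ValueError("d must be between 1 and the number of blocks")
    # Assumption: the d candidates are drawn without replacement.
    candidates = rng.sample(range(len(valid_pages)), d)
    # Among the candidates, erase the block holding the fewest valid pages.
    return min(candidates, key=lambda i: valid_pages[i])


# Hypothetical drive state: five blocks with these valid-page counts.
blocks = [5, 2, 7, 0, 3]
victim = d_choices_victim(blocks, d=3, rng=random.Random(1))
```

Under these assumptions, d = 1 reduces to picking a block uniformly at random, while setting d to the total number of blocks always returns a block with the global minimum, which is the greedy policy the abstract compares against.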