Title
On the necessity of hot and cold data identification to reduce the write amplification in flash-based SSDs
Author
Faculty/Department
Faculty of Sciences. Mathematics and Computer Science
Publication type
article
Publication
Amsterdam
Subject
Computer. Automation
Source (journal)
Performance evaluation. - Amsterdam
Volume/pages
82 (2014), p. 1-14
ISSN
0166-5316
ISI
000347495000001
Carrier
E
Target language
English (eng)
Full text (Publishers DOI)
Affiliation
University of Antwerp
Abstract
The write performance and life span of a solid state drive are greatly influenced by the garbage collection algorithm. This algorithm selects the data blocks to be erased, which can subsequently be used for storing new data. Any valid data left on a selected block needs to be written elsewhere before the block can be erased, and this contributes to the so-called write amplification. As not all of the data on a solid state drive is accessed equally often, data identification techniques have been proposed that distinguish the more frequently accessed (hot) data from the less frequently accessed (cold) data. These data identification techniques have been shown to be quite effective in reducing the write amplification, essentially by using different blocks to store the hot and cold data, but they also contribute to the complexity of the device. Write approaches that use different blocks for writes triggered by the operating system and writes triggered by the garbage collection algorithm have also been proposed. These approaches do not require a data identification technique and thus simplify the design of the device, while also reducing the write amplification. In this paper we compare the performance of such a write approach with write approaches that do rely on data identification, using both mean field models and simulation experiments. The main finding is that the added gain of identifying hot and cold data is quite limited, especially as the hot data gets hotter. Moreover, the write approaches relying on hot and cold data identification may even become inferior if either the fraction of data labeled hot is not ideally chosen or the probability of false positives or negatives when identifying data is substantial (e.g. 5%).
E-info
https://repository.uantwerpen.be/docman/iruaauth/8fd211/624131bbc41.pdf
Full text (open access)
https://repository.uantwerpen.be/docman/irua/e325b7/9559.pdf
E-info
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000347495000001&DestLinkType=RelatedRecords&DestApp=ALL_WOS&UsrCustomerID=ef845e08c439e550330acc77c7d2d848
http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000347495000001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=ef845e08c439e550330acc77c7d2d848
Handle