DFTL - Concepedia

TLDR

Flash memory leads the embedded storage market and is moving into enterprise storage because it has no moving parts, no seek or rotational delays, and lower power consumption than hard disks, yet its performance is highly workload‑dependent. The study addresses the poor random‑write performance of flash by examining the Flash Translation Layer (FTL), which translates virtual addresses to physical ones and hides flash's erase‑before‑write behavior. The authors propose the Demand‑based Flash Translation Layer (DFTL), which selectively caches page‑level address mappings, and develop a flash simulation framework called FlashSim. Experiments on realistic enterprise workloads show that, compared with a state‑of‑the‑art FTL scheme, DFTL improves average response time by 78% on a random‑write‑dominant OLTP trace and system response time by 56% on the read‑dominant TPC‑H benchmark, while reducing garbage‑collection overhead and improving overload behavior.

Abstract

Recent technological advances in the development of flash-memory based devices have consolidated their leadership position as the preferred storage media in the embedded systems market and opened new vistas for deployment in enterprise-scale storage systems. Unlike hard disks, flash devices are free from any mechanical moving parts, have no seek or rotational delays and consume lower power. However, the internal idiosyncrasies of flash technology make its performance highly dependent on workload characteristics. The poor performance of random writes has been a cause of major concern, which needs to be addressed to better utilize the potential of flash in enterprise-scale environments. We examine one of the important causes of this poor performance: the design of the Flash Translation Layer (FTL), which performs the virtual-to-physical address translations and hides the erase-before-write characteristics of flash. We propose a complete paradigm shift in the design of the core FTL engine from the existing techniques with our Demand-based Flash Translation Layer (DFTL), which selectively caches page-level address mappings. We develop a flash simulation framework called FlashSim. Our experimental evaluation with realistic enterprise-scale workloads endorses the utility of DFTL in enterprise-scale storage systems by demonstrating: (i) improved performance, (ii) reduced garbage collection overhead and (iii) better overload behavior compared to state-of-the-art FTL schemes. For example, a predominantly random-write dominant I/O trace from an OLTP application running at a large financial institution shows a 78% improvement in average response time (due to a 3-fold reduction in operations of the garbage collector), compared to a state-of-the-art FTL scheme. Even for the well-known read-dominant TPC-H benchmark, for which DFTL introduces additional overheads, we improve system response time by 56%.
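The core idea in the abstract, selectively caching page-level address mappings and fetching the rest on demand, can be illustrated with a small sketch. This is not the authors' implementation: the class name DemandMappingCache, the plain LRU eviction policy, and the Python dict standing in for the full mapping table kept on flash are all assumptions made for illustration, since the abstract does not describe the replacement policy or the on-flash layout.

```python
from collections import OrderedDict


class DemandMappingCache:
    """Toy model of a demand-loaded, page-level mapping cache.

    Hypothetical sketch: plain LRU eviction is an assumption, and
    flash_mapping_table is a dict standing in for the full
    logical-to-physical map that would live on flash.
    """

    def __init__(self, capacity, flash_mapping_table):
        self.capacity = capacity
        self.flash_table = flash_mapping_table
        self.cache = OrderedDict()  # logical page -> (physical page, dirty flag)
        self.hits = 0
        self.misses = 0
        self.writebacks = 0

    def _load(self, lpn):
        """Ensure the mapping for lpn is cached, evicting the LRU entry if full."""
        if lpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(lpn)  # mark as most recently used
            return
        self.misses += 1
        if len(self.cache) >= self.capacity:
            victim, (ppn, dirty) = self.cache.popitem(last=False)
            if dirty:
                # A modified mapping must be written back before it is dropped.
                self.flash_table[victim] = ppn
                self.writebacks += 1
        self.cache[lpn] = (self.flash_table.get(lpn), False)

    def translate(self, lpn):
        """Return the physical page for a logical page (read path)."""
        self._load(lpn)
        return self.cache[lpn][0]

    def update(self, lpn, new_ppn):
        """Record that a logical page now lives at new_ppn (write path)."""
        self._load(lpn)
        self.cache[lpn] = (new_ppn, True)


# Small demonstration with a two-entry cache over an eight-page map.
table = {lpn: 100 + lpn for lpn in range(8)}
ftl = DemandMappingCache(capacity=2, flash_mapping_table=table)
ftl.translate(0)    # miss: mapping fetched on demand
ftl.translate(0)    # hit: served from the cache
ftl.update(1, 500)  # miss, then the cached entry is marked dirty
ftl.translate(2)    # miss: evicts the clean entry for page 0
ftl.translate(3)    # miss: evicts the dirty entry for page 1 (write-back)
print(ftl.hits, ftl.misses, ftl.writebacks, table[1])
```

In this toy model, a workload with good locality keeps its working set of mappings in the cache, so most translations avoid the extra flash access that a miss or a dirty eviction costs. That extra cost on misses is one plausible reading of the "additional overheads" the abstract mentions for the read-dominant benchmark.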
