-
2015-02-17
12/824,959
2010-06-28
US 8,959,284 B1
2015-02-17
-
-
Sanjiv Shah | Glenn Gossage
2032-08-14
Smart Summary: A disk drive includes a special memory called a write cache and a head that moves over a spinning disk. It receives multiple commands to save data and checks how busy the disk is. If the disk is not too busy, most of the data is saved directly onto it. However, if the disk is busy, part of the data goes into the write cache and part goes onto the disk, with more going into the cache if the disk is very busy. This system helps manage where data is stored based on how much work the disk has to do.
A disk drive is disclosed comprising a non-volatile write cache and a head actuated over a disk. A plurality of write commands are received from a host, wherein each write command comprises write data. A workload for a non-cache area of the disk is determined, and when the workload for the non-cache area of the disk is less than a threshold independent of a workload for the write cache, substantially all of the write data is stored in the non-cache area of the disk. When the workload for the non-cache area of the disk is greater than the threshold independent of the workload for the write cache, a first percentage of the write data is stored in the non-volatile write cache and a second percentage of the write data is stored in the non-cache area of the disk, wherein the first percentage is proportional to the workload for the non-cache area of the disk.
G06F3/06 IPC
Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
G06F12/00 IPC
Accessing, addressing or allocating within memory systems or architectures
G06F13/00 IPC
Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
Disk drives comprise a disk and a head connected to a distal end of an actuator arm which is rotated about a pivot by a voice coil motor (VCM) to position the head radially over the disk. The disk comprises a plurality of radially spaced, concentric tracks for recording user data sectors and servo sectors. The servo sectors comprise head positioning information (e.g., a track address) which is read by the head and processed by a servo control system to control the velocity of the actuator arm as it seeks from track to track.
The data sectors are accessed indirectly using logical block addresses (LBAs) mapped to physical block addresses (PBAs) representing the physical location of each data sector. This indirect accessing facilitates mapping out defective data sectors during manufacturing as well as while the disk drive is deployed in the field. Access commands (read/write) received from the host include LBAs which the disk drive maps to corresponding PBAs using any suitable mapping technique.
The LBA to PBA mapping may also facilitate a log structured file system wherein at least part of the disk is written as a circular buffer. For example, the circular buffer may be written from an outer diameter track toward an inner diameter track, and then circle back to the outer diameter track. Data is written to the head of the circular buffer such that the LBAs of new write commands are mapped to the PBAs of the corresponding data sectors. When the same LBA is written by the host, the data is written to a new PBA at the head of the circular buffer and the old PBA is marked invalid so that it may be overwritten. During a garbage collection operation, valid PBAs previously written in the circular buffer may be relocated to the head of the circular buffer so that the old PBAs may be overwritten.
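The log-structured mapping described above can be illustrated with a minimal sketch. The class name `CircularLog`, its fields, and the one-sector-per-write granularity are assumptions made for illustration only; the disclosure does not prescribe any particular data structure.

```python
class CircularLog:
    """Toy log-structured mapping: every write lands at the head of a circular buffer."""

    def __init__(self, num_pbas):
        self.num_pbas = num_pbas
        self.head = 0                          # next PBA to be written
        self.lba_to_pba = {}                   # current LBA -> PBA mapping
        self.pba_to_lba = [None] * num_pbas    # None marks a free or invalid PBA

    def write(self, lba):
        """Map lba to the PBA at the head and invalidate any older copy."""
        pba = self.head
        if self.pba_to_lba[pba] is not None:
            # The head has wrapped around onto valid data.
            raise RuntimeError("garbage collection needed before writing")
        old = self.lba_to_pba.get(lba)
        if old is not None:
            self.pba_to_lba[old] = None        # old PBA may now be overwritten
        self.lba_to_pba[lba] = pba
        self.pba_to_lba[pba] = lba
        self.head = (self.head + 1) % self.num_pbas
        return pba

    def relocate(self, pba):
        """Garbage collection step: move a still-valid PBA to the head."""
        lba = self.pba_to_lba[pba]
        if lba is not None:
            self.write(lba)


log = CircularLog(num_pbas=4)
log.write(10)   # LBA 10 -> PBA 0
log.write(11)   # LBA 11 -> PBA 1
log.write(10)   # LBA 10 rewritten -> PBA 2, PBA 0 marked invalid
```

In this sketch an overwrite never touches the old physical location; it only marks it invalid, which is what allows the buffer to be written strictly in order.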
FIG. 1A shows a disk drive according to an embodiment of the present invention comprising a head actuated over a disk, and control circuitry including a non-volatile semiconductor memory for implementing a write cache.
FIG. 1B is a flow diagram executed by the control circuitry for steering write data to the non-volatile write cache based on a workload of the disk drive according to an embodiment of the present invention.
FIG. 2 shows an embodiment of the present invention wherein the non-volatile write cache is implemented as a plurality of tracks on the disk.
FIG. 3 shows an embodiment of the present invention wherein the percentage of write data written to the non-volatile write cache is based on a linear function of the workload.
FIGS. 4A-4F show flow diagrams according to embodiments of the present invention for determining the workload of the disk drive.
FIG. 5 shows an embodiment of the present invention wherein the percentage of write data written to the non-volatile write cache is based on the workload of the disk drive and a percentage of free space in the non-volatile write cache.
FIG. 6 shows an embodiment of the present invention wherein the percentage of write data written to the non-volatile write cache is based on the workload of the disk drive and a percentage of life remaining of the non-volatile write cache.
FIG. 7 is a flow diagram according to an embodiment of the present invention wherein during an idle mode the write data is flushed from the non-volatile write cache to the disk.
FIG. 8A shows an embodiment of the present invention wherein during high workloads the write commands steered to the non-volatile write cache are selected based on a rotational position optimization (RPO) algorithm.
FIG. 8B shows an embodiment of the present invention wherein during high workloads write commands that are nearing a time-out limit waiting to be written to the non-cache area of the disk may be steered to the non-volatile write cache.
FIG. 1A shows a disk drive according to an embodiment of the present invention comprising a head 2 actuated over a disk 4 and control circuitry 6 for executing the flow diagram of FIG. 1B. A plurality of write commands are received from a host (step 8), wherein each write command comprises write data. A workload for the disk drive is determined (step 10), and based on the workload a first percentage of the write data is stored in a non-volatile write cache (step 12) and a second percentage of the write data is stored in a non-cache area of the disk (step 14).
In the embodiment of FIG. 1A, the non-volatile write cache is implemented within a non-volatile semiconductor memory (NVSM) 16, such as a flash memory. Since the NVSM 16 does not suffer from mechanical access latency as with the disk 4, the write data received from the host can typically be stored in the NVSM 16 much faster than it can be stored to the disk 4, particularly when the write commands are non-sequential. Accordingly, when the workload is high the NVSM 16 is used to cache write data received from the host so that the host write commands are not blocked while waiting to access the disk 4. When the workload decreases, the write commands can be redirected back to the disk 4, and when the disk drive is idle, the write data stored in the non-volatile write cache of the NVSM 16 may be flushed to the disk 4.
In one embodiment, the entire capacity of the NVSM 16 is allocated to the non-volatile write cache, and in another embodiment, only part of the NVSM 16 is allocated to the non-volatile write cache. In an embodiment disclosed in greater detail below, part of the NVSM 16 may be allocated for storing a certain percentage of logical block addresses (LBAs) assigned to the disk drive, and in one embodiment the number of LBAs allocated to the NVSM 16 may change over time based on a migration policy. Also in an embodiment described in greater detail below, the write data stored in the non-volatile write cache of the NVSM 16 is eventually flushed to the disk 4 during an idle mode of the disk drive. In one embodiment, the flushed write data may remain in the NVSM 16 so that it may be accessed from the NVSM 16 and/or the disk 4 during read operations until the LBAs of the disk 4 are overwritten (thereby invalidating the LBAs stored in the non-volatile write cache of the NVSM 16). In another embodiment, the LBAs of the flushed write data in the NVSM 16 are erased to free the memory in the NVSM 16 for other use.
The disk 4 shown in the embodiment of FIG. 1A comprises a plurality of data tracks 18 defined by servo sectors 20_0-20_N, wherein each data track comprises a plurality of data sectors accessed indirectly through an LBA. The control circuitry 6 processes a read signal 22 emanating from the head 2 to demodulate the servo sectors 20_0-20_N into a position error signal (PES) representing a position error of the head relative to a target data track. The control circuitry 6 comprises a servo compensator for filtering the PES to generate a control signal 24 applied to a voice coil motor (VCM) 26 that rotates an actuator arm 28 about a pivot in order to actuate the head 2 radially over the disk 4 in a direction that reduces the PES.
FIG. 2 shows an embodiment of the present invention wherein the non-volatile write cache may comprise a plurality of data tracks on the disk 4. In the example of FIG. 2, the plurality of data tracks for the write cache are located near an outer diameter of the disk where the data rate is fastest and therefore the throughput of the disk is fastest. In addition, in one embodiment the write cache on the disk 4 is implemented as a circular buffer so that the head 2 need not perform seeks in order to service multiple write commands from the host, thereby avoiding the mechanical latency that would otherwise limit the throughput of the write commands. Similar to the embodiment of FIG. 1A described above, during an idle mode of the disk drive the write data stored in the disk write cache may be flushed to a non-cache area of the disk allocated for "permanent" storage of the LBAs. After flushing the write data from the disk write cache, the corresponding LBAs of the write cache are marked as invalid so they may be overwritten, but may still be accessed during subsequent read commands until they are overwritten.
FIG. 3 shows an example relationship of a percentage of write data steered to the non-volatile write cache as a linear function of the workload. In this embodiment, when the workload is below a threshold, all of the write data of the write commands received from the host is written to the disk (bypassing the non-volatile write cache). As the workload increases beyond a threshold, a percentage of the write data is steered to the non-volatile write cache until eventually all of the write data is steered to the non-volatile write cache. Although the example of FIG. 3 shows a linear relationship between the amount of write data steered to the non-volatile write cache and the workload, any suitable function may be employed, such as a suitable quadratic or exponential function.
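A linear steering function of the kind shown in FIG. 3 might be sketched as follows. The function name `cache_fraction` and the `saturation` parameter (the workload at which all write data is steered to the write cache) are hypothetical names chosen for this illustration; the workload units are likewise an assumption.

```python
def cache_fraction(workload, threshold, saturation):
    """Fraction (0.0 to 1.0) of write data steered to the write cache.

    Below `threshold` everything goes to the disk; at or above `saturation`
    everything goes to the write cache; in between the fraction rises linearly.
    """
    if saturation <= threshold:
        raise ValueError("saturation must be greater than threshold")
    if workload <= threshold:
        return 0.0
    if workload >= saturation:
        return 1.0
    return (workload - threshold) / (saturation - threshold)


# Example: workload expressed on an arbitrary 0..1 scale (an assumption).
low = cache_fraction(0.2, threshold=0.4, saturation=0.9)    # below threshold
mid = cache_fraction(0.65, threshold=0.4, saturation=0.9)   # halfway up the ramp
high = cache_fraction(1.0, threshold=0.4, saturation=0.9)   # saturated
```

A quadratic or exponential ramp could be substituted for the final `return` expression without changing the surrounding clamping logic.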
In one embodiment, a migration algorithm may be employed to migrate LBAs of write commands to either the NVSM 16 of FIG. 1A or to the disk 4. For example, if a write/read frequency of an LBA exceeds a threshold, the LBA may be assigned to the disk 4 to mitigate write amplification of the NVSM 16. In other embodiments, the migration policy may assign randomly accessed LBAs to the NVSM 16 to avoid the mechanical latency of the disk 4 whereas sequentially accessed LBAs may be assigned to the disk 4. Accordingly, in one embodiment the percentage of write data steered to the non-volatile write cache (e.g., FIG. 3) corresponds only to the write data that would have otherwise been written to the disk 4. That is, the LBAs of the write data already assigned to the NVSM 16 may not impact the steering algorithm described above. In another embodiment, the amount of write data assigned to the NVSM 16 may impact the steering algorithm if the NVSM 16 is not able to sustain throughput of all the write data (to both the write cache area and non-cache area of the NVSM 16).
In one embodiment of the present invention, a rotational position optimization (RPO) algorithm may be employed to select access commands for the disk. An RPO algorithm orders the access commands for the disk based on the radial location of the head so that the access commands are executed in an optimal order that minimizes the access latency. In one embodiment, when the workload exceeds the steering threshold (FIG. 3), the write commands steered to the non-volatile write cache are selected based on the RPO algorithm. For example, the write commands at the end of a disk command queue as ordered by the RPO algorithm may be selected as the write commands steered to the non-volatile write cache. This embodiment is illustrated in FIG. 8A which shows a disk command queue storing access commands in an order based on an RPO algorithm. When the workload exceeds the steering threshold, the write commands at the end of the disk command queue (last in the order) are steered to the non-volatile write cache whereas the write commands at the start of the disk command queue (first in order) are steered to the non-cache area of the disk until all of the write commands in the disk command queue have been processed (or until the workload falls below the steering threshold).
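The selection illustrated in FIG. 8A can be sketched as a simple split of an already ordered queue. This sketch assumes the queue has been ordered by some RPO algorithm elsewhere and uses the number of commands as a stand-in for the amount of write data; `split_rpo_queue` and its parameters are hypothetical names.

```python
def split_rpo_queue(rpo_ordered_writes, fraction_to_cache):
    """Split an RPO-ordered list of write commands into (to_disk, to_cache).

    Commands at the end of the order (the most expensive for the disk to
    reach) are steered to the write cache; the rest stay with the disk.
    """
    if not 0.0 <= fraction_to_cache <= 1.0:
        raise ValueError("fraction_to_cache must be between 0 and 1")
    n_cache = int(len(rpo_ordered_writes) * fraction_to_cache)
    split = len(rpo_ordered_writes) - n_cache
    return rpo_ordered_writes[:split], rpo_ordered_writes[split:]


to_disk, to_cache = split_rpo_queue(["w1", "w2", "w3", "w4"], 0.5)
# to_disk holds the first two commands in RPO order, to_cache the last two
```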
In one embodiment, the commands in the disk command queue have a time limit for execution. For example, a communication protocol with a host may specify a time limit for each command before the disk drive should return an error. Commands nearing the time-out limit are typically given higher priority in the RPO algorithm (or bypass the RPO algorithm altogether). FIG. 8B shows an embodiment of the present invention wherein a write command 62 nearing its time-out limit waiting to be written to the non-cache area of the disk based on the RPO algorithm is instead written to the non-volatile write cache. That is, the write data steered to the non-volatile write cache during high workloads may be selected in response to a time-out limit of the write commands stored in the disk command queue.
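The time-out based selection of FIG. 8B might look like the following sketch. The `(command_id, deadline)` tuple layout and the `margin` parameter, which defines how close to its deadline a command must be before it is redirected, are assumptions introduced for illustration.

```python
def steer_by_timeout(queued_writes, now, margin):
    """Return (to_disk, to_cache) lists of command ids.

    queued_writes: iterable of (command_id, deadline) pairs, where deadline
    is the absolute time at which the command would time out.
    A command whose remaining time is at most `margin` goes to the write cache.
    """
    to_disk, to_cache = [], []
    for command_id, deadline in queued_writes:
        if deadline - now <= margin:
            to_cache.append(command_id)
        else:
            to_disk.append(command_id)
    return to_disk, to_cache


disk_cmds, cache_cmds = steer_by_timeout(
    [("w1", 100.0), ("w2", 12.0)], now=10.0, margin=5.0
)
# "w2" has only 2 time units left, so it is redirected to the write cache
```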
The workload of the disk drive may be determined in any suitable manner in the embodiments of the present invention. FIG. 4A is a flow diagram according to an embodiment of the present invention wherein after receiving a number of write commands from the host (step 30), the workload may be determined based on the number of write commands pending (step 32). For example, as the number of write commands pending increases, there may be a corresponding increase in the determined workload.
FIG. 4B is a flow diagram according to an embodiment of the present invention wherein as write commands are received from the host (step 34), the workload may be determined based on the frequency of write commands (step 36). For example, as a frequency of the write commands increases, there may be a corresponding increase in the determined workload.
FIG. 4C is a flow diagram according to an embodiment of the present invention wherein after receiving a number of write commands from the host (step 38), the workload may be determined based on the amount of write data in the write commands (step 40). For example, as the amount of write data in the pending write commands increases, there may be a corresponding increase in the determined workload.
FIG. 4D is a flow diagram according to an embodiment of the present invention wherein after receiving a number of write commands from the host (step 42), the workload may be determined based on the RPO algorithm described above. For example, as the RPO algorithm computes a longer time to execute write commands in the command queue, there may be a corresponding increase in the determined workload.
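The workload measures of FIGS. 4A-4D can be sketched as simple functions over the pending write commands. Representing each command as a `(num_sectors, arrival_time)` tuple, measuring frequency over a sliding time window, and summing per-command service-time estimates are all assumptions made for this illustration.

```python
def workload_by_count(pending):
    """FIG. 4A style: number of write commands pending."""
    return len(pending)


def workload_by_frequency(pending, window, now):
    """FIG. 4B style: commands received per unit time over the last `window`."""
    recent = [c for c in pending if now - window < c[1] <= now]
    return len(recent) / window


def workload_by_data(pending):
    """FIG. 4C style: total amount of write data (in sectors) pending."""
    return sum(c[0] for c in pending)


def workload_by_service_time(estimated_times):
    """FIG. 4D style: total time an RPO algorithm estimates for the queue.

    `estimated_times` is a hypothetical list of per-command estimates that
    would be supplied by the RPO algorithm itself.
    """
    return sum(estimated_times)


# Each tuple is (num_sectors, arrival_time_in_seconds).
cmds = [(8, 0.0), (16, 0.9), (8, 1.0)]
count = workload_by_count(cmds)                              # 3 commands
rate = workload_by_frequency(cmds, window=0.5, now=1.0)      # commands per second
sectors = workload_by_data(cmds)                             # 32 sectors
```

Any of these values could be fed, after suitable scaling, into a steering function as the workload input.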
FIG. 4E is a flow diagram according to an embodiment of the present invention wherein a plurality of access patterns are maintained (step 46), wherein the workload may be determined based on a detected access pattern (step 48). For example, the control circuitry may identify access patterns of the host that correspond to a high workload. When the control circuitry determines that a high workload access pattern is likely being executed by the host, the control circuitry adjusts the workload parameter for steering the write data to the non-volatile write cache.
FIG. 4F is a flow diagram according to another embodiment of the present invention wherein a command load message is received from the host (step 50) and the workload is adjusted accordingly (step 52). For example, the host may be aware of when a high workload access pattern is about to be executed and send a corresponding command load message to the disk drive in order to configure the workload parameter for steering the write data to the non-volatile write cache.
FIG. 5 shows an embodiment of the present invention wherein the percentage of write data steered to the non-volatile write cache is based on the workload and a percentage of free space in the non-volatile write cache. As write data is stored in the non-volatile write cache leaving less free space, the amount of write data steered to the write cache is reduced so that the performance of the disk drive does not change suddenly when the write cache is full. In one embodiment, a quadratic or exponential function of the free space may be employed since the amount of free space is more significant at smaller percentages (i.e., as the write data approaches the capacity of the write cache).
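One way to picture the free-space adjustment of FIG. 5 is to scale the workload-based fraction by a factor that stays near 1 while the cache is mostly empty and falls quickly as it fills. The specific form `1 - (1 - free)^exponent` and the names below are assumptions; the disclosure only states that a quadratic or exponential function may be used.

```python
def free_space_scaled_fraction(base_fraction, free_fraction, exponent=2.0):
    """Scale the workload-based steering fraction by remaining free space.

    base_fraction: fraction chosen from the workload alone (0.0 to 1.0).
    free_fraction: free space in the write cache as a fraction (0.0 to 1.0).
    The scale factor is 1.0 for an empty cache and 0.0 for a full one, and
    it changes fastest when little free space remains.
    """
    free_fraction = min(1.0, max(0.0, free_fraction))
    scale = 1.0 - (1.0 - free_fraction) ** exponent
    return base_fraction * scale


empty_cache = free_space_scaled_fraction(0.8, 1.0)   # no reduction
half_full = free_space_scaled_fraction(0.8, 0.5)     # reduced by a quarter
full_cache = free_space_scaled_fraction(0.8, 0.0)    # nothing steered to cache
```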
FIG. 6 shows an embodiment of the present invention wherein the percentage of write data steered to the non-volatile write cache is based on the workload and a percentage of life remaining in a NVSM 16 (FIG. 1A) that stores the write cache. The number of program/erase cycles is typically limited for the NVSM 16, and therefore in one embodiment the life remaining for the NVSM 16 is based on the number of program/erase cycles remaining. As the NVSM 16 approaches end of life (maximum number of program/erase cycles), the amount of data steered to the write cache is reduced so that the performance of the disk drive does not change suddenly when the NVSM 16 reaches end of life. In one embodiment, a quadratic or exponential function of the life percentage may be employed so that the performance of the disk drive does not change significantly until the NVSM 16 nears end of life.
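The life-remaining adjustment of FIG. 6 can be sketched with an exponential derating factor. The normalized form used here, the steepness constant `k`, and the function names are assumptions for illustration; only the use of program/erase cycles as the measure of remaining life comes from the description above.

```python
import math


def life_remaining(pe_cycles_used, pe_cycles_max):
    """Fraction of NVSM life remaining, based on program/erase cycles."""
    return max(0.0, 1.0 - pe_cycles_used / pe_cycles_max)


def life_scaled_fraction(base_fraction, life_fraction, k=5.0):
    """Derate the steering fraction as the NVSM approaches end of life.

    The factor (1 - exp(-k * life)) / (1 - exp(-k)) is 1.0 at full life and
    0.0 at end of life, and stays close to 1.0 until life is nearly exhausted.
    """
    life_fraction = min(1.0, max(0.0, life_fraction))
    factor = (1.0 - math.exp(-k * life_fraction)) / (1.0 - math.exp(-k))
    return base_fraction * factor


life = life_remaining(pe_cycles_used=750, pe_cycles_max=3000)
fraction = life_scaled_fraction(0.8, life)
```

A larger `k` keeps performance flat for longer and concentrates the reduction closer to end of life.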
FIG. 7 is a flow diagram according to an embodiment of the present invention wherein as write commands are received from the host (step 54), they are executed based on the workload by storing the write data either in the non-volatile write cache or on the disk (step 56). When the disk drive enters an idle mode (step 58), the write data stored in the non-volatile write cache is flushed to a non-cache area of the disk (step 60). In one embodiment, the write data may remain in the non-volatile write cache until it becomes invalid due to an overwrite operation. In this manner, the write data can be accessed from the non-volatile write cache and/or the non-cache area of the disk during subsequent read operations. In another embodiment, after flushing the write data from the non-volatile write cache the corresponding portion of the write cache is invalidated (and in one embodiment erased) so that it may be re-used.
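The idle-mode flush of FIG. 7, including the two alternatives for what happens to the cached copy afterward, might be sketched as follows. Modeling the write cache and the disk as dictionaries keyed by LBA, and tracking a per-entry dirty flag, are assumptions made for this illustration.

```python
def flush_on_idle(cache, disk, keep_copies=True):
    """Flush unflushed write-cache entries to the disk; return how many were written.

    cache: dict mapping lba -> (data, dirty), where dirty means not yet on disk.
    disk:  dict mapping lba -> data for the non-cache area.
    keep_copies=True leaves flushed data readable from the cache;
    keep_copies=False frees each cache entry after it is flushed.
    """
    flushed = 0
    for lba in list(cache):
        data, dirty = cache[lba]
        if dirty:
            disk[lba] = data
            flushed += 1
        if keep_copies:
            cache[lba] = (data, False)   # still readable from the cache
        else:
            del cache[lba]               # space becomes available for re-use
    return flushed


cache = {5: (b"a", True), 9: (b"b", True)}
disk = {}
first = flush_on_idle(cache, disk)                        # writes both entries
second = flush_on_idle(cache, disk, keep_copies=False)    # nothing dirty; frees the cache
```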
Any suitable control circuitry may be employed to implement the flow diagrams in the embodiments of the present invention, such as any suitable integrated circuit or circuits. For example, the control circuitry may be implemented within a read channel integrated circuit, or in a component separate from the read channel, such as a disk controller, or certain steps described above may be performed by a read channel and others by a disk controller. In one embodiment, the read channel and disk controller are implemented as separate integrated circuits, and in an alternative embodiment they are fabricated into a single integrated circuit or system on a chip (SOC). In addition, the control circuitry may include a suitable preamp circuit implemented as a separate integrated circuit, integrated into the read channel or disk controller circuit, or integrated into a SOC.
In one embodiment, the control circuitry comprises a microprocessor executing instructions, the instructions being operable to cause the microprocessor to perform the steps of the flow diagrams described herein. The instructions may be stored in any computer-readable medium. In one embodiment, they may be stored on a non-volatile semiconductor memory external to the microprocessor, or integrated with the microprocessor in a SOC. In another embodiment, the instructions are stored on the disk and read into a volatile semiconductor memory when the disk drive is powered on. In yet another embodiment, the control circuitry comprises suitable logic circuitry, such as state machine circuitry.
1. A disk drive comprising:
a non-volatile write cache;
a disk;
a head actuated over the disk; and
control circuitry operable to:
receive a plurality of write commands from a host, wherein each write command comprises write data;
determine a workload for a non-cache area of the disk independent of a sequentiality of the write commands;
when the workload for the non-cache area of the disk is less than a threshold independent of a workload for the write cache, store substantially all of the write data in the non-cache area of the disk; and
when the workload for the non-cache area of the disk is greater than the threshold independent of the workload for the write cache, store a first percentage of the write data in the non-volatile write cache and a second percentage of the write data in the non-cache area of the disk, wherein the first percentage is proportional to the workload for the non-cache area of the disk.
2. The disk drive as recited in claim 1, wherein the non-volatile write cache comprises a non-volatile semiconductor memory.
3. The disk drive as recited in claim 1, wherein the non-volatile write cache comprises part of the disk.
4. The disk drive as recited in claim 1, wherein the control circuitry is further operable to:
store the plurality of write commands in a command queue; and
determine the workload for the non-cache area of the disk based on a number of commands stored in the command queue.
5. The disk drive as recited in claim 1, wherein the control circuitry is further operable to determine the workload based on a frequency of the write commands received from the host.
6. The disk drive as recited in claim 1, wherein the control circuitry is further operable to:
store the plurality of write commands in a disk command queue; and
determine the workload based on a rotational position optimization (RPO) algorithm for selecting the write commands from the disk command queue based at least on a radial location of the head.
7. The disk drive as recited in claim 1, wherein the control circuitry is further operable to:
maintain a plurality of access patterns of the write commands received from the host; and
determine the workload based on the access patterns.
8. The disk drive as recited in claim 1, wherein the control circuitry is further operable to:
receive a command load message from the host; and
determine the workload based on the command load message.
9. The disk drive as recited in claim 1, wherein the first and second percentages are further based on an amount of free space in the non-volatile write cache.
10. The disk drive as recited in claim 1, wherein the first and second percentages are further based on a percentage of life remaining for the non-volatile write cache.
11. The disk drive as recited in claim 1, wherein the control circuitry is further operable to flush the write data from the non-volatile write cache to the non-cache area of the disk during an idle mode.
12. The disk drive as recited in claim 1, wherein the control circuitry is further operable to:
execute the write commands based on a rotational position optimization (RPO) algorithm for the disk; and
select the write data to be stored in the non-volatile write cache in response to the RPO algorithm.
13. The disk drive as recited in claim 1, wherein the control circuitry is further operable to select the write data to be stored in the non-volatile write cache in response to a time-out limit assigned to each write command.
14. A method of operating a disk drive, the disk drive comprising a head actuated over a disk and a non-volatile write cache, the method comprising:
receiving a plurality of write commands from a host, wherein each write command comprises write data;
determining a workload for a non-cache area of the disk independent of a sequentiality of the write commands;
when the workload for the non-cache area of the disk is less than a threshold independent of a workload for the write cache, storing substantially all of the write data in the non-cache area of the disk; and
when the workload for the non-cache area of the disk is greater than the threshold independent of the workload for the write cache, storing a first percentage of the write data in the non-volatile write cache and a second percentage of the write data in the non-cache area of the disk, wherein the first percentage is proportional to the workload for the non-cache area of the disk.
15. The method as recited in claim 14, wherein the non-volatile write cache comprises a non-volatile semiconductor memory.
16. The method as recited in claim 14, wherein the non-volatile write cache comprises part of the disk.
17. The method as recited in claim 14, further comprising:
storing the plurality of write commands in a command queue; and
determining the workload for the non-cache area of the disk based on a number of commands stored in the command queue.
18. The method as recited in claim 14, further comprising determining the workload based on a frequency of the write commands received from the host.
19. The method as recited in claim 14, further comprising:
storing the plurality of write commands in a disk command queue; and
determining the workload based on a rotational position optimization (RPO) algorithm for selecting the write commands from the disk command queue based at least on a radial location of the head.
20. The method as recited in claim 14, further comprising:
maintaining a plurality of access patterns of the write commands received from the host; and
determining the workload based on the access patterns.
21. The method as recited in claim 14, further comprising:
receiving a command load message from the host; and
determining the workload based on the command load message.
22. The method as recited in claim 14, wherein the first and second percentages are further based on an amount of free space in the non-volatile write cache.
23. The method as recited in claim 14, wherein the first and second percentages are further based on a percentage of life remaining for the non-volatile write cache.
24. The method as recited in claim 14, further comprising flushing the write data from the non-volatile write cache to the non-cache area of the disk during an idle mode.
25. The method as recited in claim 14, further comprising:
executing the write commands based on a rotational position optimization (RPO) algorithm for the disk; and
selecting the write data to be stored in the non-volatile write cache in response to the RPO algorithm.
26. The method as recited in claim 14, further comprising selecting the write data to be stored in the non-volatile write cache in response to a time-out limit assigned to each write command.