Patent application title:

Automatic read and write acceleration of data accessed by virtual machines

Publication number:

Publication date:

2017-07-04

Application number:

13/831,677

Filed date:

2013-03-15

✅ Patent granted

Patent number:

US 9,699,263 B1

Grant date:

2017-07-04

PCT filing:

PCT publication:

Examiner:

Joseph E Avellino | Patrick Ngankam

Agent:

Morgan, Lewis & Bockius LLP

Adjusted expiration:

2034-04-12

Smart Summary: Data access for virtual machines can be made faster by using a special storage area called a persistent cache. When multiple clients request data, the system keeps track of which data is accessed most often. It then automatically decides which data should be stored in the cache to speed up future access. This helps improve the performance of many virtual machines running on a server without needing manual management. Overall, the goal is to make data access quicker and more efficient for users.

Abstract:

The various implementations described herein include methods and systems for automatic management of data access acceleration in a computer system executing a plurality of clients. The method includes: receiving data access commands from two or more clients to access data in objects identified by the data access commands; and processing the data access commands to update access history information for portions of the objects identified by the data access commands. The method further includes: in accordance with the access history information, automatically identifying and marking for acceleration the portions of the objects identified by the data access commands that satisfy an access based data acceleration policy; and accelerating the object portions marked for acceleration, including accelerating data writes and data reads of the object portions to and from the persistent cache, where the persistent cache is shared by the two or more clients.

Inventors:

Serge Shats 10 🇺🇸 Palo Alto, CA, United States
Alexei JELVIS 3 🇺🇸 Menlo Park, CA, United States
Muthukumar Ratty 5 🇺🇸 Sunnyvale, CA, United States
Eugene Vignanker 1 🇺🇸 Redwood Shores, CA, United States

Assignee:

SanDisk Technologies LLC 1,343 🇺🇸 Plano, TX, United States

Applicant:

SanDisk Enterprise IP LLC 🇺🇸 Milpitas, CA, United States

Classification:

G06F12/08 » CPC further

Accessing, addressing or allocating within memory systems or architectures; Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems

G06F12/0804 » CPC further

Accessing, addressing or allocating within memory systems or architectures; Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems; Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating

G06F15/167 IPC

Digital computers in general ; Data processing equipment in general; Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs; Interprocessor communication using a common memory, e.g. mailbox

Description

RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 61/684,646, filed Aug. 17, 2012, which is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

The disclosed embodiments relate generally to accelerating access to data read and written by a set of virtual machines through the selective use of a persistent cache.

BACKGROUND

Server virtualization is the masking of physical server resources (e.g., processors, memory, etc.) from users of the server. Such server resources include the number and identifications of individual physical servers, processors and operating systems. Server virtualization more efficiently utilizes server resources, improves server availability and assists in testing and development.

Virtualized enterprise data centers deploy tens of thousands of virtual machines over hundreds of physical servers. Efficient hardware utilization is a key goal in improving virtual machine systems. One of the main factors contributing to efficient hardware utilization is virtual machine density. The more virtual machines that can be run on a physical server, the more efficient the hardware utilization.

Access to server storage resources is often a system bottleneck that prevents full utilization of available hardware resources. The embodiments described below are configured to improve access to server storage resources.

SUMMARY

A server system, executing a plurality of virtual machines, accelerates frequently accessed data in a persistent cache. The persistent cache is shared by the plurality of virtual machines, and the persistent cache is typically much smaller in capacity than secondary storage; thus, only a small subset of secondary storage is accelerated. Determining which data to accelerate (e.g., which portions of the virtual disks used by the virtual machines in the server system to accelerate), however, is challenging and non-automated processes are non-workable. For example, the management of acceleration by IT personnel or a system administrator would be difficult, if not impossible, in systems having hundreds or thousands of virtual machines. Various embodiments of the systems and methods described here are therefore designed for automatic management of data access acceleration in a server system that executes a plurality of virtual machines.

The server system retains and updates access history information for portions (e.g., blocks or sub-blocks) of the objects (e.g., virtual disks) associated with data access commands received from the virtual machines. In accordance with the access history information and an access based acceleration policy, the server system determines which portions of the objects to accelerate. The shared persistent cache significantly improves input-output (I/O) performance of a respective server system executing a plurality of virtual machines, leading to an increase in the number of virtual machines that can run on each server system in a virtualized data center.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a distributed system, in accordance with some embodiments.

FIG. 2 is a block diagram of a server system included in FIG. 1, in accordance with some embodiments.

FIGS. 3A-3B illustrate data structures utilized by the server system included in FIG. 2, in accordance with some embodiments.

FIG. 4 is a schematic diagram of a persistent cache, in accordance with some embodiments.

FIGS. 5A-5B illustrate a flow diagram of a process for accelerating data read operations, in accordance with some embodiments.

FIG. 6 illustrates a flow diagram of a process for accelerating data write operations, in accordance with some embodiments.

FIGS. 7A-7B illustrate a flow diagram of a process for accelerating data access, in accordance with some embodiments.

Like reference numerals refer to corresponding parts throughout the drawings.

DESCRIPTION OF EMBODIMENTS

This detailed description covers methods and systems for automatic management of disk acceleration in a distributed system having a plurality of virtual machines running on each of a set of physical servers. Other related concepts will also be covered in this detailed description.

In some embodiments, a method for accelerating data access is performed by a computer system having one or more processors, memory and a persistent cache for storing accelerated data. The method includes: receiving data access commands from two or more clients to access data in objects identified by the data access commands; and processing the data access commands to update access history information for portions of the objects identified by the data access commands from the two or more clients. The method further includes: in accordance with the access history information, automatically identifying and marking for acceleration portions of the objects identified by the data access commands that satisfy an access based data acceleration policy, where the automatically identifying and marking are performed collectively for the two or more clients; and accelerating the object portions marked for acceleration, by accelerating data access, including either or both accelerating data writes and data reads of the object portions to and from the persistent cache, where the persistent cache is shared by the two or more clients, and the access history information is based on the data access commands of the two or more clients.

In some embodiments, a method for accelerating data read operations is performed by a computer system having one or more processors, memory and a persistent cache for storing accelerated data. The method includes receiving data read commands from two or more clients to read data from objects identified by the data read commands and processing the data read commands to update usage history information for portions of the objects identified by the data read commands. The method further includes determining whether a respective portion of the objects identified by a data read command from a respective client of the two or more clients is stored in the persistent cache, where the persistent cache is shared by the two or more clients. In accordance with a determination that the respective portion of the objects identified by the data read command from the respective client is stored in the persistent cache, the method includes returning the respective portion of the objects from the persistent cache to the respective client of the two or more clients. In accordance with a determination that the respective portion of the objects identified by the data read command from the respective client is not stored in the persistent cache, the method includes identifying and marking for acceleration the respective portion of the objects identified by the data read command from the respective client if the respective portion of the objects satisfies an access based data acceleration policy in accordance with the usage history information. In accordance with a determination that the respective portion of the objects is not marked for acceleration, the method includes processing the data read command from the respective client, by reading from the secondary storage the respective portion of the objects, and returning the respective portion of the objects read from the secondary storage to the respective client of the two or more clients. 
In accordance with a determination that the respective portion of the objects is marked for acceleration, the method includes processing the data read command from the respective client by reading from the secondary storage the respective portion of the objects, writing the respective portion of the objects to the persistent cache, and returning the respective portion of the objects to the respective client of the two or more clients.
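
The read flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `ReadPath`, the dictionaries standing in for secondary storage and the persistent cache, and the "accelerate on the second read" threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ReadPath:
    """Hypothetical model of the read flow; all names are illustrative."""
    secondary: dict                                   # stands in for secondary storage
    cache: dict = field(default_factory=dict)         # shared persistent cache
    read_counts: dict = field(default_factory=dict)   # usage history per portion
    threshold: int = 2                                # assumed policy: accelerate on 2nd read

    def read(self, client_id: str, portion: tuple) -> bytes:
        # Usage history is updated for every read, whichever client issued it.
        # client_id is accepted only to show that all clients share one history.
        self.read_counts[portion] = self.read_counts.get(portion, 0) + 1

        # Cache hit: return the portion from the shared persistent cache.
        if portion in self.cache:
            return self.cache[portion]

        # Cache miss: the data is read from secondary storage.
        data = self.secondary[portion]

        # If the (assumed) policy is satisfied, also write it to the cache.
        if self.read_counts[portion] >= self.threshold:
            self.cache[portion] = data
        return data
```

In this sketch a portion read once by one virtual machine and once by another becomes cached, because the history is kept collectively rather than per client.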

In some embodiments, a method for accelerating data write operations is performed by a computer system having one or more processors, memory and a persistent cache for storing accelerated data. The method includes: receiving data write commands from two or more clients to write data to objects identified by the data write commands, where the persistent cache is shared by the two or more clients; and processing the data write commands to update usage history information for portions of the objects identified by the one or more data write commands. The method further includes automatically identifying and marking for acceleration a respective portion of the objects if the respective portion of the objects satisfies an access based data acceleration policy in accordance with the access history information, where the automatically identifying and marking are performed collectively for the two or more clients. In accordance with a determination that the respective portion of the objects is marked for acceleration, the method includes writing the respective portion of the objects to the persistent cache and subsequently or concurrently writing the respective portion of the objects to the secondary storage. In accordance with a determination that the respective portion of the objects is not marked for acceleration, the method includes writing the respective portion of the objects to the secondary storage.
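
The write flow can be sketched in the same spirit. The example below illustrates the "subsequently" variant, in which an accelerated write lands in the cache first and reaches secondary storage on a later flush. The class `WritePath`, the write-count threshold, and the `flush` method are assumptions introduced for illustration only.

```python
from collections import deque


class WritePath:
    """Hypothetical model of the write flow; all names are illustrative."""

    def __init__(self, threshold=3):
        self.secondary = {}         # stands in for secondary storage
        self.cache = {}             # shared persistent cache
        self.write_counts = {}      # usage history per portion, all clients combined
        self.accelerated = set()    # portions marked for acceleration
        self.dirty = deque()        # cached writes not yet in secondary storage
        self.threshold = threshold  # assumed policy: mark after this many writes

    def write(self, client_id, portion, data):
        # Update usage history, then mark the portion if the policy is met.
        self.write_counts[portion] = self.write_counts.get(portion, 0) + 1
        if self.write_counts[portion] >= self.threshold:
            self.accelerated.add(portion)

        if portion in self.accelerated:
            # Accelerated: write to the cache now, secondary storage later.
            self.cache[portion] = data
            self.dirty.append(portion)
        else:
            # Not accelerated: write directly to secondary storage.
            self.secondary[portion] = data

    def flush(self):
        # Copy pending cached writes to secondary storage in arrival order.
        while self.dirty:
            portion = self.dirty.popleft()
            self.secondary[portion] = self.cache[portion]
```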

In another aspect, a computer system includes one or more processors, a persistent cache for storing accelerated data, and memory storing one or more programs for execution by the one or more processors, wherein the one or more programs include instructions that when executed by the one or more processors cause the server system to perform any of the aforementioned methods.

In yet another aspect, a non-transitory computer readable medium stores one or more programs that when executed by one or more processors of a computer system cause the computer system to perform any of the aforementioned methods.

Numerous details are described herein in order to provide a thorough understanding of the example implementations illustrated in the accompanying drawings. However, some embodiments may be practiced without many of the specific details, and the scope of the claims is only limited by those features and aspects specifically recited in the claims. Furthermore, well-known methods, components, and circuits have not been described in exhaustive detail so as not to unnecessarily obscure more pertinent aspects of the implementations described herein.

FIG. 1 is a block diagram of a distributed system 100 including a secondary storage system 130 (sometimes called a storage system) connected to a plurality of computer systems 110 (e.g., server systems 110a through 110m) through a communication network 120 such as the Internet, other wide area networks, local area networks, metropolitan area networks, wireless networks, or any combination of such networks. In some embodiments, a respective server system 110 executes a plurality of virtual machines 112 and includes a respective persistent cache 118 shared by the plurality of virtual machines executed on the respective server system 110. In some embodiments, the persistent cache 118 comprises non-volatile solid state storage, such as flash memory. In some other examples, persistent cache 118 comprises EPROM, EEPROM, battery backed SRAM, battery backed DRAM, supercapacitor backed DRAM, ferroelectric RAM, magnetoresistive RAM, or phase-change RAM.

In some implementations, each of the plurality of the virtual machines 112 is a client 114. Each client 114 executes one or more client applications 116 (e.g., a financial application or web hosting application) that submit data access commands (e.g., data read and write commands) to the respective server system 110. The data access commands access data in objects, such as virtual disks, some portions of which may be stored in RAM by the server, while other portions are stored in storage system 130. The respective server system 110, in turn, sends corresponding data access commands to storage system 130 so as to obtain or store data in accordance with the data access commands.

In some embodiments, secondary storage system 130 includes a front-end system 140, which obtains and processes data access commands from server systems 110 and returns results to the server systems 110. Secondary storage system 130 further includes one or more secondary storage subsystems 150 (e.g., storage subsystems 150a-150n). In some embodiments, a respective storage subsystem stores the data for one or more objects (e.g., one or more virtual disks) accessible to clients on a respective server system 110. Each of the one or more objects comprises a plurality of portions. For example, a respective portion of the plurality of portions of an object is a block (also herein called an address block, since the block corresponds to a block of addresses), and the block comprises a plurality of sub-blocks (e.g., a respective sub-block is a page within a block). In another example, a portion of the object is a sub-block (also herein called an address sub-block, since the sub-block corresponds to an address or sub-block of addresses).
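
As a rough illustration of the block and sub-block terminology, the sketch below maps a byte offset within an object to a block index and a sub-block index. The sizes (64 KiB blocks, 4 KiB sub-blocks) and the function name `locate` are assumptions chosen for the example; the text does not specify them.

```python
BLOCK_SIZE = 64 * 1024      # assumed address-block size in bytes
SUB_BLOCK_SIZE = 4 * 1024   # assumed sub-block (page) size in bytes


def locate(object_id, byte_offset):
    """Return (object_id, block index, sub-block index within that block)."""
    block = byte_offset // BLOCK_SIZE
    sub_block = (byte_offset % BLOCK_SIZE) // SUB_BLOCK_SIZE
    return object_id, block, sub_block
```

With these assumed sizes, each block contains 16 sub-blocks, and `locate("vd1", 70000)` falls in block 1, sub-block 1.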

In some embodiments, a server system 110 allocates a distinct address space to each respective virtual machine 112 executed on the server system 110, and furthermore allocates space within the address space for one or more objects (e.g., one or more virtual disks). In some embodiments, each object accessed by a respective virtual machine 112 is denoted by a corresponding object identifier (Object ID), or in the case of a virtual disk, a virtual disk ID.

FIG. 2 is a block diagram of a server system 110 (sometimes herein called a computer system). Server system 110 includes one or more processors or processing units (CPUs) 210, one or more communication interfaces 230, memory 240, persistent cache 118, and one or more communication buses 220 for interconnecting these components. The communication buses 220 may include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. Memory 240 includes high-speed random access memory and optionally includes non-volatile memory. In some embodiments, memory 240 comprises a non-transitory computer readable storage medium. In some implementations, memory 240 stores the following programs, modules and data structures, or a subset or superset thereof:

- an operating system 242 including procedures for handling various basic system services and performing hardware dependent tasks;
- a plurality of virtual machines 112 (e.g., virtual machines 112a to 112v in FIG. 1);
- a network communication module 244 configured to connect server system 110 to communication network(s) 120 in FIG. 1 via the one or more communication interfaces 230;
- an access history update module 245 configured to update information within access history database 250;
- a memory access decision module 246 configured to determine whether a respective portion of an object (e.g., a block or sub-block of an object identified by a data access command) is present in persistent cache 118;
- an acceleration determination module 248 configured to determine whether to accelerate various portions (e.g., blocks or sub-blocks) of the objects accessed by data access commands received from two or more clients;
- a tier assignment module 249 at least configured to assign (or reassign) respective portions (e.g., blocks) of the objects to tiers within tiered data structure 252;
- an access history database 250 configured to store one or more items of usage history information, including tiered data structure 252 configured to organize portions (e.g., blocks) of the objects into a plurality of tiers, object portion usage metadata 254 and object portion to node map 256 configured to map a respective portion (e.g., a block) of the objects to a corresponding node of a plurality of nodes 320 in access history database 250;
- a cache management driver 420, described below;
- a cache address map 410, described below; and
- a plurality of pointers including clean pointer 412, flush pointer 414, read pointer 418 and write pointer 416.

Each of the elements identified above may be stored in one or more of the previously mentioned memory devices of server system 110, and each element corresponds to a set of instructions for performing a function described above. The modules or programs (i.e., sets of instructions) identified above need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments.

FIG. 3A illustrates a plurality of data structures within access history database 250, including a plurality of nodes 320 (also sometimes called node data structures), each of which stores access history information, as well as other information, for a portion of an object. In accordance with a respective object identifier (object ID), portion to node map 256 maps the portion (e.g., a block) of an object to a node 320 in access history database 250. Each node 320, excluding any unallocated nodes, contains information associated with the state and access history of a corresponding portion (e.g., a block, also sometimes called an address block) of an object. In some implementations, each respective node 320 of two or more of the nodes 320 includes the following items of information, or a subset or superset thereof:

- an object ID plus offset 312 identifying the object portion (i.e., the portion of an object) corresponding to the respective node;
- a tier number 314 identifying the tier within tiered data structure 252 to which the respective node (and the object portion corresponding to the node) belongs;
- two or more linked list pointers 316 identifying nodes that are positioned immediately before and after the respective node in linked list 360; linked list 360 is a list of all the nodes 320 assigned to the same tier as the respective node (i.e., the tier having the tier number in field 314);
- a write operations count 318 indicating a number of times data has been written to the respective portion of the objects (e.g., the respective block) corresponding to the respective node;
- a read operations count 322 indicating a number of times data has been read from the object portion corresponding to the respective node;
- a most recently used (MRU) marker 324 indicating the last time an operation was performed on the object portion corresponding to the respective node; in some implementations, the MRU marker is a timestamp indicating execution time for the last operation performed on the respective object portion; in some other implementations, the MRU marker stores an operation count value (similar to a serial number or other sequentially assigned value) indicating the last operation performed on the object portion corresponding to the respective node;
- a write acceleration flag 326 indicating whether the object portion corresponding to the respective node is marked for write acceleration; and
- a plurality of subset read counts 330, each indicating a number of times that data has been read from a respective subset of the object portion (e.g., a sub-block of an address block of an object) corresponding to the respective node.

In some implementations, each subset read count 330 has a value of 0 (not read), 1 (read once) or 2 (read 2 or more times). Subset read counts 330 function as read acceleration flags, where values 0 and 1 correspond to a flag value of “off” or “disabled,” and a value of 2 corresponds to a flag value of “on” or “enabled.” Furthermore, any attempt to increment a subset read count 330 that is already equal to 2 results in a value of 2 for the sub-block.
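
The node fields and the saturating subset read counts can be sketched as below. The field names, the choice of eight sub-blocks per block, and the decision to bump the block-level read count on every sub-block read are assumptions made for illustration; the linked list pointers 316 are omitted.

```python
from dataclasses import dataclass, field

SUBSET_READ_MAX = 2  # 0 = not read, 1 = read once, 2 = read two or more times


@dataclass
class Node:
    """Hypothetical per-block node; field names are illustrative."""
    object_id: str
    offset: int
    tier_number: int = 0
    write_ops: int = 0
    read_ops: int = 0
    mru_marker: int = 0          # here an operation serial number, not a timestamp
    write_accel: bool = False
    # Eight sub-blocks per block is an assumption for the example.
    subset_read_counts: list = field(default_factory=lambda: [0] * 8)

    def record_subset_read(self, index, op_serial):
        # Assumed bookkeeping: each sub-block read also counts as a block read.
        self.read_ops += 1
        self.mru_marker = op_serial
        # Saturating increment: a count already at 2 stays at 2.
        self.subset_read_counts[index] = min(
            self.subset_read_counts[index] + 1, SUBSET_READ_MAX
        )

    def read_accelerated(self, index):
        # Values 0 and 1 act as "off"; a value of 2 acts as "on".
        return self.subset_read_counts[index] == SUBSET_READ_MAX
```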

FIG. 3B illustrates tiered data structure 252, which is part of access history database 250, according to some embodiments. In some embodiments, tiered data structure 252 is a bucket array, with each bucket representing a different tier. Tiered data structure 252 contains tier numbers 0-N (e.g., tiers 352-358), where N is a positive integer greater than 1 and typically greater than 10 once the server system has processed a large number of data access commands. In some implementations, each tier of tiered data structure 252 includes the following items of information, or a subset or superset thereof:

- a tier number 314;
- a list head 342 containing a pointer to the head of a linked list 360 of nodes 320 within a respective tier;
- a list end 344 containing a pointer to the end of the linked list 360 of nodes 320 within the respective tier;
- a portion count 346 containing the number of object portions (e.g., address blocks) represented by the plurality of nodes 320 in the linked list 360; and
- a usage range indicator 348 designating the range of usage counts for object portions (e.g., address blocks) within the respective tier.

The plurality of tiers within tiered data structure 252 are arranged such that the lowest tier contains object portions (e.g., address blocks) having the lowest “usage rate,” and the highest tier contains object portions (e.g., address blocks) having the highest “usage rate.” Tier assignment module 249 is configured to assign each tier a range of usage rates (e.g., indicated via usage range indicator 348) corresponding to the usage rates of the object portions (e.g., address blocks) eligible for assignment to that tier. For example, all object portions (e.g., address blocks) with a usage rate of 1-3 are assigned to Tier 0 (the lowest tier), all object portions (e.g., address blocks) with a usage rate of 4-6 are assigned to Tier 1, etc.

In some embodiments, tier assignment module 249 is configured to determine a “usage rate” or a usage history value for an object portion (e.g., address block) by, for example, combining the write operations count 318 for the object portion with the read operations count 322 for the object portion. Tier assignment module 249 is configured to assign a respective object portion (e.g., a respective address block) to a tier based on the usage rate of the respective object portion and the usage range indicators 348 assigned to the plurality of tiers. For example, if the respective address block has a usage rate of 2, tier assignment module 249 will assign the respective address block to tier 0, where the usage range of tier 0 is 1-3. If the usage rate for a respective address block increases above the usage range assigned to a particular tier to which the block is currently assigned, then tier assignment module 249, for example, re-assigns the respective address block to a higher tier. Similarly, if the usage rate for a respective address block decreases below the usage range assigned to a particular tier to which the block is currently assigned, then tier assignment module 249, for example, re-assigns the respective address block to a lower tier.
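
The tier assignment described above can be sketched as follows, assuming the usage rate is the simple sum of the write and read counts and that tiers carry explicit usage ranges. The three ranges listed, and the clamping of out-of-range rates to the lowest or highest tier, are assumptions for the example.

```python
# Assumed usage ranges per tier, following the 1-3 / 4-6 example in the text.
TIER_RANGES = [(1, 3), (4, 6), (7, 9)]


def usage_rate(write_ops, read_ops):
    # One possible way of combining the two counts: a plain sum.
    return write_ops + read_ops


def assign_tier(rate, ranges=TIER_RANGES):
    """Return the tier whose usage range contains the given rate."""
    for tier, (low, high) in enumerate(ranges):
        if low <= rate <= high:
            return tier
    # Outside every range: clamp to the lowest or highest tier (assumption).
    return 0 if rate < ranges[0][0] else len(ranges) - 1
```

A block with one write and one read has a usage rate of 2 and lands in tier 0; after three more accesses its rate is 5 and it would be reassigned to tier 1.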

In some embodiments, the object portions (e.g., address blocks) assigned to a respective tier are represented by a linked list 360 of nodes 320. Each node 320 of the plurality of nodes 320 stores usage history for a respective object portion (e.g., a respective address block). In some implementations, the head of the linked list 360 is associated with a node 320 corresponding to the most recently updated object portion (e.g., address block) in the respective tier, and the end of the linked list 360 is associated with a node 320 corresponding to the least recently updated object portion (e.g., address block) in the respective tier.
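
The per-tier ordering can be sketched with an ordered mapping per tier in place of a hand-written doubly linked list. This is a simplification under stated assumptions: `TieredStructure`, `touch`, `head`, and `end` are hypothetical names, and the caller is assumed to have already computed the tier number.

```python
from collections import OrderedDict


class TieredStructure:
    """Hypothetical bucket array; each bucket keeps most recent entries first."""

    def __init__(self):
        self.tiers = {}     # tier number -> OrderedDict of portions
        self.tier_of = {}   # portion -> tier number (plays the role of field 314)

    def touch(self, portion, tier):
        # Unlink the portion from its current tier, if it has one.
        old = self.tier_of.get(portion)
        if old is not None:
            del self.tiers[old][portion]
        # Link it at the head of the target tier's list.
        bucket = self.tiers.setdefault(tier, OrderedDict())
        bucket[portion] = True
        bucket.move_to_end(portion, last=False)
        self.tier_of[portion] = tier

    def head(self, tier):
        # Most recently updated portion in the tier.
        return next(iter(self.tiers[tier]))

    def end(self, tier):
        # Least recently updated portion in the tier.
        return next(reversed(self.tiers[tier]))
```

Keeping the least recently updated portion at the end of each list makes it cheap to find it when needed.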

Referring to FIG. 4, in some embodiments, data for accelerated portions of the objects (e.g., address blocks and/or sub-blocks) is stored in a persistent cache 118, which is implemented as a “log-structured cache.” That is, the persistent cache 118 (sometimes herein called “the cache,” for ease of discussion) provides caching services to a plurality of system components (e.g., multiple client systems or virtual machines), while being structured as a log, with data and metadata being sequentially written to the cache storage device. In this manner, persistent cache 118 operates as a circular buffer. The advantages of operating persistent cache 118 as a circular buffer include simplicity (particularly reduced internal metadata management requirements), automatic wear leveling of memory locations within persistent cache 118, and automatic garbage collection, which essentially eliminates the risk of stale data in the persistent cache 118 crowding out (e.g., preventing storage of) more current data in persistent cache 118. In some embodiments, a single log-structured persistent cache 118 is used to store cached data for multiple virtual machines (e.g., all virtual machines executed by a respective server system 110), thereby eliminating the need to separately manage the caching of data for each of the virtual machines.

In some embodiments, persistent cache 118 includes clean region 440, dirty region 450 and unused region 470. Both clean region 440 and dirty region 450 store cached data, while unused region 470 is an “empty” portion of persistent cache 118 that is ready to be overwritten with new data. Any of the clean, dirty and unused regions can be wrapped over the end boundary of persistent cache 118. In FIG. 4, for example, clean regions 440-1 and 440-2 are logically a single clean region 440 that is wrapped over the end boundary of persistent cache 118. Due to persistent cache 118 operating as a circular buffer, as will be explained below, the boundaries of the clean, dirty and unused regions 440, 450, 470 are adjusted as data is added to persistent cache 118 and as data is removed from the persistent cache 118. Additional information regarding the functioning of a log-structured cache as illustrated in FIG. 4 is found in U.S. Patent Application Publication No. 2011/0320733, which is hereby incorporated by reference in its entirety.

In some implementations, data stored in the log-structured cache (persistent cache 118) includes data corresponding to both write and read caches. Accordingly, the write and read caches share a circular buffer, and, in some implementations, write and read data are intermingled in the log-structured cache. In other implementations, the write and read caches are maintained separately in separate circular buffers, either in the same persistent cache 118, or in separate instances of persistent cache 118.

In some implementations, dirty region 450 contains both read and write cached data. In some implementations, write data in dirty region 450 is data stored in persistent cache 118, but not yet flushed to secondary storage system 130 (e.g., any of secondary storage subsystems 150 in FIG. 1). Write data stored in persistent cache 118 (whether in clean region 440 or dirty region 450) is said to be “accelerated.” The beginning of dirty region 450 is represented by flush pointer 414, and the end of dirty region 450 is represented by write pointer 416. As cache write data is flushed to secondary storage system 130, flush pointer 414 is advanced (e.g., incremented) so that it points to the next segment of write data not yet flushed to secondary storage. Similarly, write pointer 416 is advanced as new blocks of write data are stored to persistent cache 118, thereby “moving” a portion of persistent cache 118 that was formerly in unused region 470 into dirty region 450. Clean region 440, which is bounded at its beginning by clean pointer 412 and at its end by flush pointer 414, stores cached data that is also stored in secondary storage. Typically, clean region 440 stores both cached read data and write data, both of which are said to be “accelerated.”

Unused region 470, bounded at its beginning by write pointer 416 and at its end by clean pointer 412, represents an “empty” portion of persistent cache 118 that is ready to be overwritten with new data. In some implementations, unused region 470 corresponds to flash memory regions that have been erased in preparation for storing “new” blocks or sub-blocks of data that have been selected for acceleration, as well as updated data for blocks or sub-blocks already stored in persistent cache 118.
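
The relationship between the three pointers and the three regions can be sketched as a small circular log. This is a simplified model under stated assumptions: every entry is treated as write data, one slot holds one entry, the pointers are kept as ever-increasing counters that are reduced modulo the capacity, and the names `LogCache`, `append`, and `flush_one` are hypothetical.

```python
class LogCache:
    """Hypothetical circular log with clean, flush and write pointers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        # Counters only ever advance; the slot index is counter % capacity.
        self.clean = 0   # start of clean region
        self.flush = 0   # end of clean region, start of dirty region
        self.write = 0   # end of dirty region, start of unused region

    def sizes(self):
        clean = self.flush - self.clean
        dirty = self.write - self.flush
        return {"clean": clean, "dirty": dirty,
                "unused": self.capacity - clean - dirty}

    def append(self, item):
        if self.write - self.clean == self.capacity:
            # No unused slot: reclaim the oldest clean slot, if there is one.
            if self.clean == self.flush:
                raise RuntimeError("cache is all dirty; flush before writing")
            self.clean += 1
        self.slots[self.write % self.capacity] = item
        self.write += 1   # the slot moves from unused to dirty

    def flush_one(self, secondary):
        # Copy the oldest dirty entry to secondary storage (a list here).
        if self.flush == self.write:
            return False
        secondary.append(self.slots[self.flush % self.capacity])
        self.flush += 1   # the slot moves from dirty to clean
        return True
```

Because old clean entries are overwritten as the write pointer wraps around, reclaiming space needs no separate garbage collection pass in this model.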

Both clean region 440 and dirty region 450 store cached data (data for accelerated address blocks and/or sub-blocks) for multiple virtual machines (e.g., all the virtual machines executed by a respective server system 110). The cached data for the various virtual machines is interleaved within the clean region 440 and dirty region 450 and ordered within the clean region 440 and dirty region 450 in the same order (sometimes called log order) that the data was written to persistent cache 118.

Cache address map 410 maps accelerated portions of the objects (e.g., address sub-blocks) to specific locations in persistent cache 118. Cache address map 410 is used, when reading cached data from persistent cache 118, to locate requested data in persistent cache 118. In some embodiments, cache address map 410 is apportioned on a sub-block by sub-block basis. Cache management driver 420 handles both data read and write commands directed to persistent cache 118.

FIGS. 5A-5B illustrate a flow diagram of a method 500 for accelerating data read operations performed by a computer system (e.g., server system 110 in FIG. 1) having one or more processors, memory and a persistent cache for storing accelerated data. In some embodiments, method 500 is governed by a set of instructions stored in memory (e.g., a non-transitory computer readable storage medium) that are executed by the one or more processors of the computer system.

The computer system receives (502) data read commands from two or more clients to read data from objects identified by the data read commands. FIG. 1, for example, shows a respective server system 110 (sometimes herein called a computer system) configured to receive data read commands from two or more virtual machines (sometimes herein called clients 114) executed on respective server system 110 to read data from objects (e.g., one or more virtual disks) identified by the data read commands. In various embodiments, the number of virtual machines from which a respective server system 110 receives data read commands is more than 20, more than 50, or more than 100.

The computer system processes (504) the data read commands to update usage history information for portions of the objects identified by the data read commands. FIG. 2, for example, shows access history update module 245 (a component of server system 110) configured to process the data read commands to update usage history information in access history database 250 for portions of the objects (e.g., address blocks or sub-blocks) identified by the data read commands. For example, access history update module 245 is configured to update object portion usage metadata 254 within access history database 250 for portions (e.g., address sub-blocks) of the objects identified by the data read commands.

Referring to FIG. 3A, for example, access history update module 245 is configured to increment a read operations count 322 and update an MRU marker 324 within access history database 250 for each respective node of the plurality of nodes 320 corresponding to the portions of the objects (e.g., the address blocks) identified by the data read commands. In this example, access history update module 245 is further configured to increment one or more subset read counts 330 within the nodes of access history database 250 corresponding to the one or more portions of the object(s) (e.g., one or more sub-blocks of the address block) identified by the data read commands.

In some implementations, access history update module 245 is also configured to decrement a read operations count 322 or write operations count 318 within access history database 250 for each respective node of a plurality of nodes 320 corresponding to a portion of the objects (e.g., an address block) not identified by any data access command received within a predefined period of time, or alternatively not identified by any data access command of a predefined number of data access commands received by the computer system from the clients. Stated another way, in some implementations, the read operations count 322 or write operations count 318 for object portions not recently accessed is decremented, where “not recently accessed” is automatically determined either periodically or each time a predefined number of data access commands have been received. Furthermore, as described in more detail below, the tiers to which those object portions are assigned and the object portions to mark for acceleration are re-evaluated in accordance with the decremented read operations count 322.
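By way of illustration only, the counter updates described above can be sketched as follows. The names `Node`, `AccessHistory`, `record_read`, `decay`, `idle_limit`, and `SUB_BLOCKS_PER_BLOCK` are hypothetical. The sketch also makes several assumptions that the description leaves open: the MRU marker is modeled as a logical clock of commands received, "not recently accessed" is tested against a count of commands rather than elapsed time, both counts are decremented together, and counts never go below zero.

```python
from dataclasses import dataclass, field

SUB_BLOCKS_PER_BLOCK = 4  # hypothetical block geometry


@dataclass
class Node:
    write_ops: int = 0  # stands in for write operations count 318
    read_ops: int = 0   # stands in for read operations count 322
    mru: int = 0        # stands in for MRU marker 324 (logical time of last access)
    subset_reads: list = field(
        default_factory=lambda: [0] * SUB_BLOCKS_PER_BLOCK
    )  # stands in for subset read counts 330


class AccessHistory:
    def __init__(self, idle_limit):
        self.nodes = {}               # address block id -> Node
        self.clock = 0                # number of commands seen so far
        self.idle_limit = idle_limit  # commands without access before decay

    def record_read(self, block, sub_block):
        """Update history for one read of one sub-block."""
        self.clock += 1
        node = self.nodes.setdefault(block, Node())
        node.read_ops += 1
        node.mru = self.clock
        node.subset_reads[sub_block] += 1

    def decay(self):
        """Decrement the counts of blocks not accessed recently."""
        for node in self.nodes.values():
            if self.clock - node.mru >= self.idle_limit:
                node.read_ops = max(0, node.read_ops - 1)
                node.write_ops = max(0, node.write_ops - 1)
```

A caller might invoke `decay()` periodically or after every fixed number of commands, corresponding to the two alternatives described above.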

The computer system determines (506) whether a respective portion of the objects identified by a data read command from a respective client of the two or more clients is stored in the persistent cache. FIG. 2, for example, shows memory access decision module 246 (a component of server system 110) configured to determine whether a respective portion of the objects (e.g., an address sub-block) identified by a data read command from a respective client (e.g., virtual machine 112a) of the two or more clients (e.g., virtual machines 112a to 112v in FIG. 1) is stored in persistent cache 118. In some embodiments, memory access decision module 246 makes such a determination based on cache address map 410.

In accordance with a determination that the respective portion of the objects identified by the data read command from the respective client is stored in the persistent cache, the computer system returns (508) the respective portion of the objects from the persistent cache to the respective client of the two or more clients. FIG. 2, for example, shows server system 110 configured to return a respective object portion (e.g., address sub-block) from persistent cache 118 to the respective client (e.g., virtual machine 112a) in accordance with the previous determination by memory access decision module 246 that the respective portion is stored in persistent cache 118.

In accordance with a determination that the respective portion of the objects identified by the data read command from the respective client is not stored in the persistent cache, the computer system automatically identifies and marks (510) for acceleration the respective portion of the objects identified by the data read command from the respective client if the respective portion of the objects satisfies an access based data acceleration policy, in accordance with the usage history information. The automatic identification and marking of object portions is performed collectively for the two or more clients. FIG. 2, for example, shows acceleration determination module 248 (a component of server system 110) configured to automatically identify and mark for read acceleration a respective object portion (e.g., a respective address sub-block identified by a data read command from a respective client, such as virtual machine 112v) if the object portion satisfies an access based data acceleration policy, in accordance with the usage history information. In this example, acceleration determination module 248 is configured to perform these operations in accordance with a previous determination by memory access decision module 246 that the respective object portion identified by the data read command from the respective client is not stored in persistent cache 118 (e.g., cache address map 410 indicates that the respective portion is not present in persistent cache 118). In some embodiments, an object portion (e.g., the respective address sub-block identified by the data read command from the respective client) satisfies the access based data acceleration policy and is marked for acceleration when (A) the node for the respective object portion is assigned to a tier that is above the “low-water mark” for acceleration (as described in more detail below), and (B) the subset read count 330 corresponding to the respective portion indicates a value of 2 or “enabled.”

In accordance with a determination that the respective portion of the objects is not marked for acceleration, the computer system processes (512) the data read command from the respective client, by: reading (514) from the secondary storage the respective portion of the objects; and returning (516) the respective portion of the objects read from the secondary storage to the respective client of the two or more clients. FIG. 2, for example, shows server system 110 configured to process the data read command from the respective client (e.g., virtual machine 112a) by reading the respective portion (e.g., a respective address sub-block) from secondary storage 150 (e.g., within secondary storage system 130 in FIG. 1) via network communications module 244. Server system 110, as shown in FIG. 2, is further configured to return the respective portion (e.g., the respective address sub-block) read from secondary storage 150 to the respective client (e.g., virtual machine 112a). In this example, server system 110 is configured to perform these operations in accordance with the previous determination by acceleration determination module 248 that the respective portion (e.g., the respective address sub-block) of the objects is not marked for acceleration. In some embodiments, an object portion (e.g., the respective address sub-block) identified by the data read command from a client is not marked for acceleration when either (A) the subset read count 330 corresponding to the object portion indicates a value of 0 or 1 or “disabled,” or (B) the address block that includes the object portion has a usage rate (e.g., the sum of the write operations count 318 and read operations count 322 for the address block, see FIG. 3A) that does not meet the minimum usage rate that qualifies address blocks for acceleration.
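By way of illustration only, the two-part test for read acceleration of a sub-block can be sketched as a single predicate. The function name, parameter names, and the representation of tiers as integers (higher meaning a higher usage rate) are assumptions. The sketch treats a tier "at or above" the low-water tier as qualifying, which is one reading of the description, and treats a subset read count of 2 or more as "enabled."

```python
def is_marked_for_read_acceleration(
    node_tier, low_water_tier, subset_read_count, read_threshold=2
):
    """Return True when a sub-block qualifies for read acceleration.

    node_tier: tier of the address block containing the sub-block
        (hypothetical integer encoding, higher = more heavily used).
    low_water_tier: lowest tier that still qualifies for acceleration.
    subset_read_count: stands in for subset read count 330.
    read_threshold: count at which the sub-block counts as "enabled".
    """
    block_qualifies = node_tier >= low_water_tier
    sub_block_enabled = subset_read_count >= read_threshold
    return block_qualifies and sub_block_enabled
```

Requiring both conditions means a single read of a sub-block in a heavily used block does not by itself cause that sub-block to be cached, and neither do repeated reads of a sub-block in a rarely used block.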

In accordance with a determination that the respective portion of the objects is marked for acceleration, the computer system processes (518) the data read command from the respective client, by: reading (520) from the secondary storage the respective portion of the objects; writing (522) the respective portion of the objects to the persistent cache; and returning (524) the respective portion of the objects to the respective client of the two or more clients. FIG. 2, for example, shows server system 110 configured to process the data read command from the respective client (e.g., virtual machine 112a) by reading an object portion (e.g., a respective address sub-block identified by the data read command from the respective client, such as virtual machine 112a) from secondary storage 150 (e.g., within secondary storage system 130 in FIG. 1) via network communications module 244. FIG. 2 shows server system 110 further configured to write the object portion (e.g., the respective address sub-block) read from secondary storage 150 to persistent cache 118. FIG. 2, for example, further shows server system 110 configured to return the object portion (e.g., the respective address sub-block) to the respective client (e.g., virtual machine 112a). In this example, server system 110 is configured to perform these operations in accordance with the previous determination by acceleration determination module 248 that the object portion (e.g., the respective address sub-block identified by the data read command from the respective client) is marked for acceleration.

In some embodiments, returning the respective portion of the objects comprises returning (526) the respective portion of the objects read from the secondary storage to the respective client of the two or more clients. For example, in some implementations, the object portion read from secondary storage is written to an intermediate buffer (not shown in the figures), and then copied from the intermediate buffer to a memory location at which the requesting client receives the object; and furthermore the same data (object portion) is copied from the intermediate buffer to the persistent cache (522). In some implementations, to minimize latency, the requested object portion is returned to the requesting client prior to the same data being written to persistent cache 118. FIG. 2, for example, shows server system 110 configured to return the object portion (e.g., the respective address sub-block) from secondary storage 150 to the respective client (e.g., virtual machine 112a) of the two or more clients.

Alternatively, in some embodiments, returning the respective portion of the objects comprises returning (528) the respective portion of the objects from the persistent cache to the respective client of the two or more clients. FIG. 2, for example, shows server system 110 configured to return the object portion (e.g., the respective address sub-block) from persistent cache 118 to the respective client (e.g., virtual machine 112a) of the two or more clients.
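By way of illustration only, the read flow of method 500 can be sketched as follows. The function name `handle_read` and the use of dictionaries for the persistent cache and secondary storage, and of a set for the portions marked for acceleration, are assumptions made for brevity. The sketch follows the variant in which data read from secondary storage is returned to the client directly (526) rather than being re-read from the cache (528).

```python
def handle_read(key, cache, secondary, marked):
    """Serve one read and report where the data came from.

    key: identifier of the requested object portion (e.g., a sub-block).
    cache: dict standing in for persistent cache 118.
    secondary: dict standing in for secondary storage 150.
    marked: set of keys already marked for acceleration (outcome of 510).
    """
    if key in cache:                 # determination 506
        return cache[key], "cache"   # return from cache (508)
    data = secondary[key]            # read from secondary storage (514 or 520)
    if key in marked:
        cache[key] = data            # populate the cache (522)
    return data, "secondary"         # return to client (516 or 524/526)
```

For example, the first read of a marked portion is served from secondary storage and populates the cache, and the next read of the same portion is served from the cache:

```python
secondary = {"s1": b"x", "s2": b"y"}
cache, marked = {}, {"s1"}
print(handle_read("s1", cache, secondary, marked))  # (b'x', 'secondary')
print(handle_read("s1", cache, secondary, marked))  # (b'x', 'cache')
```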

FIG. 6 illustrates a flow diagram of a method 600 for accelerating data write operations performed by a computer system (e.g., server system 110 in FIG. 1) having one or more processors, memory and a persistent cache for storing accelerated data. In some embodiments, method 600 is governed by a set of instructions stored in memory (e.g., a non-transitory computer readable storage medium) that are executed by the one or more processors of the computer system.

The computer system receives (602) data write commands from two or more clients to write data to objects identified by the data write commands. FIG. 1, for example, shows a respective server system 110 configured to receive data write commands from two or more virtual machines executed on respective server system 110 to write data to objects (e.g., one or more virtual disks) identified by the data write commands.

The computer system processes (604) the data write commands to update usage history information for portions of the objects identified by the data write commands. FIG. 2, for example, shows access history update module 245 configured to process the data write commands to update usage history information in access history database 250 for portions of the objects (e.g., address blocks) identified by the data write commands. For example, access history update module 245 is configured to update object portion usage metadata 254 within access history database 250 for portions (e.g., address blocks) of the objects identified by the data write commands. Referring to FIG. 3A, for example, access history update module 245 is configured to increment a write operations count 318 and update an MRU marker 324 within access history database 250 for each respective node of the plurality of nodes 320 corresponding to the object portions (e.g., address blocks) identified by the data write commands.

Furthermore, as discussed above, in some implementations, access history update module 245 is also configured to decrement a read operations count 322 or write operations count 318 within access history database 250 for each respective node of a plurality of nodes 320 corresponding to a portion of the objects (e.g., an address block) not identified by any data access command received within a predefined period of time, or alternatively not identified by any data access command of a predefined number of data access commands received by the computer system from the clients.

In accordance with the access history information, the computer system automatically identifies and marks (606) for acceleration a respective portion of the objects identified by a data write command from a respective client of the two or more clients satisfying an access based data acceleration policy. FIG. 2, for example, shows acceleration determination module 248 configured to automatically identify and mark for write acceleration a respective object portion (e.g., a respective address block identified by a data write command from a respective client, such as virtual machine 112v) satisfying an access based data acceleration policy, in accordance with the usage history information. In some embodiments, the respective object portion (e.g., the respective address block identified by the data write command from the respective client) satisfies the access based data acceleration policy when a write acceleration flag 326 for the node associated with the respective portion (e.g., the respective address block) is enabled (e.g., indicates a flag value of 1). In some embodiments, a write acceleration flag 326 for the node corresponding to an object portion (e.g., address block) is enabled when the node is assigned to a tier within tiered data structure 252 that qualifies the corresponding object portions for acceleration.

In some embodiments, only a predefined number of object portions (e.g., address blocks) qualify for acceleration based on the storage capacity of the persistent cache 118. In some embodiments, acceleration determination module 248 is configured to identify a “low-water mark” for object portions (e.g., address blocks) that meet the minimum usage rate that “qualifies for acceleration.” For example, in a server system 110 comprising a persistent cache with storage capacity to accelerate data from 800,000 address blocks, acceleration determination module 248 is configured to identify the tiers with the highest usage rates whose total number of address blocks is, in the aggregate, no more than 800,000. The “low-water mark” identifies a respective object portion (e.g., address block), or set of object portions (e.g., a set of address blocks), in a respective tier (e.g., tier Y, for ease of reference) having the minimum usage rate included in the 800,000 accelerated object portions. Acceleration determination module 248 is configured to mark all object portions (e.g., address blocks) corresponding to nodes in tier Y for write acceleration (e.g., the respective write acceleration flags 326 for all address blocks in tier Y are enabled). In some implementations, one or more object portions (e.g., one or more address blocks) in tier Y below the “low-water mark” are also marked for write acceleration. Furthermore, acceleration determination module 248 marks all portions of the objects (e.g., all address blocks) in tiers above tier Y (i.e., all tiers having usage rate ranges higher than tier Y) for write acceleration (e.g., the respective write acceleration flags 326 for all address blocks in these tiers are enabled). It is noted that the “low-water mark” qualifying a portion of the objects (e.g., an address block) for acceleration changes over time, due to fluctuations in the concentration of data read and data write commands.
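By way of illustration only, the selection of qualifying tiers against a fixed cache capacity can be sketched as follows. The function name, the tier labels, and the per-tier block counts in the usage example are hypothetical; only the 800,000-block capacity comes from the example above. The sketch assumes that whole tiers are selected from the highest usage rate downward and that selection stops at the first tier that would exceed capacity, so the last tier selected plays the role of tier Y.

```python
def select_accelerated_tiers(tier_sizes, capacity):
    """Pick the highest-usage tiers that fit within the cache capacity.

    tier_sizes: list of (tier_id, block_count) pairs, ordered from the
        highest usage-rate tier to the lowest.
    capacity: number of address blocks the cache can accelerate.
    Returns (selected tier ids, low-water tier id or None, blocks used).
    """
    selected = []
    total = 0
    for tier_id, count in tier_sizes:
        if total + count > capacity:
            break  # this tier and all lower tiers do not qualify
        selected.append(tier_id)
        total += count
    low_water_tier = selected[-1] if selected else None
    return selected, low_water_tier, total


# Hypothetical tier populations, highest usage rate first.
tiers = [("T5", 100_000), ("T4", 300_000), ("T3", 350_000),
         ("T2", 200_000), ("T1", 900_000)]
print(select_accelerated_tiers(tiers, 800_000))
# (['T5', 'T4', 'T3'], 'T3', 750000)
```

Because tier populations shift as counts are incremented and decremented, re-running the selection at a later time may yield a different low-water tier, consistent with the observation that the low-water mark changes over time.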

In accordance with a determination that the respective portion of the objects is marked for acceleration, the computer system writes (608) the respective portion of the objects to the persistent cache and subsequently or concurrently writes the respective portion of the objects to the secondary storage. If a write-back caching methodology is used, the source of the data subsequently written to secondary storage is persistent storage and the write to secondary storage occurs at a later time (e.g., when more room is needed in unused region 470 (FIG. 4) for writing new data to persistent cache 118). If a write-through caching methodology is used, the source of the data written to secondary storage is either the client (which concurrently sends the data to persistent storage and secondary storage) or persistent storage. FIG. 2, for example, shows server system 110 configured to write an object portion (e.g., an address block identified by the data write command from the respective client, such as virtual machine 112a) to the persistent cache 118. FIG. 2, for example, further shows server system 110 configured to subsequently write the object portion (e.g., an address block identified by the data write command from the respective client, such as virtual machine 112a) to secondary storage 150 (e.g., within secondary storage system 130 in FIG. 1). In this example, server system 110 is configured to perform these operations in accordance with the previous determination by acceleration determination module 248 that the object portion is marked for acceleration.

In some implementations, server system 110 is configured to implement a write-through methodology whereby, for example, the respective object portion (e.g., the respective address block) identified by the data write command from the respective client is concurrently written to persistent cache 118 and secondary storage 150. In some implementations, server system 110 is configured to implement a write-back methodology whereby, for example, the respective object portion (e.g., the respective address block) identified by the data write command from the respective client is written to persistent cache 118 and subsequently written to secondary storage after some delay. In this example, persistent cache 118 is configured as a circular buffer (also called a log-structured cache) as discussed above with respect to FIG. 4, whereby the respective object portion (e.g., the respective address block) is written to secondary storage 150 when the respective object portion stored in dirty region 450 is subsequently flushed to secondary storage 150. The respective object portion is then retained in clean region 440 of persistent cache 118 until that portion of clean region 440 is reclaimed for inclusion in unused region 470.

In accordance with a determination that the respective portion of the objects is not marked for acceleration, the computer system writes (610) the respective portion of the objects to the secondary storage. FIG. 2, for example, shows server system 110 configured to write a respective object portion (e.g., a respective address block) identified by the data write command from a respective client (e.g., virtual machine 112a) to secondary storage 150 via network communications module 244 in accordance with a previous determination by acceleration determination module 248 that the respective object portion (e.g., the respective address block) is not marked for acceleration.
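By way of illustration only, the write flow of method 600 under the two caching methodologies can be sketched as follows. The class name `WritePath`, its attributes, and the use of dictionaries for the persistent cache and secondary storage are assumptions. The sketch models the dirty region as a queue of keys in log order and does not model reclamation of clean data or invalidation of a cached copy when an unmarked write bypasses the cache.

```python
from collections import deque


class WritePath:
    def __init__(self, secondary, write_back):
        self.secondary = secondary    # dict standing in for secondary storage 150
        self.write_back = write_back  # True: write-back, False: write-through
        self.cache = {}               # dict standing in for persistent cache 118
        self.dirty = deque()          # keys awaiting flush, in log order

    def write(self, key, data, marked):
        """Handle one write of an object portion (e.g., an address block)."""
        if not marked:
            self.secondary[key] = data  # not accelerated (610)
            return
        self.cache[key] = data          # accelerated write (608)
        if self.write_back:
            self.dirty.append(key)      # secondary storage updated later
        else:
            self.secondary[key] = data  # write-through: update both now

    def flush(self):
        """Write-back only: copy dirty data to secondary storage."""
        while self.dirty:
            key = self.dirty.popleft()
            self.secondary[key] = self.cache[key]
```

In the write-back case, data remains in the cache after `flush()`, which corresponds to the flushed portion being retained in the clean region until it is reclaimed.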

FIGS. 7A-7B illustrate a flow diagram of a method 700 for accelerating data access performed by a computer system (e.g., server system 110 in FIG. 1). The computer system includes (702) one or more processors, memory and a persistent cache for storing accelerated data. In some embodiments, method 700 is governed by a set of instructions stored in memory (e.g., a non-transitory computer readable storage medium) that are executed by the one or more processors of the computer system.

The persistent cache is shared (704) by the two or more clients. FIG. 1, for example, shows server system 110m configured to share persistent cache 118m between virtual machines 112a-112v. For example, each virtual machine executed on server system 110m is a client. In some embodiments, the persistent cache comprises (706) non-volatile solid state storage, such as flash memory, or any of the other examples of non-volatile storage provided above.

The computer system receives (708) data access commands from two or more clients to access data in objects identified by the data access commands. FIG. 1, for example, shows a respective server system 110 configured to receive data access commands from two or more clients executed on respective server system 110 to access data in objects identified by the data access commands. In some embodiments, the two or more clients comprise (710) virtual machines executed by the computer system. FIG. 1, for example, shows server system 110m executing a plurality of virtual machines (e.g., virtual machines 112a-112v). In some embodiments, the objects comprise (712) one or more virtual disks. FIG. 1, for example, shows secondary storage system 130 comprising a plurality of secondary storage subsystems (e.g., secondary storages 150a-150n). In this example, each secondary storage subsystem 150 comprises one or more virtual disks containing data accessible to the two or more clients.

The computer system processes (714) the data access commands from the two or more clients to update access history information for portions of the objects identified by the data access commands from the two or more clients, where the access history information is based on the data access commands of the two or more clients. FIG. 2, for example, shows access history update module 245 configured to process the data access commands to update access history information in access history database 250 for portions of the objects (e.g., address blocks or sub-blocks) identified by the data access commands. For example, access history update module 245 is configured to update a write operations count 318, a read operations count 322 and an MRU marker 324 within access history database 250 corresponding to each portion of the objects (e.g., each address block) identified by the data access commands. In some embodiments, access history update module 245 is further configured to update one or more subset read counts 330 within access history database 250 corresponding to one or more subsets of the portions of the objects (e.g., one or more address sub-blocks) identified by the data access commands when the data access commands comprise data read commands.

Furthermore, as described above, in some embodiments access history update module 245 is also configured to decrement a write operations count 318 or a read operations count 322 within access history database 250 corresponding to each of the portions of the objects (e.g., address block) not identified by the data access commands. In some embodiments, access history update module 245 is configured to update one or more of the aforementioned data structures within access history database 250 either upon receiving data access commands or based on predefined criteria (e.g., upon receiving a predefined number of data access commands or after a predefined period of time).

In some embodiments, the portions of the objects comprise (716) respective blocks or sub-blocks of the objects. For example, a portion of the objects is a block (sometimes called an address block, as discussed above), and the block comprises a plurality of sub-blocks. In another example, a portion of the objects is a sub-block. In some embodiments, the access history information includes (718) at least one usage history value for a respective block and at least one distinct respective usage history value for each sub-block of the respective block. FIG. 3A, for example, shows items of information for a respective node corresponding to an object portion (e.g., the portion of a virtual disk, or other object, called a block or address block). In this example, the items of information include a write operations count 318 for the node corresponding to the object portion (e.g., the address block), a read operations count 322 for the respective node corresponding to the object portion (e.g., the address block) and one or more of subset read counts 330 for the respective node corresponding to each of one or more of subsets of the object portion (e.g., one or more address sub-blocks of the address block).

In some embodiments, the access history information includes (720) at least one operations count for each respective object portion of a plurality of object portions identified by the data access commands. FIG. 3A, for example, shows a write operations count 318 and a read operations count 322 for a respective node corresponding to a portion of the objects (e.g., an address block of a virtual disk). As will be understood by one skilled in the art, in this example, access history database 250 similarly includes items of information for all other nodes 320, which correspond to other object portions (e.g., remaining address blocks) identified by data access commands received from the two or more clients.

In some embodiments, processing the data access commands to update access history information comprises assigning (724) the object portions identified by the data access commands to tiers in accordance with the data acceleration policy and the access history information. FIG. 2, for example, shows tier assignment module 249 configured to assign (or reassign) object portions (e.g., address blocks) to tiers in accordance with the data acceleration policy and the access history information. For example, as discussed above with reference to FIGS. 3B and 6, tier assignment module 249 is configured to determine respective usage rates for the object portions (e.g., address blocks) identified by the data access commands. Tier assignment module 249 is further configured to assign the object portions to tiers based on the respective usage rates for the portions of the objects and the usage range indicators 348 assigned to the plurality of tiers.

In some embodiments, updating the access history information comprises (726): incrementing a usage history value for a respective portion of the objects assigned to a respective tier for which recent usage criteria are satisfied; and decrementing a usage history value for other portions of the objects assigned to the respective tier for which recent usage criteria are not satisfied. Recent usage criteria are satisfied when a respective portion of the objects (e.g., a respective address block) is identified in one or more current data access commands, or MRU marker 324 corresponding to the respective portion of the objects (e.g., the respective address block) indicates that the respective portion of the objects has been accessed within a predefined period of time, or, alternatively, less than a predefined number of data access commands have been received since the respective portion of the objects was last identified in any of the data access commands received by the server system. Recent usage criteria are not satisfied when a respective portion of the objects (e.g., a respective address block) is not identified in the one or more current data access commands and the respective portion of the objects has not been identified in any data access commands for a predefined period of time or, alternatively, a predefined number of data access commands have been received by the server system 110 since the respective portion of the objects was last identified in any data access commands.

As discussed above with respect to FIG. 3B, in some embodiments, tier assignment module 249 is configured to determine a “usage rate” or a usage history value for a respective portion of the objects (e.g., a respective address block) by, for example, combining the write operations count 318 for the respective portion of the objects with the read operations count 322 for the respective portion of the objects.
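By way of illustration only, computing a usage rate and mapping it to a tier can be sketched as follows. The function names and the representation of the usage range indicators as a sorted list of lower bounds are assumptions, as are the specific bounds in the usage example. The counts are combined by simple addition, which matches the "sum" example given earlier for the usage rate but is only one possible way of combining them.

```python
from bisect import bisect_right


def usage_rate(write_ops, read_ops):
    """Combine the two operations counts into one usage history value."""
    return write_ops + read_ops


def assign_tier(rate, tier_lower_bounds):
    """Map a usage rate to a tier index.

    tier_lower_bounds: ascending lower bounds, one per tier, standing in
        for usage range indicators 348. Tier i covers rates from
        tier_lower_bounds[i] up to, but not including, the next bound;
        the last tier is open-ended. The first bound is assumed to be 0.
    """
    return bisect_right(tier_lower_bounds, rate) - 1


bounds = [0, 10, 100, 1000]  # hypothetical usage ranges for four tiers
print(assign_tier(usage_rate(3, 4), bounds))      # 0
print(assign_tier(usage_rate(6, 4), bounds))      # 1
print(assign_tier(usage_rate(4000, 1000), bounds))  # 3
```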

In accordance with the access history information, the computer system automatically identifies and marks (728) for acceleration portions of the objects identified by the data access commands that satisfy an access based data acceleration policy, where the automatically identifying and marking are performed collectively for the two or more clients. FIG. 2, for example, shows acceleration determination module 248 configured to identify and mark for acceleration portions of the objects (e.g., address blocks or sub-blocks) identified by the data read command from the respective client satisfying an access based data acceleration policy, in accordance with the usage history information. The above discussion of FIG. 5A provides a detailed description of marking portions of the objects (e.g., address sub-blocks) for read acceleration, and the above discussion of FIG. 6 provides a detailed description of marking portions of the objects (e.g., address blocks) for write acceleration. In some embodiments, read acceleration is determined on a sub-block by sub-block basis, and write acceleration is determined on a block-by-block basis.

In some embodiments, identifying and marking for acceleration portions of the objects comprises identifying and marking (730) for acceleration portions of the objects in accordance with the tiers to which the portions of the objects have been assigned. The above discussion of FIG. 6 provides a detailed description of identifying and marking for write acceleration portions of the objects (e.g., address blocks) in accordance with the tiers to which the portions of the objects are assigned. For example, only the object portions (e.g., address blocks) having nodes 320 within tiers (see FIGS. 3A, 3B) that are at or above a “low-water mark” are identified by acceleration determination module 248 as being qualified for acceleration and are marked for acceleration.

The computer system accelerates (732) the object portions marked for acceleration, by accelerating data access, including either or both accelerating data writes and data reads of the object portions to and from the persistent cache. FIG. 2, for example, shows server system 110 configured to accelerate portions of objects (e.g., address blocks or sub-blocks) marked for acceleration by acceleration determination module 248. The above discussion of FIGS. 5A-5B and 6 provides a detailed description of read and write operations, respectively, performed after marking portions of the objects for acceleration.
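
The read and write paths for marked object portions can be sketched as follows. This Python sketch uses in-memory dictionaries as stand-ins for the persistent cache and secondary storage. The class and method names are hypothetical, and both the populate-on-miss read behavior and the invalidation of stale cached copies are assumptions rather than details taken from the description. The write path follows claim 22: a marked block is written to the cache and to secondary storage, and an unmarked block is written to secondary storage only.

```python
class AcceleratedStore:
    """Toy model of a persistent cache shared by all clients."""

    def __init__(self):
        self.cache = {}      # stands in for the shared persistent cache
        self.secondary = {}  # stands in for secondary storage
        self.marked = set()  # ids of blocks marked for acceleration

    def write(self, block_id, data):
        if block_id in self.marked:
            # Accelerated write: store the block in the cache.
            self.cache[block_id] = data
        else:
            # Assumption: drop any stale cached copy of an unmarked block.
            self.cache.pop(block_id, None)
        # Every write also reaches secondary storage.
        self.secondary[block_id] = data

    def read(self, block_id):
        if block_id in self.cache:
            return self.cache[block_id]  # served from the cache
        data = self.secondary[block_id]
        if block_id in self.marked:
            # Assumption: populate the cache on a miss for a marked block.
            self.cache[block_id] = data
        return data
```

Because the cache and the set of marked blocks belong to the store rather than to any one client, every client that reads or writes a given block sees the same acceleration state.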

Although the terms “first,” “second,” etc. have been used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first contact could be termed a second contact, and, similarly, a second contact could be termed a first contact, without changing the meaning of the description, so long as all occurrences of the “first contact” are renamed consistently and all occurrences of the “second contact” are renamed consistently. The first contact and the second contact are both contacts, but they are not the same contact.

The terminology used above is for the purpose of describing particular embodiments only and is not intended to be limiting of the claims. As used in the description of the embodiments and the appended claims, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.

The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit any claimed invention to the precise forms disclosed. Many modifications and variations are possible in view of the descriptions and examples provided above. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the various embodiments with various modifications as are appropriate to particular uses and requirements.

Claims

What is claimed is:

1. A method for accelerating data access, performed by a computer system having one or more processors, memory, a tiered data structure stored in the memory, the tiered data structure comprising a plurality of tiers, and a persistent cache for storing accelerated data, the method comprising:

receiving, at the computer system, data access commands from two or more clients to access data in objects identified by the data access commands;

processing, by the computer system, the data access commands to update access history information for respective object portions of the objects identified by the data access commands from the two or more clients, wherein the processing includes:

in accordance with the access history information, determining usage rates for each of the respective object portions identified by the data access commands, and

assigning each of the respective object portions identified by the data access commands to one of the plurality of tiers within the tiered data structure in accordance with a determination that a respective determined usage rate for a respective object portion is within a range of usage rates for a respective tier to which the respective object portion is assigned;

wherein each tier of the plurality of tiers within the tiered data structure stores information identifying object portions assigned to that tier;

automatically identifying and marking for acceleration at least some of the respective object portions identified by the data access commands from the two or more clients in accordance with respective tiers to which the at least some of the respective object portions have been assigned, wherein the automatically identifying and marking are performed collectively for the at least some of the respective object portions identified by the data access commands from the two or more clients; and

accelerating data access to the at least some of the respective object portions marked for acceleration, by writing the at least some respective object portions marked for acceleration to the persistent cache;

wherein the persistent cache is shared by the two or more clients.

2. The method of claim 1, wherein the persistent cache comprises non-volatile solid state storage.

3. The method of claim 1, wherein the two or more clients comprise virtual machines executed by the computer system.

4. The method of claim 1, wherein the objects comprise one or more virtual disks.

5. The method of claim 1, wherein the respective object portions comprise respective address blocks or sub-blocks of the objects.

6. The method of claim 5, wherein the access history information includes at least one usage history value for a respective address block and at least one distinct respective usage history value for each sub-block of the respective address block.

7. The method of claim 1, wherein the access history information includes at least one operations count for each respective object portion of a plurality of object portions identified by the data access commands.

8. The method of claim 1, wherein updating the access history information comprises:

incrementing a usage history value for a respective object portion assigned to a respective tier for which recent usage criteria are satisfied; and

decrementing a usage history value for other portions of objects assigned to the respective tier for which recent usage criteria are not satisfied.

9. A computer system, comprising:

one or more processors;

a persistent cache for storing accelerated data;

memory storing a tiered data structure, the tiered data structure comprising a plurality of tiers, and one or more programs for execution by the one or more processors, wherein the one or more programs include instructions that when executed by the one or more processors cause the computer system to:

receive, at the computer system, data access commands from two or more clients to access data in objects identified by the data access commands;

process, by the computer system, the data access commands to update access history information for respective object portions of the objects identified by the data access commands from the two or more clients, wherein the instructions for processing the data access commands include instructions that when executed by the one or more processors cause the computer system to:

in accordance with the access history information, determine usage rates for each of the respective object portions identified by the data access commands, and

assign each of the respective object portions identified by the data access commands to one of the plurality of tiers within the tiered data structure in accordance with a determination that a respective determined usage rate for a respective object portion is within a range of usage rates for a respective tier to which the respective object portion is assigned;

wherein each tier of the plurality of tiers within the tiered data structure stores information identifying object portions assigned to that tier;

automatically identify and mark for acceleration at least some of the respective object portions identified by the data access commands from the two or more clients in accordance with respective tiers to which the at least some of the respective object portions have been assigned, wherein the automatically identifying and marking are performed collectively for the at least some of the respective object portions identified by the data access commands from the two or more clients; and

accelerate data access to the at least some of the respective object portions marked for acceleration, by writing the at least some respective object portions marked for acceleration to the persistent cache;

wherein the persistent cache is shared by the two or more clients.

10. The computer system of claim 9, wherein the two or more clients comprise virtual machines executed by the computer system.

11. The computer system of claim 9, wherein the respective object portions comprise respective address blocks or sub-blocks of the objects.

12. The computer system of claim 11, wherein the access history information includes at least one usage history value for a respective address block and at least one distinct respective usage history value for each sub-block of the respective address block.

13. The computer system of claim 9, wherein the access history information includes at least one operations count for each respective object portion of a plurality of object portions identified by the data access commands.

14. A non-transitory computer readable medium storing one or more programs that when executed by one or more processors of a computer system cause the computer system to:

receive, at the computer system, data access commands from two or more clients to access data in objects identified by the data access commands;

process, by the computer system, the data access commands to update access history information for respective object portions of the objects identified by the data access commands from the two or more clients, wherein causing the computer system to process the data access commands includes causing the computer system to:

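in accordance with the access history information, determine usage rates for each of the respective object portions identified by the data access commands, and

assign each of the respective object portions identified by the data access commands to one of the plurality of tiers within the tiered data structure in accordance with a determination that a respective determined usage rate for a respective object portion is within a range of usage rates for a respective tier to which the respective object portion is assigned;

wherein each tier of the plurality of tiers within the tiered data structure stores information identifying object portions assigned to that tier;

automatically identify and mark for acceleration at least some of the respective object portions identified by the data access commands from the two or more clients in accordance with respective tiers to which the at least some of the respective object portions have been assigned, wherein the automatically identifying and marking are performed collectively for the at least some of the respective object portions identified by the data access commands from the two or more clients; and

accelerate data access to the at least some of the respective object portions marked for acceleration, by writing the at least some respective object portions marked for acceleration to the persistent cache;

wherein the persistent cache is shared by the two or more clients.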

15. The non-transitory computer readable medium of claim 14, wherein the two or more clients comprise virtual machines executed by the computer system.

16. The non-transitory computer readable medium of claim 14, wherein the portions of the objects comprise respective address blocks or sub-blocks of the objects.

17. The non-transitory computer readable medium of claim 16, wherein the access history information includes at least one usage history value for a respective address block and at least one distinct respective usage history value for each sub-block of the respective address block.

18. The non-transitory computer readable medium of claim 14, wherein the access history information includes at least one operations count for each respective object portion of a plurality of object portions identified by the data access commands.

19. The non-transitory computer readable medium of claim 14, wherein:

processing the data access commands to update access history information comprises assigning the portions of the objects identified by the data access commands to tiers in accordance with the data acceleration policy and the access history information; and

identifying and marking for acceleration portions of the objects comprises identifying and marking for acceleration portions of the objects in accordance with the tiers to which the portions of the objects have been assigned.

20. The method of claim 1, wherein automatically identifying and marking for acceleration at least some of the respective object portions identified by the data access commands further comprises determining a threshold tier in the tiered data structure, and marking for acceleration object portions assigned to the threshold tier and all tiers higher in the tiered data structure than the threshold tier.

21. The method of claim 20, wherein all object portions assigned to the threshold tier and all tiers higher in the tiered data structure than the threshold tier cumulatively comprise a total number of address blocks that is less than or equal to the storage capacity of the persistent cache.

22. The method of claim 1, further including:

processing a data write command to write an identified object portion, the data write command received from a respective client of the two or more clients, by:

automatically marking for acceleration the identified object portion if the identified object portion is assigned to a tier at or above a threshold tier of the plurality of tiers;

in accordance with a determination that the identified object portion is marked for acceleration, writing the identified object portion to the persistent cache and subsequently or concurrently writing the identified object portion to a secondary storage; and

in accordance with a determination that the identified object portion is not marked for acceleration, writing the identified object portion to the secondary storage.

23. The computer system of claim 9, wherein the instructions that when executed by the one or more processors cause the computer system to automatically identify and mark for acceleration portions of the objects identified by the data access commands further comprise instructions that when executed by the one or more processors cause the computer system to determine a threshold tier in the tiered data structure and mark for acceleration object portions assigned to the threshold tier and all tiers higher in the tiered data structure than the threshold tier.

24. The method of claim 1, wherein the at least some of the respective object portions marked for acceleration include only a predefined number of the respective object portions, and the predefined number is determined based on a storage capacity of the persistent cache.

25. The method of claim 1, wherein marking a respective object portion for acceleration includes updating information accessible via the tiered data structure to indicate that the respective object portion is marked for acceleration.

Resources