Patent application title:

Inference Models for Well Production with Limited Training Data

Publication number:

US20240263555A1

Publication date:
Application number:

18/432,689

Filed date:

2024-02-05

Smart Summary: Training data is collected that includes information about drilling wells, the geology of the area, the distance between wells, and their production output. A first model is created to predict the spacing between wells using the drilling and geology data. The accuracy of this prediction is measured by comparing it to the actual spacing data. Next, a second model predicts the production output of the wells based on the same drilling and geology information, with its accuracy also assessed. Finally, a third model is developed to predict any errors in production output by considering the spacing errors along with the initial drilling and geology data.

Abstract:

Example embodiments involve obtaining training data comprising a first set of features relating to drilling wells, a second set of features relating to geology at the well locations, a third set of features relating to spacings between wells, and a fourth set of features relating to production output of the wells; training a first model to predict the third set of features given the first and second set of features; determining a spacing error based on differences between the third set of features in the training data and as predicted; training a second model to predict the fourth set of features given the first and second set of features; determining production error based on differences between the fourth set of features in the training data and as predicted; and training a third model to predict the production error given the spacing error and the first and second set of features.

Inventors:

Applicant:

Classification:

E21B49/087 »  CPC main

Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells; Obtaining fluid samples or testing fluids, in boreholes or wells Well testing, e.g. testing for reservoir productivity or formation parameters

E21B2200/22 »  CPC further

Special features related to earth drilling for obtaining oil, gas or water Fuzzy logic, artificial intelligence, neural networks or the like

E21B49/08 IPC

Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells Obtaining fluid samples or testing fluids, in boreholes or wells

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. provisional patent application No. 63/443,913, filed Feb. 7, 2023, which is hereby incorporated by reference in its entirety.

BACKGROUND

Drilling new hydrocarbon wells is a resource-intensive and time-consuming endeavor. Consequently, it is desirable to be able to model, understand, and predict the performance characteristics of these wells before drilling takes place. In this way, well planning, drilling, and operational activities can be focused on wells that are likely to be higher-productivity, exhibit lower drilling costs, lower operational costs, and/or increased operational safety. Further, even after wells have been drilled, it is still desirable to be able to accurately model, understand, and predict the future performance of these wells. Existing models, including many conventional machine-learning models, are notoriously unable to infer accurate results about scenarios other than those on which they were trained.

SUMMARY

A first example embodiment may involve obtaining training data comprising a first set of features relating to drilling (i.e., completion design) of a set of wells, a second set of features relating to respective geological properties of locations at which the wells were drilled, a third set of features relating to spacing between the wells, and a fourth set of features relating to production output of the wells; training a first machine-learning model to predict the third set of features given the first set of features and the second set of features; determining a spacing error based on a difference between the third set of features appearing in the training data and as predicted; training a second machine-learning model to predict the fourth set of features given the first set of features and the second set of features; determining a production error based on a difference between the fourth set of features appearing in the training data and as predicted; and training a third machine-learning model to predict the production error given the spacing error, the first set of features, and the second set of features, wherein the third set of features is not considered by the third machine-learning model, and wherein the production error is usable to estimate a further production output of wells with spacings not represented in the third set of features.
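
The three training steps of the first example embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the data values are hypothetical, a single combined covariate stands in for the first and second sets of features, each "model" is a one-variable least-squares line, and the third model is given only the spacing error (the embodiment also supplies the first and second sets of features). The helper names `fit_line` and `predict` are illustrative.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def predict(model, xs):
    a, b = model
    return [a + b * x for x in xs]

# Hypothetical training data: one combined drilling/geology covariate per well.
covariate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# Spacing variation that the covariate does not explain (assumed values).
wiggle = [0.5, -0.5, 1.0, -1.0, 0.25, -0.25]
spacing = [0.5 * x + w for x, w in zip(covariate, wiggle)]
# Synthetic production: the assumed true effect of spacing is -3.0.
production = [2.0 * x - 3.0 * s for x, s in zip(covariate, spacing)]

# First model: predict spacing from the drilling/geology covariate.
spacing_model = fit_line(covariate, spacing)
spacing_error = [s - p for s, p in zip(spacing, predict(spacing_model, covariate))]

# Second model: predict production from the same covariate.
production_model = fit_line(covariate, production)
production_error = [y - p for y, p in zip(production, predict(production_model, covariate))]

# Third model: predict the production error from the spacing error.
error_model = fit_line(spacing_error, production_error)
spacing_effect = error_model[1]  # slope relating spacing error to production error
```

In this synthetic setup, the slope of the third model recovers the spacing effect that was built into the data, even though spacing and the covariate are correlated. Real embodiments may use tree-based or other models at each stage.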

A second example embodiment may involve obtaining, from a first pre-trained machine-learning model, a predicted production for a new well given instances of a first set of features and a second set of features as relating to the new well, wherein the first pre-trained machine-learning model was trained with the first set of features related to drilling of a set of wells and the second set of features related to respective geological properties of locations at which the wells were drilled, and wherein the predicted production represents a counterfactual spacing scenario specified during training of the first pre-trained machine-learning model; obtaining, from a second pre-trained machine-learning model, a predicted scaling factor that represents a difference between attested per-well spacings and counterfactual spacing values across the wells represented in the first set of features and the second set of features used to train the first pre-trained machine-learning model; and modifying the predicted production by the predicted scaling factor.

A third example embodiment may involve a non-transitory computer-readable medium, having stored thereon program instructions that, upon execution by a computing system, cause the computing system to perform operations in accordance with any previous example embodiment.

In a fourth example embodiment, a computing system may include at least one processor, as well as memory and program instructions. The program instructions may be stored in the memory, and upon execution by the at least one processor, cause the computing system to perform operations in accordance with any previous example embodiment.

In a fifth example embodiment, a system may include various means for carrying out each of the operations of any previous example embodiment.

These, as well as other embodiments, aspects, advantages, and alternatives, will become apparent to those of ordinary skill in the art by reading the following detailed description, with reference where appropriate to the accompanying drawings. Further, this summary and other descriptions and figures provided herein are intended to illustrate embodiments by way of example only and, as such, that numerous variations are possible. For instance, structural elements and process steps can be rearranged, combined, distributed, eliminated, or otherwise changed, while remaining within the scope of the embodiments as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a schematic drawing of a computing device, in accordance with example embodiments.

FIG. 2 illustrates a schematic drawing of a server device cluster, in accordance with example embodiments.

FIG. 3 is a graph depicting how previous models can be inaccurate, in accordance with example embodiments.

FIG. 4 is a graph depicting desired output, in accordance with example embodiments.

FIG. 5 depicts a causal graph, in accordance with example embodiments.

FIG. 6 depicts use of an ensemble of machine-learning models, in accordance with example embodiments.

FIG. 7 depicts an extrapolation technique, in accordance with example embodiments.

FIGS. 8 and 9 are flow charts, in accordance with example embodiments.

FIG. 10 is a flow chart, in accordance with example embodiments.

FIG. 11 is a flow chart, in accordance with example embodiments.

DETAILED DESCRIPTION

Example methods, devices, and systems are described herein. It should be understood that the words “example” and “exemplary” are used herein to mean “serving as an example, instance, or illustration.” Any embodiment or feature described herein as being an “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or features unless stated as such. Thus, other embodiments can be utilized and other changes can be made without departing from the scope of the subject matter presented herein.

Accordingly, the example embodiments described herein are not meant to be limiting. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations. For example, the separation of features into “client” and “server” components may occur in a number of ways.

Further, unless context suggests otherwise, the features illustrated in each of the figures may be used in combination with one another. Thus, the figures should be generally viewed as component aspects of one or more overall embodiments, with the understanding that not all illustrated features are necessary for each embodiment.

Additionally, any enumeration of elements, blocks, or steps in this specification or the claims is for purposes of clarity. Thus, such enumeration should not be interpreted to require or imply that these elements, blocks, or steps adhere to a particular arrangement or are carried out in a particular order.

I. Example Computing Devices and Cloud-Based Computing Environments

FIG. 1 is a simplified block diagram exemplifying a computing device 100, illustrating some of the components that could be included in a computing device arranged to operate in accordance with the embodiments herein. Computing device 100 could be a client device (e.g., a device actively operated by a user), a server device (e.g., a device that provides computational services to client devices), or some other type of computational platform. Some server devices may operate as client devices from time to time in order to perform particular operations, and some client devices may incorporate server features.

In this example, computing device 100 includes processor 102, memory 104, network interface 106, and input/output unit 108, all of which may be coupled by system bus 110 or a similar mechanism. In some embodiments, computing device 100 may include other components and/or peripheral devices (e.g., detachable storage, printers, and so on).

Processor 102 may be one or more of any type of computer processing element, such as a central processing unit (CPU), a co-processor (e.g., a mathematics, graphics, or encryption co-processor), a digital signal processor (DSP), a network processor, and/or a form of integrated circuit or controller that performs processor operations. In some cases, processor 102 may be one or more single-core processors. In other cases, processor 102 may be one or more multi-core processors with multiple independent processing units. Processor 102 may also include register memory for temporarily storing instructions being executed and related data, as well as cache memory for temporarily storing recently-used instructions and data.

Memory 104 may be any form of computer-usable memory, including but not limited to random access memory (RAM), read-only memory (ROM), and non-volatile memory (e.g., flash memory, hard disk drives, solid state drives, compact discs (CDs), digital video discs (DVDs), and/or tape storage). Thus, memory 104 represents both main memory units, as well as long-term storage. Other types of memory may include biological memory.

Memory 104 may store program instructions and/or data on which program instructions may operate. By way of example, memory 104 may store these program instructions on a non-transitory, computer-readable medium, such that the instructions are executable by processor 102 to carry out any of the methods, processes, or operations disclosed in this specification or the accompanying drawings.

As shown in FIG. 1, memory 104 may include firmware 104A, kernel 104B, and/or applications 104C. Firmware 104A may be program code used to boot or otherwise initiate some or all of computing device 100. Kernel 104B may be an operating system, including modules for memory management, scheduling and management of processes, input/output, and communication. Kernel 104B may also include device drivers that allow the operating system to communicate with the hardware modules (e.g., memory units, networking interfaces, ports, and buses) of computing device 100. Applications 104C may be one or more user-space software programs, such as web browsers or email clients, as well as any software libraries used by these programs. Memory 104 may also store data used by these and other programs and applications.

Network interface 106 may take the form of one or more wireline interfaces, such as Ethernet (e.g., Fast Ethernet, Gigabit Ethernet, and so on). Network interface 106 may also support communication over one or more non-Ethernet media, such as coaxial cables or power lines, or over wide-area media, such as Synchronous Optical Networking (SONET) or digital subscriber line (DSL) technologies. Network interface 106 may additionally take the form of one or more wireless interfaces, such as IEEE 802.11 (Wifi), BLUETOOTH®, global positioning system (GPS), or a wide-area wireless interface. However, other forms of physical layer interfaces and other types of standard or proprietary communication protocols may be used over network interface 106. Furthermore, network interface 106 may comprise multiple physical interfaces. For instance, some embodiments of computing device 100 may include Ethernet, BLUETOOTH®, and Wifi interfaces.

Input/output unit 108 may facilitate user and peripheral device interaction with computing device 100. Input/output unit 108 may include one or more types of input devices, such as a keyboard, a mouse, a touch screen, and so on. Similarly, input/output unit 108 may include one or more types of output devices, such as a screen, monitor, printer, and/or one or more light emitting diodes (LEDs). Additionally or alternatively, computing device 100 may communicate with other devices using a universal serial bus (USB) or high-definition multimedia interface (HDMI) port interface, for example.

In some embodiments, one or more computing devices like computing device 100 may be deployed. The exact physical location, connectivity, and configuration of these computing devices may be unknown and/or unimportant to client devices. Accordingly, the computing devices may be referred to as “cloud-based” devices that may be housed at various remote data center locations.

FIG. 2 depicts a cloud-based server cluster 200 in accordance with example embodiments. In FIG. 2, operations of a computing device (e.g., computing device 100) may be distributed between server devices 202, data storage 204, and routers 206, all of which may be connected by local cluster network 208. The number of server devices 202, data storages 204, and routers 206 in server cluster 200 may depend on the computing task(s) and/or applications assigned to server cluster 200.

For example, server devices 202 can be configured to perform various computing tasks of computing device 100. Thus, computing tasks can be distributed among one or more of server devices 202. To the extent that these computing tasks can be performed in parallel, such a distribution of tasks may reduce the total time to complete these tasks and return a result. For purposes of simplicity, both server cluster 200 and individual server devices 202 may be referred to as a “server device.” This nomenclature should be understood to imply that one or more distinct server devices, data storage devices, and cluster routers may be involved in server device operations.

Data storage 204 may be data storage arrays that include drive array controllers configured to manage read and write access to groups of hard disk drives and/or solid state drives. The drive array controllers, alone or in conjunction with server devices 202, may also be configured to manage backup or redundant copies of the data stored in data storage 204 to protect against drive failures or other types of failures that prevent one or more of server devices 202 from accessing units of data storage 204. Other types of memory aside from drives may be used.

Routers 206 may include networking equipment configured to provide internal and external communications for server cluster 200. For example, routers 206 may include one or more packet-switching and/or routing devices (including switches and/or gateways) configured to provide (i) network communications between server devices 202 and data storage 204 via local cluster network 208, and/or (ii) network communications between server cluster 200 and other devices via communication link 210 to network 212.

Additionally, the configuration of routers 206 can be based at least in part on the data communication requirements of server devices 202 and data storage 204, the latency and throughput of the local cluster network 208, the latency, throughput, and cost of communication link 210, and/or other factors that may contribute to the cost, speed, fault-tolerance, resiliency, efficiency, and/or other design goals of the system architecture.

As a possible example, data storage 204 may include any form of database, such as a structured query language (SQL) database. Various types of data structures may store the information in such a database, including but not limited to tables, arrays, lists, trees, and tuples. Furthermore, any databases in data storage 204 may be monolithic or distributed across multiple physical devices.

Server devices 202 may be configured to transmit data to and receive data from data storage 204. This transmission and retrieval may take the form of SQL queries or other types of database queries, and the output of such queries, respectively. Additional text, images, video, and/or audio may be included as well. Furthermore, server devices 202 may organize the received data into web page or web application representations. Such a representation may take the form of a markup language, such as HTML, the eXtensible Markup Language (XML), or some other standardized or proprietary format. Moreover, server devices 202 may have the capability of executing various types of computerized scripting languages, such as but not limited to Perl, Python, PHP Hypertext Preprocessor (PHP), Active Server Pages (ASP), JAVASCRIPT®, and so on. Computer program code written in these languages may facilitate the providing of web pages to client devices, as well as client device interaction with the web pages. Alternatively or additionally, JAVA® may be used to facilitate generation of web pages and/or to provide web application functionality.

II. Example Well Drilling Overview

Sites are selected for drilling via a well planning process typically conducted by teams of geoscience and/or petroleum science experts. These teams may evaluate quantities of data, drill test wells to learn about the nature of the underlying geology, and/or attempt to make determinations about the performance potential of each well drilled. To optimize efficiency of drilling activities, multiple drilling operations in the same area are often contemplated so that they can be coordinated and executed in parallel or in subsequent operations.

A well may be created by drilling a hole into the earth with a drilling rig. The hole may be 5 to 50 inches in diameter, but narrower or wider holes may be drilled. The hole might not be drilled straight down. For many wells, holes are drilled laterally and/or horizontally.

To drill the hole, a drilling rig rotates a drill pipe with a bit attached. As or after the hole is drilled, sections of steel casings may be placed into the hole. Concrete may also be placed between the outside of the casing and the hole. The casing and concrete provide structural stability for the well.

With the hole protected by the casing and concrete, the well may be drilled deeper with a smaller bit, possibly within a smaller casing. This process may repeat a number of times, with the hole being drilled by progressively smaller bits inside of narrower casings.

The rotary drill bit, possibly aided by the weight of drill collars above it, either slices into the rock or otherwise breaks the rock down into smaller pieces. Drilling fluid, sometimes called “mud,” may also be pumped into the drill pipe to cool the drill bit as well as to normalize the pressure differential between the hole and the surrounding rock. The rock pieces cut by the bit may be brought to the surface outside of the drill pipe.

After the well is drilled and cased, it may be completed so that it can produce and recover hydrocarbons. Small holes, sometimes referred to as “perforations,” may be made in the casing at a depth at which the oil reservoir exists. These provide a path for the hydrocarbons to flow from the surrounding rock, up the well shaft, and to the surface. In some wells, the pressure of the hydrocarbon reservoir is high enough for the hydrocarbons to flow to the surface. However, if this is not the case, artificial lift methods can be used to force hydrocarbons out of the ground.

In some cases, fissures or fractures in formations of hard rock surrounding the well can be created by hydraulically injecting water, sand, and/or chemicals into the rock. When the hydraulic pressure is removed from the well, small grains of proppant (e.g., sand or aluminum oxide) may be used to hold the fissures open. This process is known as “stimulation” and may take place in multiple stages. In this way, hydrocarbons that otherwise would be inaccessible might become available for production.

III. Example Well Production Modeling

A type of machine-learning model known as a decision tree can be used as the basis for modeling the performance of planned wells before they are drilled. Doing so gives engineers an empirically-based estimate of how productive the well would be based on its characteristics and the characteristics of the surrounding substrate.

Input features to such a model may include type of geological formation where the well is to be drilled, brittleness of the rock, porosity of the rock, number of stimulation stages, fluid used during drilling, pressures applied during drilling, proppant type, proppant amount, completion type, longitude, latitude, elevation above sea level, drilling equipment type, and distances to nearest wells, for example. Output features could include well productivity in barrels over n days (where n could be 30, 60, 90, 180, etc.), in addition to measures of cost, safety, and well lifetime. Other input and output features may be used.

In any event, a decision tree uses a form of feature selection to determine which subsets of input features are most influential over each of the output features, and the nature of this influence. Such a decision tree maps the values of input features to values of output characteristics using a tree-like structure. Branching points can be found in a greedy fashion based on the entropy or Gini index of the training data. Branches that are most likely to direct the traversal toward relevant features (features that have more impact on one or more output features) are placed higher in the tree. In practical embodiments, the depth, number of splits per node, or total number of leaf nodes may be constrained so that each tree is more tractable. Decision trees can be constructed in an iterative or recursive fashion.
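
As a rough illustration of how a branching point can be found greedily using the Gini index, the following sketch evaluates every candidate threshold for every feature and keeps the one with the lowest weighted impurity. The feature names, values, and the "high"/"low" production labels are hypothetical, and the functions `gini` and `best_split` are illustrative helpers rather than part of any particular library.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels):
    """Greedy search over all features and midpoints between observed values.

    Returns (feature, threshold, weighted_gini) for the best single split.
    """
    best = None
    n = len(rows)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best

# Hypothetical wells: two input features and a coarse production class.
rows = [
    {"porosity": 0.05, "stages": 10},
    {"porosity": 0.07, "stages": 30},
    {"porosity": 0.12, "stages": 20},
    {"porosity": 0.15, "stages": 40},
]
labels = ["low", "low", "high", "high"]

best = best_split(rows, labels)
```

Here the split on porosity separates the two classes perfectly, so it would be placed at the root; a full tree builder would apply the same search recursively to each side, subject to depth or leaf-count constraints.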

Using randomization or by varying features, multiple decision trees may be generated for a given data set (e.g., a random forest or gradient-boosting model). As an example, results from subsets of the decision trees may be added together so that a loss function is minimized. After calculating the loss function for a given subset of trees, a gradient descent procedure is invoked to add a new tree to the model that reduces the loss (i.e., follows the gradient). This can be accomplished by parameterizing the new tree, then modifying the features of the new tree and moving in the right direction by reducing the residual loss.
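
The additive procedure described above can be sketched with single-split trees ("stumps") and a squared-error loss, for which the negative gradient is simply the residual between the actual and predicted values. This is a simplified illustration with hypothetical data; the learning rate, number of rounds, and helper names are assumptions, and production-grade gradient-boosting libraries differ in many details.

```python
from statistics import mean

def fit_stump(xs, targets):
    """Best single-split regression stump on one feature, chosen by squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Add one stump per round, each fit to the current residuals."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps, losses = [], []
    for _ in range(rounds):
        # For the loss 0.5 * (y - p)^2, the negative gradient is the residual y - p.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for x, p in zip(xs, preds)]
        losses.append(mean((y - p) ** 2 for y, p in zip(ys, preds)))
    return base, stumps, losses

# Hypothetical data: proppant per foot versus production.
proppant = [1000, 1500, 2000, 2500, 3000]
production = [50.0, 80.0, 95.0, 100.0, 102.0]
base, stumps, losses = boost(proppant, production)
```

Each round reduces the mean squared error on the training data, which is the sense in which the new tree "follows the gradient."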

Variations of these techniques may be used. Thus the embodiments herein may employ one or more of decision trees, random forests, gradient-boosted trees, extremely randomized trees, support vector machines, and neural networks. Regardless of the analytical technique employed, the goal of the analysis is to determine which input features have the most significant impact on the well output features. In order to simplify the analysis, the number of features selected should be smaller than the number of total features, perhaps one-quarter to one-tenth as many.

A separate analysis and feature selection step may be performed for each output feature. Thus, for example, if the output features of interest are well productivity and well cost, feature selection may be performed twice, and different sets of input features could be selected each time.

Thus, a model of well output features may be developed based on the input features selected. This model may predict one or more output features of a proposed well site, based on the values of the input features determined to be most influential on these output features.

The model(s) may be validated prior to general deployment. One way of doing so is to test a model against the output features of an actual well site that was not used to develop the model. For instance, in a given oil field of 100 wells, 75 may be randomly chosen as training data for the model. Once the model is developed, it may be tested against the remaining 25 wells.

For a given output feature, the value predicted by the model may be compared to the actual value exhibited by the well. The difference between these values is considered error, and may be expressed as an absolute error. Over the remaining 25 wells, the aggregate error may be characterized as the total absolute error, mean squared error, root mean squared error, median error, or in some other fashion. If the aggregate error is sufficiently small (e.g., less than a particular value or less than the aggregate error produced by another model), the modeling may be considered a success, and the model may be considered validated.
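
The hold-out validation and aggregate error measures described above can be sketched as follows. The well identifiers and production values are hypothetical, the function names are illustrative, and the median is taken over absolute errors as one possible reading of "median error."

```python
import math
import random
from statistics import mean, median

def split_wells(wells, train_fraction=0.75, seed=0):
    """Randomly partition wells into training and held-out test sets."""
    shuffled = wells[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def aggregate_errors(actual, predicted):
    """Summarize per-well absolute errors over the held-out wells."""
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mse = mean(e ** 2 for e in abs_errors)
    return {
        "total_absolute_error": sum(abs_errors),
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "median_absolute_error": median(abs_errors),
    }

# An oil field of 100 wells, identified here by hypothetical integer IDs.
wells = list(range(100))
train_wells, test_wells = split_wells(wells)

# Hypothetical actual versus predicted production for three held-out wells.
errors = aggregate_errors([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
```

A model could then be considered validated if, for example, `errors["root_mean_squared_error"]` falls below a chosen threshold or below the value produced by a competing model.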

A validated model may be applied to proposed well sites. Application of the validated model may result in predicted well output features for each proposed site. Thus, the model may help determine where and how to drill wells on this land. In some cases, visualization tools (in the form of interactive tables, graphs, and charts displayed on a graphical user interface) may be used in order to allow a user to change the values of one or more input features for a proposed well site, and see how these changes impact well output features.

Various types of tree-based modeling of well characteristics are disclosed in U.S. Pat. No. 10,138,717 and U.S. Patent Application Pub. No. 2022/0083873. Both of these patent documents are incorporated by reference in their entirety herein.

IV. Limitations of Current Models

It is generally understood that the quality of machine-learning model predictions is highly dependent on the extent, type, and scope of the data used to train the model. In situations where such a trained model is used to predict outcomes of scenarios for which there was little or no similar training data, these models often provide predictions with limited or low accuracy and reliability. Furthermore, when there are confounding factors, such as input features that are correlated, these models may be unable to distinguish or characterize the impact of each input feature on a given output feature.

In the context of models that predict well production, it has been noted above that there are many input features that can potentially influence this output feature. One of the input features is how close one or more other wells are to a well that is proposed for drilling. As a simple example, the closer that two wells are spaced together, the more likely that they will be competing for the same underground hydrocarbon resources.

In practice, current modeling techniques, such as those described above, tend to have poor sensitivity to well spacing. Here, well spacing includes horizontal spacing (e.g., x and y distances between wells on a plane representing a surface) in addition to vertical spacing (e.g., z distances between wells in three-dimensional space). In some cases, well spacing may refer to number of wells per a geographical location, distance to a parent well, and so on. Thus, well spacing may consist of a vector of one or more features.

Most of today's well locations (e.g., basins) are already populated with wells, so much of current drilling efforts focus on placing new wells between or nearby existing wells. Given the importance of spacing-related input parameters on production, it is desirable to develop models that produce more accurate results when presented with various types of well spacing distances.

FIG. 3 provides an example of where previous modeling techniques can be inadequate. In FIG. 3, graph 300 plots two input features—proppant per foot on the x axis and wells per section on the y axis. Here, a “section” is a defined geographical location with specified geometrical dimensions used for drilling, and wells per section is a representation of well spacing. For instance, the y axis may be in terms of existing wells within a radius of one-half mile (or some other distance) from the proposed well. In a decision-tree-based model, both of these input features can be highly influential on well production. However, when one is selected for placement higher in the tree than the other, the model may lose some degree of sensitivity that the lower-placed feature has on well production.

More formally, current models are discriminative in that all input features combined (X) are assumed to have some overall relationship with the expected value of the output feature y. At a high level, these models are based on the relationship E(y|X), where E is an expected value function, X is a vector of input features, and y is the output feature of interest.

In the example shown in FIG. 3, the training data is clustered between proppant per foot values of 1250-2750 and wells per section values of 0-20. Forecasts for wells with values of proppant per foot and/or wells per section that fall outside of these ranges get pulled toward the center of the next closest leaf. Thus, if a decision tree splits on proppant before spacing, the prediction will move toward a center point or mean value along the spacing axis. This means that the model's sensitivity to spacing is lost—the two planned wells 302 and 304 represented by circles at the top left of graph 300 will not respond much to changes in spacing, even though in reality spacing can be highly influential on well production. For instance, planned well 302 may indicate little or no change in production when well spacing is changed from 30 to 50, even though a significant impact might be expected or observed in practice.
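
The loss of sensitivity outside the training range can be reproduced with a toy regression tree. The sketch below is a simplification under stated assumptions: it uses a single feature (wells per section) with hypothetical production values, whereas the scenario of FIG. 3 involves two features, and the helpers `build_tree` and `predict` are illustrative only.

```python
from statistics import mean

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(points, depth=2):
    """Tiny regression tree on one feature; each leaf stores the mean target."""
    xs = sorted({x for x, _ in points})
    if depth == 0 or len(xs) < 2:
        return mean(y for _, y in points)

    def parts(threshold):
        return ([p for p in points if p[0] <= threshold],
                [p for p in points if p[0] > threshold])

    thresholds = [(lo + hi) / 2 for lo, hi in zip(xs, xs[1:])]
    best = min(thresholds,
               key=lambda t: sum(sse([y for _, y in part]) for part in parts(t)))
    left, right = parts(best)
    return (best, build_tree(left, depth - 1), build_tree(right, depth - 1))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's mean."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

# Hypothetical training data: wells per section (0-20) versus production.
training = [(0, 100.0), (4, 96.0), (8, 90.0), (12, 82.0), (16, 72.0), (20, 60.0)]
tree = build_tree(training)

at_30 = predict(tree, 30)  # outside the training range
at_50 = predict(tree, 50)  # even further outside
```

Because every threshold lies within the training range, both out-of-range queries fall into the same rightmost leaf and receive the same prediction, mirroring the insensitivity described for planned well 302.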

In this particular scenario of interest, one of the reasons why there is a deleterious impact on modeling is that the amount of proppant used is based on environmental and/or geological characteristics, such as rock porosity, that also influence well spacing. The amount of proppant used also may be related to well spacing. A well with few or no close neighbors may be provided with a large amount of proppant to fracture the rock around it, knowing that this will have a minimal impact on the neighboring wells. On the other hand, if well spacing is tight, the amount of proppant used may be reduced to maximize the overall output across all wells in the section and to avoid damaging nearby wells. Thus, the input features of proppant and spacing may be related to one another in some way (correlated), or may both be dependent on the same underlying variables.

V. Improved Models

The embodiments herein overcome these and potentially other drawbacks to the prior art. The effects of changes to input features can be thought of as “treatments” being applied to hypothetical wells. Thus, to disentangle the impact of well spacing on production from other input features, spacing is modeled separately from the other features. Put more formally, the goal is to determine E(y1-y0|T), where y1 is what really happened, y0 is what might have happened, and T is the treatment (intervention). This is done in a way that allows determination of the impact of the treatment when one cannot randomly assign treatments to a trial group and non-treatment to a control group. Thus, this technique can be used to address the well production problem above where completely randomized data for all combinations of input features is not available.

Specifically, the treatment considers a number of input features relating to well spacing (i.e., T may be considered a vector of values specifying or indicative of well spacing). The goal is to determine the change in well production conditioned on these input features.

FIG. 4 provides an example of the type of output that is desired. For graph 400, the x axis represents additional wells per section that are being proposed, and the y axis represents a multiplicative factor for how much well production is expected to change. Note that the x axis begins below 0 because this example assumes a starting point of four wells per section.

Graph 400 shows that if the number of additional wells is small (e.g., 1-2), then the overall production across all wells goes up by about 2-5%, whereas if the number of additional wells is large (e.g., 15-20), then the overall production across all wells drops by about 12-15%. Notably, even if the production of each well decreases, adding more wells to the section can ultimately increase the sum of production across all wells in the section.
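The last point is simple arithmetic, and a short sketch makes it concrete. The per-well base production and the per-well factor below are hypothetical values chosen for illustration only; they are not read from graph 400.

```python
def section_total(n_wells, per_well_base, per_well_factor):
    """Total section production when each well yields base * factor."""
    return n_wells * per_well_base * per_well_factor

# Hypothetical numbers: four wells at full per-well output, versus five wells
# that each produce 17% less because of tighter spacing.
baseline = section_total(4, 100.0, 1.00)
denser = section_total(5, 100.0, 0.83)

change = denser / baseline - 1.0
print(f"section total changes by {change:.2%}")
```

Under these assumed numbers, each well produces less, yet the section as a whole produces more because the additional well more than offsets the per-well decline.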

To achieve these goals, a simplified causal model of well production has been developed. FIG. 5 depicts this model as a causal graph. Simply put, completions (which incorporates proppant loading among other features) and geology (which incorporates rock porosity among other features) both influence well spacing, as indicated by the arrows. Also as indicated by the arrows, completions, geology, and well spacing all influence well production.

This model (or, more accurately, ensemble of models) may incorporate a technique sometimes referred to as “double machine learning,” and is defined visually and mathematically in FIG. 6. Double machine learning attempts to control for confounding variables in a way that does not require full model specification or parametric relationships between any of the variables. The basic idea is to learn a relationship from residualized treatment (T) values to residualized target (Y) values.

As shown in equation 600, one model is used to determine E(T|X), where X is a vector of the input features of interest (e.g., completion features, geology features, and so on) deemed to influence a treatment, where T is treatment (e.g., a vector of well spacing features). The goal of this model is to predict what treatment was applied given the input features. In other words, equation 600 is an attempt to predict well spacing features given other relevant input features of those wells. Doing so may involve training a supervised machine-learning model (e.g., decision trees or random forest models) based on the non-spacing features as input and the well spacing features as output. Notably, well production may be omitted from any model based on equation 600. The model based on equation 600 may produce a prediction of well spacing T̂. But since the true values of the well spacing features, T, are known from the training data, the difference (error) can be calculated as T̃=T̂−T.

As shown in equation 602, another model is used to determine E(Y|X), where X is a vector of the input features of interest omitting the spacing features T, and where Y is well production. The goal of this model is to predict well production given just the non-spacing input features. Doing so may involve training a supervised machine-learning model (e.g., decision trees or random forest models) based on the non-spacing features as input and the well production features as output. The model based on equation 602 may produce a prediction of well production Ŷ. But since the true values of the well production features, Y, are known from the training data, the difference (error) can be calculated as Ỹ=Ŷ−Y.

As shown in equation 604, yet another model is used to determine E(Ỹ|T̃, X). The goal of this model is to predict the well production error based on the spacing prediction error and the non-spacing features. Doing so may involve training a supervised machine-learning model based on the well spacing error and non-spacing features as input and the well production error as output. The model based on equation 604 can produce a prediction of the change in well production given a change in spacing to a spacing layout that is unusual for the set of non-spacing input features X. This model may be based on decision trees, random forests, or generalized random forests (GRFs) for example. Notably, GRFs provide non-parametric estimation of heterogeneous treatment effects, as well as least-squares regression, quantile regression, and survival regression, all with support for missing covariates.
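The three stages of equations 600, 602, and 604 can be sketched end to end on synthetic data. This is a minimal illustration under stated assumptions, not the disclosed implementation: ordinary least squares is swapped in for the tree-based models, a single hypothetical confounding feature stands in for X, the final stage fits one constant effect rather than conditioning on X, and the sample splitting commonly used with double machine learning is omitted. All names and numeric values are hypothetical.

```python
import random

def ols_1d(x, y):
    """Least-squares intercept and slope for y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

random.seed(0)
n = 2000
TRUE_EFFECT = -0.5  # hypothetical effect of spacing on (log) production

# X: one confounding feature (e.g., a geology score) driving both T and Y.
X = [random.gauss(0, 1) for _ in range(n)]
T = [2 * x + random.gauss(0, 1) for x in X]
Y = [3 * x + TRUE_EFFECT * t + random.gauss(0, 0.1) for x, t in zip(X, T)]

# Naive fit of Y on T alone mixes the confounder into the spacing effect.
_, naive_effect = ols_1d(T, Y)

# Equation 600 analogue: predict T from X, residual = predicted - actual.
a_t, b_t = ols_1d(X, T)
T_res = [(a_t + b_t * x) - t for x, t in zip(X, T)]

# Equation 602 analogue: predict Y from X (no spacing), same residual sign.
a_y, b_y = ols_1d(X, Y)
Y_res = [(a_y + b_y * x) - y for x, y in zip(X, Y)]

# Equation 604 analogue: regress production residual on spacing residual.
_, dml_effect = ols_1d(T_res, Y_res)

print(f"naive: {naive_effect:.2f}  residualized: {dml_effect:.2f}")
```

In this synthetic setup the naive slope has the wrong sign because the confounder raises both spacing and production, while the residual-on-residual slope recovers a value close to the assumed effect. Because both residuals use the same predicted-minus-actual sign convention, the estimated slope is unaffected by that choice.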

Advantageously, Ỹ can be used to infer production of wells with spacings that do not appear in the training data (that are “off the manifold” of the training data). In principle, this technique controls for correlations between well spacing and other input features and therefore provides a de-biased estimate of Ỹ. Thus, the model learns how much of a change in Y it should expect from unusual values of T, conditioned on other features in the model. Y can be a single outcome, like the EUR of the well, or a multivalued outcome, like a timeseries of production. In the latter case, the moment between the residualized values will also be smoothed across the outcomes.

As shown in equation 606, these models can be used to produce:

σ = E(Y|T1, X) − E(Y|T0, X)

Here, σ is an estimate of the difference in well production between two scenarios (e.g., well spacing features T0 may be actual spacing features from the training data while well spacing features T1 do not appear in the training data). In order to predict well production based on this difference, any form of linear or non-linear extrapolation can be used. FIG. 7 depicts one such possibility.

The relationship between equations 604 and 606 can be described as follows. In equation 604, an estimator (e.g., using GRFs) learns a locally smooth moment between the residualized treatment and residualized production. Thus, the model learns how much of a change in Y it should expect from unusual values of T, conditioned on other features in the model. In some cases, the GRF moment estimation may include multiple points in time to facilitate smoother forecasts.

In equation 606, the model trained in equation 604 is used to generate a prediction between the historic spacing layout of wells in the training data, and a counterfactual scenario. The treatment counterfactual employs the median spacing features contained in the training data. So, for each well in the training data, the magnitude of the treatment effect on production incurred by moving the well from its real spacing to the median spacing in the dataset is predicted. This delta becomes the scaling factor described below. In some embodiments, other measures of central tendency could be used instead of a median, such as a mean, weighted mean, mode, and so on. Other statistics may be used in place of a median as well.

In FIG. 7, equation 700 relates to equation 600, equation 702 relates to equation 602, and equation 704 relates to equation 604. However, various values of Y may be log transformed for convenience and so that σ represents a multiplicative scale factor. Also in FIG. 7, equation 706 provides a form of extrapolation from σ to expected well production values. In other words, E(Ye^(−σ)|X) multiplies the historical well production data Y by the scaling factor. The counterfactual being implicitly considered is what production would be if the well had the median (or other) spacing in the training data.
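The role of the log transform can be illustrated with a short sketch. It assumes that σ is a difference of log production between a well's actual spacing and the median spacing, so that exp(−σ) acts as a multiplicative factor on observed production. The function name and numeric values are hypothetical.

```python
import math

def counterfactual_production(actual, sigma):
    """Rescale observed production by exp(-sigma), following the form Ye^(-sigma)."""
    return actual * math.exp(-sigma)

# Hypothetical well: at its actual spacing it produced 120 units, and the
# spacing-effect model attributes a 20% uplift to that spacing relative to
# the median spacing, i.e. sigma = log(1.2) on the log scale.
actual = 120.0
sigma = math.log(1.2)

at_median_spacing = counterfactual_production(actual, sigma)
print(round(at_median_spacing, 6))
```

Working on the log scale turns an additive difference σ into a ratio, which is why the same quantity can be applied as a scaling factor to production values.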

At inference time, input features for a proposed well are provided to the model as shown in FIG. 7, and this model produces a scaling factor from the median (or other) spacing to the well that is being forecasted. The scaling factor is then used with a pre-existing decision tree or random forest model of well production to scale that model's prediction of well production. As a consequence, production for wells with unusual well spacings not in the training data can be predicted more accurately. Going back to the causal graph of FIG. 5, the techniques herein provide an estimate of the influence of well spacing on well production, as represented by the arrow from the spacing circle to the production circle. In other words, production can be predicted just from completions and geology because the spacing is implicitly incorporated by the scaling factor.

Another view of the overall technique can be found in FIG. 8. Two separate models are used to generate residuals. Residualized spacing values are from a model that knows about geology and completions, while residualized hydrocarbon production is from a model that does not know about spacing. Then, a generalized random forest is fit as an estimator on the residualized spacing and the residualized production, conditioned on geology and spacing. This lets the model learn different spacing effects. The model's output can be a per-producing-day, spacing-based, scaling factor. The scaling factor is a difference between the attested per-well spacing values and the median (or other) spacing values across all wells in the training data.

FIG. 9 depicts how features are provided to various models during training and inference. Training may involve dividing the historical production by the scaling factor to get counterfactual production (e.g., this represents what the generalized random forest model predicts that the well would have produced if it had perfectly median (or other) spacing). Then an ExtraTrees model is trained to learn counterfactual production from completions and geology (but not spacing). At inference time, a forecast of production for each planned well at median (or other) spacing is generated using the ExtraTrees model, and a forecast of the scaling factor from the generalized random forest model is generated using all features. The results are multiplied together to obtain the forecast at the desired spacing layout.
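The divide-then-multiply flow described above can be sketched with trivial stand-ins for the two models. In the sketch below, the mean of the counterfactual values stands in for the ExtraTrees model and a lookup table stands in for the generalized random forest; both stand-ins, and every number shown, are hypothetical.

```python
from statistics import mean

# Hypothetical training rows: (historical production, scaling factor assigned
# by the spacing-effect model). A factor of 1.0 corresponds to median spacing.
history = [(90.0, 0.9), (110.0, 1.1), (100.0, 1.0)]

# Training: divide out the spacing effect to obtain counterfactual production.
counterfactual = [production / scale for production, scale in history]

def base_forecast(features):
    """Stand-in for the ExtraTrees model: ignores features, returns the mean."""
    return mean(counterfactual)

# Stand-in for the generalized random forest: scaling factor keyed on a
# hypothetical wells-per-section value.
SCALE_BY_WELLS_PER_SECTION = {4: 1.0, 8: 0.8}

def scale_forecast(features):
    """Look up the assumed scaling factor for the planned spacing layout."""
    return SCALE_BY_WELLS_PER_SECTION[features["wells_per_section"]]

# Inference: forecast at median spacing, then rescale to the planned layout.
planned_well = {"wells_per_section": 8}
forecast = base_forecast(planned_well) * scale_forecast(planned_well)
print(round(forecast, 6))
```

Separating the two steps means the production model never sees spacing directly; the spacing sensitivity enters only through the multiplied scaling factor.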

Notably, such an ExtraTrees model refers to extremely randomized trees, a machine-learning technique involving use of a large number of unpruned decision trees. Predictions are made by averaging the predictions of each of the decision trees in the case of regression or using majority voting in the case of classification.
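The two aggregation rules mentioned above can be shown in a few lines. This sketch covers only how per-tree outputs are combined, not how extremely randomized trees are grown; the function names and sample outputs are hypothetical.

```python
from collections import Counter
from statistics import mean

def aggregate_regression(tree_outputs):
    """Average the numeric predictions of the individual trees."""
    return mean(tree_outputs)

def aggregate_classification(tree_votes):
    """Return the most common label; ties go to the label seen first."""
    return Counter(tree_votes).most_common(1)[0][0]

# Hypothetical outputs from three trees for one well.
regression_result = aggregate_regression([98.0, 102.0, 100.0])
classification_result = aggregate_classification(["high", "low", "high"])
print(regression_result, classification_result)
```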

VI. Example Operations

FIG. 10 is a flow chart illustrating an example embodiment relating to training of machine-learning models. The process illustrated by FIG. 10 may be carried out by a computing device, such as computing device 100, and/or a cluster of computing devices, such as server cluster 200. However, the process can be carried out by other types of devices or device subsystems.

The embodiments of FIG. 10 may be simplified by the removal of any one or more of the features shown therein. Further, these embodiments may be combined with features, aspects, and/or implementations of any of the previous figures or otherwise described herein.

Block 1000 may involve obtaining training data comprising a first set of features relating to drilling of a set of wells, a second set of features relating to respective geological properties of locations at which the wells were drilled, a third set of features relating to spacings between the wells, and a fourth set of features relating to production output of the wells.

Block 1002 may involve training a first machine-learning model to predict the third set of features given the first set of features and the second set of features.

Block 1004 may involve determining a spacing error based on a difference between the third set of features appearing in the training data and as predicted.

Block 1006 may involve training a second machine-learning model to predict the fourth set of features given the first set of features and the second set of features.

Block 1008 may involve determining a production error based on a difference between the fourth set of features appearing in the training data and as predicted.

Block 1010 may involve training a third machine-learning model to predict the production error given the spacing error, the first set of features, and the second set of features, wherein the third set of features is not considered by the third machine-learning model, and wherein the production error is usable to estimate a further production output of wells with spacings not represented in the third set of features.

In some embodiments, determining the spacing error comprises determining counterfactual examples for the third set of features, wherein determining the production error is based on the counterfactual examples and true values of the third set of features. These embodiments may involve applying the production error to the fourth set of features to generate counterfactual production values.

Some embodiments may involve training a fourth machine-learning model to predict the counterfactual production values based on the first set of features and the second set of features, wherein the third set of features is not considered by the fourth machine-learning model.

In some embodiments, the counterfactual examples for the third set of features are based on a measure of central tendency (e.g., mean, median, or some other metric) as applied to the spacings between wells.

In some embodiments, each of the first machine-learning model, the second machine-learning model, and the third machine-learning model incorporate decision trees, random forests, extremely randomized trees, or gradient boosting trees.

In some embodiments, the first set of features includes one or more of: a number of stimulation stages, fluid and/or volume thereof used during drilling, pressures applied during drilling, proppant type, proppant amount, completion type, or drilling equipment type.

In some embodiments, the second set of features includes one or more of: type of geological formations, brittleness of rock, or porosity of rock.

In some embodiments, the third set of features includes one or more of: horizontal spacing, vertical spacing, number of wells per the locations, or distance to a parent well.

Some embodiments may involve, based on use of the third machine-learning model, determining a scaling factor that represents a difference between attested per-well spacings and spacing values across the wells in the training data; obtaining, from a pre-trained machine learning model, a predicted production for a new well given instances of the first set of features, the second set of features, and the third set of features as relating to the new well; and modifying the predicted production by the scaling factor.

In some embodiments, the spacings between wells are in terms of median spacing values.

FIG. 11 is a flow chart illustrating an example embodiment relating to execution of a pre-trained machine-learning model. The process illustrated by FIG. 11 may be carried out by a computing device, such as computing device 100, and/or a cluster of computing devices, such as server cluster 200. However, the process can be carried out by other types of devices or device subsystems.

The embodiments of FIG. 11 may be simplified by the removal of any one or more of the features shown therein. Further, these embodiments may be combined with features, aspects, and/or implementations of any of the previous figures or otherwise described herein.

Block 1100 may involve obtaining, from a first pre-trained machine-learning model, a predicted production for a new well given instances of a first set of features and a second set of features as relating to the new well, wherein the first pre-trained machine-learning model was trained with the first set of features related to drilling of a set of wells and the second set of features related to respective geological properties of locations at which the wells were drilled, and wherein the predicted production represents a counterfactual spacing scenario specified during training of the first pre-trained machine-learning model.

Block 1102 may involve obtaining, from a second pre-trained machine-learning model, a predicted scaling factor that represents a difference between attested per-well spacings and counterfactual spacing values across the wells represented in the first set of features and the second set of features used to train the first pre-trained machine-learning model.

Block 1104 may involve modifying the predicted production by the predicted scaling factor.

In some embodiments, a first additional machine-learning model was trained to predict a third set of features related to spacings between the wells given the first set of features and the second set of features. These embodiments may further involve determining a spacing error based on a difference between the third set of features used to train the first additional machine-learning model and as predicted by the first additional machine-learning model.

In some embodiments, a second additional machine-learning model was trained to predict a fourth set of features relating to production output of the wells given the first set of features and the second set of features. These embodiments may further involve determining a production error based on a difference between the fourth set of features used to train the second additional machine-learning model and as predicted by the second additional machine-learning model.

In some embodiments, a third additional machine-learning model was trained to predict the production error given the spacing error, the first set of features, and the second set of features, wherein the third set of features is not considered by the third additional machine-learning model, and wherein the predicted scaling factor is determined based on use of the third additional machine-learning model.

In some embodiments, the production error is usable to estimate a further production output of wells with spacings not represented in the third set of features.

In some embodiments, each of the first additional machine-learning model, the second additional machine-learning model, and the third additional machine-learning model incorporate decision trees, random forests, extremely randomized trees, or gradient boosting trees.

In some embodiments, the first set of features includes one or more of: a number of stimulation stages, fluid and/or volume thereof used during drilling, pressures applied during drilling, proppant type, proppant amount, completion type, or drilling equipment type.

In some embodiments, the second set of features includes one or more of: type of geological formations, brittleness of rock, or porosity of rock.

In some embodiments, the third set of features includes one or more of: horizontal spacing, vertical spacing, number of wells per the locations, or distance to a parent well.

In some embodiments, the spacings between wells are in terms of median spacing values.

VII. Closing

The present disclosure is not to be limited in terms of the particular embodiments described in this application, which are intended as illustrations of various aspects. Many modifications and variations can be made without departing from its scope, as will be apparent to those skilled in the art. Functionally equivalent methods and apparatuses within the scope of the disclosure, in addition to those described herein, will be apparent to those skilled in the art from the foregoing descriptions. Such modifications and variations are intended to fall within the scope of the appended claims.

The above detailed description describes various features and operations of the disclosed systems, devices, and methods with reference to the accompanying figures. The example embodiments described herein and in the figures are not meant to be limiting. Other embodiments can be utilized, and other changes can be made, without departing from the scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations.

With respect to any or all of the message flow diagrams, scenarios, and flow charts in the figures and as discussed herein, each step, block, and/or communication can represent a processing of information and/or a transmission of information in accordance with example embodiments. Alternative embodiments are included within the scope of these example embodiments. In these alternative embodiments, for example, operations described as steps, blocks, transmissions, communications, requests, responses, and/or messages can be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved. Further, more or fewer blocks and/or operations can be used with any of the message flow diagrams, scenarios, and flow charts discussed herein, and these message flow diagrams, scenarios, and flow charts can be combined with one another, in part or in whole.

A step or block that represents a processing of information can correspond to circuitry that can be configured to perform the specific logical functions of a herein-described method or technique. Alternatively or additionally, a step or block that represents a processing of information can correspond to a module, a segment, or a portion of program code (including related data). The program code can include one or more instructions executable by a processor for implementing specific logical operations or actions in the method or technique. The program code and/or related data can be stored on any type of computer readable medium such as a storage device including RAM, a disk drive, a solid-state drive, or another storage medium.

The computer readable medium can also include non-transitory computer readable media such as non-transitory computer readable media that store data for short periods of time like register memory and processor cache. The non-transitory computer readable media can further include non-transitory computer readable media that store program code and/or data for longer periods of time. Thus, the non-transitory computer readable media may include secondary or persistent long-term storage, like ROM, optical or magnetic disks, solid-state drives, or compact disc read only memory (CD-ROM), for example. The non-transitory computer readable media can also be any other volatile or non-volatile storage systems. A non-transitory computer readable medium can be considered a computer readable storage medium, for example, or a tangible storage device.

Moreover, a step or block that represents one or more information transmissions can correspond to information transmissions between software and/or hardware modules in the same physical device. However, other information transmissions can be between software modules and/or hardware modules in different physical devices.

The particular arrangements shown in the figures should not be viewed as limiting. It should be understood that other embodiments could include more or less of each element shown in a given figure. Further, some of the illustrated elements can be combined or omitted. Yet further, an example embodiment can include elements that are not illustrated in the figures.

While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purpose of illustration and are not intended to be limiting, with the true scope being indicated by the following claims.

Claims

What is claimed is:

1. A computer-implemented method comprising:

obtaining training data comprising a first set of features relating to drilling of a set of wells, a second set of features relating to respective geological properties of locations at which the wells were drilled, a third set of features relating to spacings between the wells, and a fourth set of features relating to production output of the wells;

training a first machine-learning model to predict the third set of features given the first set of features and the second set of features;

determining a spacing error based on a difference between the third set of features appearing in the training data and as predicted;

training a second machine-learning model to predict the fourth set of features given the first set of features and the second set of features;

determining a production error based on a difference between the fourth set of features appearing in the training data and as predicted; and

training a third machine-learning model to predict the production error given the spacing error, the first set of features, and the second set of features, wherein the third set of features is not considered by the third machine-learning model, and wherein the production error is usable to estimate a further production output of wells with spacings not represented in the third set of features.

2. The computer-implemented method of claim 1, wherein determining the spacing error comprises determining counterfactual examples for the third set of features, and wherein determining the production error is based on the counterfactual examples and true values of the third set of features, the computer-implemented method further comprising:

applying the production error to the fourth set of features to generate counterfactual production values.

3. The computer-implemented method of claim 2, further comprising:

training a fourth machine-learning model to predict the counterfactual production values based on the first set of features and the second set of features, wherein the third set of features is not considered by the fourth machine-learning model.

4. The computer-implemented method of claim 2, wherein the counterfactual examples for the third set of features are based on a measure of central tendency as applied to the spacings between wells.

5. The computer-implemented method of claim 1, wherein each of the first machine-learning model, the second machine-learning model, and the third machine-learning model incorporate decision trees, random forests, extremely randomized trees, or gradient boosting trees.

6. The computer-implemented method of claim 1, wherein the first set of features includes one or more of: a number of stimulation stages, fluid and/or volume thereof used during drilling, pressures applied during drilling, proppant type, proppant amount, completion type, or drilling equipment type.

7. The computer-implemented method of claim 1, wherein the second set of features includes one or more of: type of geological formations, brittleness of rock, or porosity of rock.

8. The computer-implemented method of claim 1, wherein the third set of features includes one or more of: horizontal spacing, vertical spacing, number of wells per the locations, or distance to a parent well.

9. The computer-implemented method of claim 1, further comprising:

based on use of the third machine-learning model, determining a scaling factor that represents a difference between attested per-well spacings and spacing values across the wells in the training data;

obtaining, from a pre-trained machine learning model, a predicted production for a new well given instances of the first set of features, the second set of features, and the third set of features as relating to the new well; and

modifying the predicted production by the scaling factor.

10. The computer-implemented method of claim 1, wherein the spacings between wells are in terms of median spacing values.

11. A computer-implemented method comprising:

obtaining, from a first pre-trained machine-learning model, a predicted production for a new well given instances of a first set of features and a second set of features as relating to the new well, wherein the first pre-trained machine-learning model was trained with the first set of features related to drilling of a set of wells and the second set of features related to respective geological properties of locations at which the wells were drilled, and wherein the predicted production represents a counterfactual spacing scenario specified during training of the first pre-trained machine-learning model;

obtaining, from a second pre-trained machine-learning model, a predicted scaling factor that represents a difference between attested per-well spacings and counterfactual spacing values across the wells represented in the first set of features and the second set of features used to train the first pre-trained machine-learning model; and

modifying the predicted production by the predicted scaling factor.

12. The computer-implemented method of claim 11, wherein a first additional machine-learning model was trained to predict a third set of features related to spacings between the wells given the first set of features and the second set of features, the computer-implemented method further comprising:

determining a spacing error based on a difference between the third set of features used to train the first additional machine-learning model and as predicted by the first additional machine-learning model.

13. The computer-implemented method of claim 12, wherein a second additional machine-learning model was trained to predict a fourth set of features relating to production output of the wells given the first set of features and the second set of features, the computer-implemented method further comprising:

determining a production error based on a difference between the fourth set of features used to train the second additional machine-learning model and as predicted by the second additional machine-learning model.

14. The computer-implemented method of claim 13, wherein a third additional machine-learning model was trained to predict the production error given the spacing error, the first set of features, and the second set of features, wherein the third set of features is not considered by the third additional machine-learning model, and wherein the predicted scaling factor is determined based on use of the third additional machine-learning model.

15. The computer-implemented method of claim 14, wherein the production error is usable to estimate a further production output of wells with spacings not represented in the third set of features.

16. The computer-implemented method of claim 14, wherein each of the first additional machine-learning model, the second additional machine-learning model, and the third additional machine-learning model incorporate decision trees, random forests, extremely randomized trees, or gradient boosting trees.

17. The computer-implemented method of claim 8, wherein the spacings between wells are based on median spacing values.

18. A non-transitory computer-readable medium, having stored thereon program instructions that, upon execution by a computing system, cause the computing system to perform operations comprising:

obtaining, from a first pre-trained machine-learning model, a predicted production for a new well given instances of a first set of features and a second set of features as relating to the new well, wherein the first pre-trained machine-learning model was trained with the first set of features related to drilling of a set of wells and the second set of features related to respective geological properties of locations at which the wells were drilled, and wherein the predicted production represents a counterfactual spacing scenario specified during training of the first pre-trained machine-learning model;

obtaining, from a second pre-trained machine-learning model, a predicted scaling factor that represents a difference between attested per-well spacings and counterfactual spacing values across the wells represented in the first set of features and the second set of features used to train the first pre-trained machine-learning model; and

modifying the predicted production by the predicted scaling factor.

19. The non-transitory computer-readable medium of claim 18, wherein the first set of features includes one or more of: a number of stimulation stages, fluid and/or volume thereof used during drilling, pressures applied during drilling, proppant type, proppant amount, completion type, or drilling equipment type, wherein the second set of features includes one or more of: type of geological formations, brittleness of rock, or porosity of rock, and wherein the third set of features includes one or more of: horizontal spacing, vertical spacing, number of wells per the locations, or distance to a parent well.

20. The non-transitory computer-readable medium of claim 18, wherein the spacings between wells are based on median spacing values.