Patent application title:

METHODS AND SYSTEMS FOR PREDICTION OF AN OPTIMAL RANGE FOR GAS LIFT INJECTION AND OPTIMAL LIQUID PRODUCTION RATE FOR OIL WELLS

Publication number:

US20260015933A1

Publication date:

2026-01-15

Application number:

19/269,192

Filed date:

2025-07-15

Smart Summary: A method has been developed to help oil wells produce fluids more efficiently. It uses data from the well and reservoir to create machine-learning models that can predict if gas lift should be used and what the best gas lift values are. This approach eliminates the need for physical interventions in the well, which can cause delays and extra costs. It also avoids the common trial-and-error methods that can lead to biases and inefficiencies. Overall, this system allows for better decision-making regarding gas lift and liquid production rates in oil wells.

Abstract:

Embodiments relate to acquiring well, reservoir and production data, synthesizing training and test data, and then constructing, training, and utilizing machine-learning models to: (i) predict whether or not gas lift should be applied to facilitate the production of subsurface fluids from an oil well, (ii) predict an optimal range of gas lift values to be used in the production of fluids from the oil well, and (iii) predict an optimal liquid production rate for the oil well when the gas lift value is within the predicted optimal range. Unlike traditional approaches, the disclosed embodiments do not require the use of well interventions, which eliminates production losses, delays, and costs. The disclosed embodiments also avoid the delays, biases, and non-optimized values associated with existing trial-and-error-based approaches to gas lift injection optimization. Disclosed embodiments enable the efficient and reliable determination of the optimal range of gas lift values and the optimal liquid production rate.

Inventors:

Siddharth Misra, College Station, TX, United States
Sandro Priatmojo Moelyono, College Station, TX, United States

Applicant:

THE TEXAS A&M UNIVERSITY SYSTEM, College Station, TX, United States

Classification:

E21B44/02 » CPC main

Automatic control systems specially adapted for drilling operations, i.e. self-operating systems which function to carry out or modify a drilling operation without intervention of a human operator, e.g. computer-controlled drilling systems ; Systems specially adapted for monitoring a plurality of drilling variables or conditions Automatic control of the tool feed

E21B21/08 » CPC further

Methods or apparatus for flushing boreholes, e.g. by use of exhaust air from motor Controlling or monitoring pressure or flow of drilling fluid, e.g. automatic filling of boreholes, automatic control of bottom pressure

E21B2200/22 » CPC further

Special features related to earth drilling for obtaining oil, gas or water Fuzzy logic, artificial intelligence, neural networks or the like

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Ser. No. 63/671,296, entitled “Methods and Systems for Prediction of an Optimal Range for Gas Lift Injection and Optimal Liquid Production Rate for Oil Wells,” filed Jul. 15, 2024, the contents of which are incorporated by reference herein, for all purposes, in their entirety.

TECHNICAL FIELD

The technical field of the present disclosure generally relates to petroleum engineering and, more specifically, to the use of machine-learning models to predict an optimal range of gas lift values and an optimal liquid production rate for oil wells to facilitate the production of subsurface fluids.

BACKGROUND

One technique used in the production of subsurface fluids from oil wells is gas lift injection. Gas lift is an artificial lift method that uses gas to facilitate the lifting of liquids from the wellbore to the surface. The gas, which is often natural gas, is pressurized at the surface and then injected into the wellbore. The injected gas reduces the density of the fluid in the tubing or inner casing, which lowers the flowing bottom hole pressure of the wellbore and results in higher liquid production. It is generally desirable for the gas lift injection rate to be adjusted or optimized to achieve an optimal liquid production rate.

One existing method of optimizing the gas lift injection rate involves the use of bottom hole pressure (BHP) survey data. For example, a petroleum engineer can use the flowing bottom hole pressure and reservoir pressure data to generate a gas lift performance curve, and then use nodal analysis to determine an optimal gas lift injection rate and an optimal liquid production rate from the gas lift performance curve. However, the BHP survey involves a well intervention that is performed using a slickline unit disposed within the wellbore to record pressures and temperatures while the well is flowing and while the well is in a shut-in state. For example, when the well is flowing, a tool is lowered into the wellbore and records the pressure and temperature at each depth of interest. The tool will stop at each depth for some period to collect measurements, and then advance to other depths to continue data collection. After collecting measurements while the well is flowing, the well is then shut in for several hours (e.g., approximately 12 hours) before the measurement process is repeated to determine the reservoir pressure conditions. In general, it can require two days or more to conduct the BHP survey, and additional time is then needed for data analysis to select a suitable gas lift rate. Having the well in a shut-in state results in an undesirable loss in production, and engineers are often forced to explain and justify why and how long the well will remain in the shut-in state. If the shut-in period is not sufficiently long, the well may not have time to stabilize, which can result in unreliable BHP measurements. The slickline unit may also be needed elsewhere for other activities, such as fishing jobs or workover jobs, and the more time that the slickline unit is tied up with performing BHP surveys, the more that these other jobs could be undesirably delayed.
Since a given reservoir may include several wells that utilize gas lift, a BHP survey would need to be conducted on each of these wells to determine suitable gas lift rates for each well, undesirably resulting in substantial lost time and lost production.

Another existing method of optimizing the gas lift injection rate involves the use of production test history data in a trial-and-error-based approach. For this approach, instead of relying on BHP survey data, a petroleum engineer observes the performance of the well based on the production test history, including the gas lift injection rate, the production test data, and surface parameter data. Based on this historical data, the engineer adjusts the value of the gas lift injection rate, and then continues to observe the well for several days using production test data, as well as the performance data of other wells and stock-tank liquid data. After some period of time, if the liquid production rate remains unsatisfactory, the engineer may further modify the gas lift injection rate and repeat the process until the liquid production rate is acceptable. However, this trial-and-error-based approach can be undesirably biased by the production test history data and by the engineer's interpretation of this data. This method generally results in slow, incremental changes to the gas lift injection rate and the liquid production rate. While the gas lift injection rate and the liquid production rate may incrementally improve over several iterations, the method does not ensure that the resulting gas lift injection rate and liquid production rate are truly optimized, as opposed to merely representing localized maximum values.

SUMMARY

To address the shortcomings set forth above, embodiments are described herein for constructing, training, and utilizing machine-learning models to: (i) predict whether or not gas lift should be applied to facilitate the production of fluids from an oil well, (ii) predict an optimal range of gas lift values to be used in the production of fluids from the oil well, and (iii) predict an optimal liquid production rate for the oil well when the gas lift value is within the predicted optimal range. The disclosed embodiments do not require the use of BHP surveys, which eliminates production losses, delays, and costs associated with the performance of such surveys. The disclosed embodiments also avoid the delays, biases, and non-optimized values associated with the use of existing trial-and-error-based approaches to determine suitable gas lift injection rates and liquid production rates. Additionally, the disclosed embodiments enable the efficient and reliable determination of the optimal range of gas lift values and the optimal liquid production rate.

Embodiments include technological methods, systems, and tangible computer-readable media having instructions thereupon for execution by a processor to perform a method. One example embodiment is a computing apparatus or system having at least one memory and at least one processor configured to execute instructions stored in the memory to perform actions. The actions include receiving, from one or more data sources, well data, reservoir data, and historical production data related to an oil well. The actions include generating synthetic datasets from the well data, reservoir data, and historical production data, each synthetic dataset including a plurality of samples. The actions include, for each sample of the plurality of samples of each synthetic dataset, performing nodal analysis of the sample to generate a respective gas lift performance curve and to calculate corresponding target values, the corresponding target values including applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, and a target optimal liquid production rate for the sample. The actions include generating engineered features from the well data, the reservoir data, and the historical production data. The actions include selecting features from the well data, the reservoir data, the historical production data, and the engineered features, and constructing classification and regression models using the selected features as inputs to the models. The actions include using a training portion of the plurality of samples of the synthetic datasets and the corresponding target values to train the classification and regression models to generate a trained classification model and a trained regression model. 
In such embodiments, the trained classification model may be configured to predict applicability of gas lift injection to the oil well and the trained regression model may be configured to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.
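The data-preparation actions described above (synthesizing samples, labeling each sample with target values, and reserving a training portion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names and base values, the uniform perturbation scheme, the 80/20 split, and in particular `toy_targets`, which is a made-up concave curve standing in for nodal analysis and does not model real well physics. The 98%-of-peak rule used to bound the range is likewise hypothetical.

```python
import random

# Hypothetical base values for one well; names and magnitudes are illustrative only.
BASE = {"reservoir_pressure": 2500.0, "wellhead_pressure": 200.0, "water_cut": 0.4}


def synthesize(base, n_samples, spread=0.2, seed=0):
    """Draw samples by perturbing each base value uniformly within +/- spread."""
    rng = random.Random(seed)
    return [
        {name: value * rng.uniform(1 - spread, 1 + spread) for name, value in base.items()}
        for _ in range(n_samples)
    ]


def toy_targets(sample, rates=range(0, 2001, 100)):
    """Placeholder for nodal analysis: a made-up concave curve, NOT real well physics."""
    drawdown = sample["reservoir_pressure"] - sample["wellhead_pressure"]
    curve = [(q, 0.2 * drawdown + 0.5 * q - 0.00025 * q * q) for q in rates]
    best_rate, best_liquid = max(curve, key=lambda point: point[1])
    # Assumed rule: rates yielding at least 98% of the peak liquid rate form the range.
    in_range = [q for q, liquid in curve if liquid >= 0.98 * best_liquid]
    return {
        "applicable": best_liquid > curve[0][1],  # injection beats no injection
        "min_rate": min(in_range),
        "max_rate": max(in_range),
        "optimal_liquid": best_liquid,
    }


def split(items, train_fraction=0.8):
    """Split labeled samples into a training portion and a testing portion."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


samples = synthesize(BASE, 200)
labeled = [(sample, toy_targets(sample)) for sample in samples]
train, test = split(labeled)
```

The training portion would then be passed to whichever classification and regression models are constructed, with `applicable` as the classification target and the remaining three values as regression targets.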

In some embodiments, the actions may include using a testing portion of the plurality of samples of the synthetic datasets and the corresponding target values to evaluate the trained classification model and the trained regression model. The actions may also include selecting the best-performing trained classification model to predict applicability of gas lift injection to the oil well, and selecting the best-performing trained regression models to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.

In some embodiments, the actions include: receiving current production data from one or more components of the oil well; providing portions of the well data, reservoir data, and the current production data as inputs to features of the selected trained classification model and the selected trained regression models; in response to the inputs, receiving, from the selected trained classification model, a binary output indicating whether or not gas lift injection is applicable for use in the oil well; and in response to the inputs, receiving, from the selected trained regression models, a predicted minimum gas lift injection rate and a predicted maximum gas lift injection rate for the oil well, as well as a predicted optimal liquid production rate for the oil well when the predicted minimum gas lift injection rate is applied. In some embodiments, the actions further include providing control signals to a gas lift compressor and/or a gas lift control valve of the oil well to provide gas lift injection at the predicted minimum gas lift injection rate.
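As a rough sketch of how the model outputs described above might be turned into a commanded injection rate, the following assumes a hypothetical `Prediction` record and a hypothetical `injection_setpoint` helper; neither name comes from the disclosure. The only behavior taken from the text is that gas lift is provided at the predicted minimum injection rate when the classifier indicates that gas lift is applicable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    """Hypothetical container for the outputs of the trained models."""
    applicable: bool       # binary output of the classification model
    min_rate: float        # predicted minimum gas lift injection rate
    max_rate: float        # predicted maximum gas lift injection rate
    optimal_liquid: float  # predicted optimal liquid production rate


def injection_setpoint(pred: Prediction) -> Optional[float]:
    """Return the rate to command, or None when gas lift is not applicable."""
    if not pred.applicable:
        return None
    if pred.min_rate > pred.max_rate:
        # Assumed sanity check: an inverted range is treated as a model error.
        raise ValueError("predicted minimum exceeds predicted maximum")
    return pred.min_rate


setpoint = injection_setpoint(Prediction(True, 800.0, 1200.0, 650.0))
no_lift = injection_setpoint(Prediction(False, 0.0, 0.0, 0.0))
```

In a real controller the returned value would be translated into control signals for the gas lift compressor and/or the gas lift control valve; that translation is equipment-specific and is not sketched here.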

In some embodiments, the actions include: determining, based on the nodal analysis, a sensitivity of the liquid production rate to changes in the gas lift injection rate for each sample; and assigning a sensitivity value to each sample based on the determined sensitivity. In such embodiments, the respective gas lift performance curve may be generated by plotting the sensitivity of the liquid production rate to changes in the gas lift injection rate. In some embodiments, constructing the classification and the regression models includes defining a logistic regression model as the classification model, and defining a gradient boosting regression model and a random forest regression model as the regression models. In some embodiments, each synthetic dataset includes at least 200 samples. In certain embodiments, the well data may be selected from the group consisting of well inclination, the inner diameter of the tubing or tubing identification, the inner diameter of the wellbore casing or casing identification, the depths at which the gas lift valves are installed (GLV depths), and the pressure decrease across the gas lift valves (dp Loss Across Valve). In some embodiments, the reservoir data may be selected from the group consisting of reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density. In certain embodiments, the historical production data may be selected from the group consisting of wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and gas-oil ratio (GOR).
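One simple way to quantify the sensitivity of the liquid production rate to the gas lift injection rate is a finite-difference slope between neighboring points of a gas lift performance curve. The sketch below assumes the curve is available as (injection rate, liquid rate) pairs; the sample points, the helper names, and the 0.15 threshold are invented for illustration and are not taken from the disclosure.

```python
def sensitivities(curve):
    """Finite-difference slope of liquid rate with respect to injection rate."""
    return [
        (liquid1 - liquid0) / (q1 - q0)
        for (q0, liquid0), (q1, liquid1) in zip(curve, curve[1:])
    ]


def first_rate_below(curve, threshold):
    """Return the injection rate at which the slope first drops below threshold."""
    for (q0, _), slope in zip(curve, sensitivities(curve)):
        if slope < threshold:
            return q0
    return None  # the curve never flattens below the threshold


# Hypothetical gas lift performance curve: (injection rate, liquid rate) pairs.
curve = [(0, 400.0), (200, 480.0), (400, 540.0), (600, 580.0),
         (800, 600.0), (1000, 600.0), (1200, 580.0)]

slopes = sensitivities(curve)
flattening_rate = first_rate_below(curve, threshold=0.15)
```

On this invented curve the slope decreases steadily and turns negative past the peak, which is the diminishing-returns shape that makes a bounded injection range meaningful.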

Embodiments of the present disclosure also include methods for enhancing oil production in an oil well. One example embodiment is a method that includes providing a controller in signal communication with a gas lift compressor and/or a gas lift control valve. The controller may comprise at least one memory and at least one processor configured to execute instructions stored in the memory, the at least one memory storing a trained optimal gas lift injection range model, the trained optimal gas lift injection range model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising a target minimum gas lift injection rate and a target maximum gas lift injection rate. The example method may also include determining, using the trained optimal gas lift injection range model, a predicted optimal range of gas lift injection values for the oil well, the predicted optimal range extending from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate. The example method may also include providing control signals to the gas lift compressor and/or the gas lift control valve so as to provide gas lift at the predicted minimum gas lift injection rate.

In some embodiments, the example method may further include a controller comprising at least one memory that further stores a trained optimal liquid production model. In such embodiments, the trained optimal liquid production model may be constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values. The selected features may comprise raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well. The synthetic data sets may be generated from the well data, reservoir data, and historical production data. The corresponding target values may be calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset. The corresponding target values may comprise a target optimal liquid production rate. In such embodiments, the method may further include determining, using the trained optimal liquid production model, a predicted optimal liquid production rate when gas lift is provided to the oil well at the predicted minimum gas lift injection rate.

In some embodiments, the example method may further include a controller comprising at least one memory that further stores a trained gas lift injection applicability classification model. The trained gas lift injection applicability classification model may be constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values. The selected features may comprise raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well. The synthetic data sets may be generated from the well data, reservoir data, and historical production data. The corresponding target values may be calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset. The corresponding target values may comprise applicability of gas lift injection to the oil well. In such embodiments, the method may further include determining, using the trained gas lift injection applicability classification model, whether gas lift injection is applicable for use in the oil well.

In some embodiments, the example method may further include receiving, at the controller, well data, reservoir data, and current production data relating to the oil well. In certain embodiments, the method may further include fine-tuning, at the controller, the trained optimal gas lift injection range model, the trained optimal liquid production model, and the trained gas lift injection applicability classification model based on an actual production rate of the well and the predicted optimal production rate as a result of providing gas lift at the predicted minimum gas lift injection rate.

In some embodiments, the example method may further include providing at least a portion of the well data, the reservoir data, and the current production data as inputs to features of the trained gas lift injection applicability classification model. In some embodiments, the method may also include determining, using the trained gas lift injection applicability classification model, a binary output indicating whether or not gas lift injection continues to be applicable for use in the oil well.

Embodiments of the present disclosure also include an oil production system. One example embodiment is a system that includes a tubing disposed in a wellbore and fluidly coupling an interior of the wellbore to a wellhead. The example system may also include a gas lift compressor operable to receive and compress a gas from a gas source to produce a compressed gas. The example system may also include a gas lift control valve operable to receive the compressed gas produced by the gas lift compressor and direct the compressed gas into the wellbore on the outside of the tubing at a gas lift injection rate. The system may also include a controller in signal communication with the gas lift compressor and/or the gas lift control valve and communicatively connected to one or more data sources. The controller may also be configured to receive well data, reservoir data, and historical production data from the one or more data sources. The controller may include at least one memory and at least one processor configured to execute instructions stored in the memory. The at least one memory may store a trained optimal gas lift injection range model. The trained optimal gas lift injection range model may be constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values. The selected features may comprise raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well. The synthetic data sets may be generated from the well data, reservoir data, and historical production data. The corresponding target values may be calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset. The corresponding target values may comprise a target minimum gas lift injection rate and a target maximum gas lift injection rate. 
The controller may be further configured to determine, using the trained optimal gas lift injection range model, a predicted optimal range of gas lift injection values for the oil well. The predicted optimal range may extend from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate. The controller may be further configured to provide control signals to the gas lift compressor and/or the gas lift control valve to automatically modify the gas lift injection rate to be within the predicted optimal range of gas lift injection rate so as to facilitate an optimal liquid production rate of the produced fluids. The one or more data sources may be selected from the group consisting of the gas lift compressor and/or the gas lift control valve, a well and reservoir data source, a fluid analyzer, one or more pressure sensors, one or more temperature sensors, and one or more flow rate sensors.

In some embodiments, the controller may include at least one memory that further stores a trained optimal liquid production model. The trained optimal liquid production model may be constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values. The selected features may comprise raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well. The synthetic data sets may be generated from the well data, reservoir data, and historical production data. The corresponding target values may be calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset. The corresponding target values may comprise a target optimal liquid production rate. In such embodiments, the controller may be further configured to determine, using the trained optimal liquid production model, a predicted optimal liquid production rate when gas lift is provided to the oil well at the predicted minimum gas lift injection rate.

In some embodiments, the controller may include at least one memory that further stores a trained gas lift injection applicability classification model. The trained gas lift injection applicability classification model may be constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values. The selected features may comprise raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well. The synthetic data sets may be generated from the well data, reservoir data, and historical production data. The corresponding target values may be calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset. The corresponding target values may comprise the applicability of gas lift injection. In such embodiments, the controller may be further configured to determine, using the trained gas lift injection applicability classification model, whether gas lift injection is applicable for use in the oil well.

In some embodiments, the example system may include a controller that is further configured to: receive current production data from the one or more data sources; provide at least a portion of the well data, the reservoir data, and the current production data as inputs to features of the trained gas lift injection applicability classification model; and determine, using the trained gas lift injection applicability classification model, a binary output indicating whether or not gas lift injection continues to be applicable for use in the oil well.

Embodiments of the present disclosure also include additional methods for enhancing oil production in an oil well. One example embodiment is a method that includes providing well data, reservoir data, and historical production data related to an oil well; generating synthetic datasets based on the well data, reservoir data, and historical production data, each synthetic dataset including a plurality of samples; performing nodal analysis of each sample of the plurality of samples of each synthetic dataset to generate a respective gas lift performance curve and to calculate corresponding target values, the corresponding target values including applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, and a target optimal liquid production rate for the sample; generating engineered features from the well data, the reservoir data, and the historical production data; selecting features from the well data, the reservoir data, the historical production data, and the engineered features; constructing a classification model and a regression model using the selected features as inputs to the models; and training the classification model and the regression model using a training portion of the plurality of samples of the synthetic datasets and the corresponding target values to generate a trained classification model and a trained regression model. In such embodiments, the trained classification model may be configured to predict applicability of gas lift injection to the oil well and the trained regression model may be configured to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.

In some embodiments, the example method may also include: providing a controller in signal communication with a gas lift compressor and/or a gas lift control valve, the controller comprising at least one memory and at least one processor configured to execute instructions stored in the memory, the at least one memory storing the trained classification model and the trained regression model; determining, using the trained regression model, a predicted optimal range of gas lift injection values for the oil well, the predicted optimal range extending from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate; and providing one or more control signals to the gas lift compressor and/or the gas lift control valve so as to provide gas lift at the predicted minimum gas lift injection rate.

In some embodiments, the example method may also include: evaluating the trained classification model and the trained regression model using a testing portion of the plurality of samples of the synthetic datasets and the corresponding target values; selecting a best-performing trained classification model to predict applicability of gas lift injection to the oil well; and selecting a best-performing trained regression model to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.

In some embodiments, the example method may also include: receiving current production data from one or more components of the oil well; providing portions of the well data, reservoir data, and the current production data as inputs to features of the selected trained classification model and the selected trained regression models; in response to the inputs, receiving, from the selected trained classification model, a binary output indicating whether or not gas lift injection is applicable for use in the oil well; and in response to the inputs, receiving, from the selected trained regression models, a predicted minimum gas lift injection rate and a predicted maximum gas lift injection rate for the oil well, as well as a predicted optimal liquid production rate for the oil well when the predicted minimum gas lift injection rate is applied.

Aspects and advantages of these exemplary embodiments and other embodiments are discussed herein. Moreover, it is to be understood that both the foregoing information and the following detailed description provide merely illustrative examples of various aspects and embodiments, and are intended to provide an overview or framework for understanding the nature and character of the claimed aspects and embodiments. Accordingly, these and other objects, along with advantages and features of the present disclosure, will become apparent through reference to the following description and the accompanying drawings. Furthermore, it is to be understood that the features of the various embodiments described herein are not mutually exclusive and may exist in various combinations and permutations.

BRIEF DESCRIPTION OF THE DRAWINGS

The described embodiments and the advantages thereof may best be understood by reference to the following description taken in conjunction with the accompanying drawings. These drawings in no way limit any changes in form and detail that may be made to the described embodiments by one skilled in the art without departing from the spirit and scope of the described embodiments.

FIG. 1A is a diagrammatic representation of an oil well that utilizes gas lift, according to an embodiment.

FIG. 1B is a diagrammatic representation of a data-driven method for generating, training, and evaluating machine-learning models for predicting the applicability of gas lift injection to a well, an optimal gas lift injection range, and an optimal liquid production rate, according to an embodiment.

FIGS. 2A, 2B, and 2C are graphical representations of example gas lift performance curves (GLPCs) generated as part of the data generation workflow, according to an embodiment.

FIGS. 3A-3H are graphical representations illustrating association between selected features and targets, as expressed in terms of F1 score and mutual information (MI), according to an embodiment.

FIG. 4A is a graphical representation of a comparison between predicted minimum gas lift injection rate values and actual minimum gas lift injection rate values, according to an embodiment.

FIG. 4B is a graphical representation of a comparison between predicted maximum gas lift injection rate values and actual maximum gas lift injection rate values, according to an embodiment.

FIG. 4C is a graphical representation of a comparison between predicted optimal liquid production rate values and actual liquid production rate values, according to an embodiment.

FIG. 5A is a graphical representation illustrating the effect, in terms of F1 score and associated uncertainty of the F1 score, of reduced dataset size on the classification task of identifying the applicability of a gas lift injection for a well, according to an embodiment.

FIG. 5B is a graphical representation illustrating the effect, in terms of Mean Absolute Percentage Error (MAPE), of reducing the dataset size on the regression tasks of estimating the optimal range of gas lift injection rates and the optimal liquid production rate, according to an embodiment.

FIG. 6 is a diagrammatic representation of a controller of an oil well that is capable of creating the machine-learning models for predicting the applicability of gas lift injection to a well, an optimal gas lift injection range, and an optimal liquid production rate, and of applying the machine-learning models to control the operation of the oil well, according to an embodiment.

FIG. 7 is a diagrammatic representation of a method in which the controller of an oil well uses the trained machine-learning models to predict the applicability of gas lift injection to a well, an optimal gas lift injection range, and an optimal liquid production rate, while controlling operation of the oil well.

DETAILED DESCRIPTION

Disclosed herein are various embodiments, including systems, technological methods, and tangible computer-readable media, related to machine-learning models for gas lift injection optimization, presented as technological improvements in the application of machine-learning technology, petroleum-engineering technology, and gas lift injection technology. It should be appreciated that the embodiments presented herein, and further, readily developed embodiments, may use various combinations of the examples, features, data sources, synthetic data-generation schemes, algorithms, operations, structures, parameters, and equivalents thereof, in keeping with the teachings herein.

The present disclosure describes various embodiments related to systems and methods for gas lift injection optimization of oil wells. The description may use the phrases “in certain embodiments,” “in various embodiments,” “in an embodiment,” or “in embodiments,” which may each refer to one or more of the same or different embodiments. Furthermore, the terms “comprising,” “including,” “having,” and the like, as used with respect to embodiments of the present disclosure, are synonymous. The term “plurality” as used herein refers to two or more items or components. The terms “about” or “approximately” are defined as being close to as understood by one of ordinary skill in the art. In one non-limiting embodiment, these terms are defined to be within 10%, preferably within 5%, more preferably within 1%, and most preferably within 0.5%. The use of the words “a” or “an” when used in conjunction with any of the terms “comprising,” “including,” “includes,” or “having,” in the claims or the specification may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”

Present embodiments are directed to an approach that combines synthetic data-generation and machine-learning models to predict, without the use of well interventions, (i) whether gas lift injection is applicable to an oil well, (ii) a range of optimal gas lift injection rates to be used at the oil well, and (iii) an optimal liquid production rate expected when the gas lift injection is within the optimal range. Present embodiments predict minimum and maximum values for optimal gas lift injection (also referred to herein as the range for optimal gas lift injection, optimal range of the gas lift injection rate, or the optimal gas lift injection range), which is a new concept for gas lift production optimization. That is, instead of predicting a single optimal gas lift injection value, present embodiments are capable of identifying a range of optimal gas lift injection values that will lead to an optimal liquid production rate. Present embodiments can be used, for example, to accelerate the determination of the optimal range of gas lift injection rates for an oil well by eliminating the need for a well intervention job, which may further aid well intervention planning. In some embodiments, the trained machine-learning models can be deployed within a control system of an oil well to automatically control the gas lift injection rate based on the predicted optimal range of gas lift injection rates and the predicted liquid production rate.

For the embodiments described herein, the technique involves the collection of well data, reservoir data, and historical production data related to an oil well from various data sources. The collected data may include surface measurement data (e.g., wellhead pressure, gas lift injection pressure) and subsurface data (e.g., reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density), and may include well, reservoir, and/or production data from the oil well and/or surrounding wells with similar geological and reservoir properties. Using the collected data, synthetic datasets are generated with samples that represent synthetic well production performance under a number of different conditions. For each combination of parameter values or sample of the synthetic dataset, a suitable scripting language (e.g., Visual Basic, Python) and petroleum-engineering software may be used to perform nodal analysis and generate a respective gas lift performance curve (GLPC). By analyzing the GLPCs, a target range of optimal gas lift injection values and a target liquid production rate are determined for each combination of parameter values. In some embodiments, the system is capable of producing many (e.g., 1000, 500, 200) gas lift performance curves for each different production parameter, gas-oil ratio (GOR) value, and reservoir pressure in a short time window (e.g., about six hours or less) on a traditional computing workstation. Additionally, machine-learning models are developed using these datasets in various embodiments, with surface parameters and well data as input features and the range of optimal gas lift injection rates and the optimal liquid production rate as output targets.
For example, in some embodiments, evaluation begins with a classification analysis to determine the applicability of gas lift injection to the oil well, followed by regression analysis to establish the range of optimal gas lift injection rates and the optimal liquid production rate. Embodiments disclosed herein benefit production engineers and field operations, offering a faster and more efficient method for optimizing gas lift injection rates, ultimately leading to enhanced oil production. For example, certain embodiments could save substantial time expenditure (e.g., 27 hours) compared to the traditional approach involving well intervention jobs, and present embodiments enable the generation of gas lift injection rate recommendations in under an hour once the models have been trained. As such, present embodiments may substantially reduce or prevent production losses.

In the present disclosure, a data-driven workflow or method is presented for preparing the machine-learning models for prediction, starting from collecting data, generating synthetic datasets, building, training, and evaluating the machine-learning models, resulting in trained machine-learning models capable of providing (or automatically implementing) gas lift optimization recommendations. These recommendations include (i) the applicability of gas lift injection to the oil well, (ii) a predicted minimum gas lift injection rate and a predicted maximum gas lift injection rate of an optimal range of gas lift injection rates, and (iii) the predicted optimal liquid production rate when the gas lift injection rate is within the optimal range of gas lift injection rates. The importance of optimal gas lift injection rate lies in enabling the operator to ascertain the precise amount of injection for the well to achieve optimal hydrocarbon production. Similarly, understanding the optimal liquid production rate is beneficial for the operator to anticipate the well's potential flow when using the optimal gas lift injection rate. Using the embodiments disclosed herein, operators can anticipate a reduction in well interventions, mitigate the need for well shut-ins, and expedite the optimization of the gas lift injection process, ultimately leading to increased oil production efficiency.

FIG. 1A is a diagrammatic representation of an embodiment of an oil well 100 that utilizes gas lift. The oil well 100 includes a wellbore 102 with an outer casing 103 having an inner casing or tubing 104 disposed therein. A wellhead 106 is fluidly coupled to the tubing 104. During operation, a gas lift compressor 108 receives and compresses a gas (e.g., natural gas) from a gas source 110, in which the gas may be gas previously produced by the oil well 100 or a nearby well. The compressed gas is directed through a gas lift control valve 109 and pumped into the wellbore 102 on the outside of the tubing 104, as indicated by dashed arrows 112. The tubing 104 includes a number of gas lift valves 114 disposed at different depths within the wellbore 102. Once the pressure of the gas within the wellbore 102 outside of the tubing 104 reaches a threshold value at the depth of each of the gas lift valves 114, the gas lift valves respond by opening to allow the gas to enter the tubing 104 to reduce the density of the fluid disposed therein.

For the embodiment illustrated in FIG. 1A, the oil well 100 includes at least one controller 116 that is designed to monitor and control at least some of the aspects of operation of the oil well 100. As such, the controller 116 is communicatively connected (e.g., via a suitable wired or wireless communication channel) to various equipment of the oil well 100 to receive monitoring or measurement data and to provide control signals to modify the operation of the equipment in response to the monitoring data. For example, the controller 116 is communicatively connected to the gas lift compressor 108 and/or the gas lift control valve 109 to monitor and control parameters related to the gas lift injection, including the gas lift injection pressure and the gas lift injection rate. The controller 116 may also be connected to various sensors 118 that are in fluid communication with the wellbore 102, such as pressure sensors, temperature sensors, flow rate sensors, and so forth, that measure operational data of the oil well 100, such as the wellhead pressure. The controller 116 may also be communicatively connected to a fluid analyzer 120 that is capable of measuring properties of the produced fluids 122 extracted from the oil well 100, such as a production rate, a water cut, and/or a gas-oil ratio (GOR) of the produced fluids 122.

Embodiments disclosed herein generally involve the use of a computing device to predict (i) the applicability of gas lift injection to the operation of the oil well, (ii) an optimal range of gas lift injection rates to be used with the oil well, and (iii) an optimal liquid production rate when the gas lift injection rate is in the optimal range, without the need for well interventions. In some embodiments, the computing device may be the controller 116 or another computing device that is communicatively connected to the controller 116. As such, in certain embodiments, the controller 116 may also be communicatively connected to other data sources, such as a well and reservoir data source 124, to receive well data and reservoir data related to the oil well 100. As discussed below, using the well and reservoir data received from the data source 124 and the monitoring data received from the various components (e.g., sensors 118, fluid analyzer 120, gas lift compressor 108, gas lift control valve 109) of the oil well 100, the controller 116 may generate a synthetic dataset that is subsequently used to train and evaluate machine-learning models to make the aforementioned predictions. Furthermore, in some embodiments, the controller 116 may provide control signals to the gas lift compressor 108 and/or the gas lift control valve 109 to automatically modify the gas lift injection rate to be within the predicted optimal range of gas lift injection rates to facilitate an optimal liquid production rate of the produced fluids 122.
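By way of non-limiting illustration, the following minimal sketch shows one way a controller could map the three predictions to a gas lift injection rate setpoint. The function name `injection_setpoint` and its arguments are hypothetical, and the choice of clamping the current rate to the nearest bound of the predicted range (and of returning zero when gas lift is predicted to be inapplicable) is an assumption made for illustration rather than a described implementation.

```python
def injection_setpoint(current_rate, applicable, q_gmin, q_gmax):
    """Pick a gas lift injection rate setpoint from the model predictions.

    current_rate: present injection rate (same units as q_gmin and q_gmax).
    applicable:   predicted applicability of gas lift injection (True/False).
    q_gmin, q_gmax: predicted bounds of the optimal injection range.
    """
    if not applicable:
        # Assumption: no injection when gas lift is predicted not to apply.
        return 0.0
    # Assumption: move the rate to the nearest point inside the optimal range.
    return min(max(current_rate, q_gmin), q_gmax)


# Hypothetical values in MMscf/D.
print(injection_setpoint(0.3, True, 0.8, 1.4))  # below the range
print(injection_setpoint(1.0, True, 0.8, 1.4))  # already inside the range
```

A rate that already lies within the predicted range is left unchanged, so the controller would only issue a control signal when the rate is outside the range.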

FIG. 1B is a diagrammatic representation of an embodiment of a data-driven method 140 for generating, training, and evaluating machine-learning models for predicting (i) the applicability of gas lift injection to a well, (ii) an optimal gas lift injection range, and (iii) an optimal liquid production rate. The method 140 is discussed with reference to elements illustrated in FIG. 1A, FIGS. 2A-2C, FIGS. 3A-3H, FIGS. 4A-4C, and FIGS. 5A-5B. In some embodiments, the method 140 may be partially or entirely implemented as computer-implemented instructions that are stored in a memory and executed by one or more processors of a suitable computing apparatus or system, such as the controller 116 illustrated in FIG. 1A.

For the embodiment illustrated in FIG. 1B, the method 140 begins with the step 142 of the controller 116 receiving well data, reservoir data, and historical production data related to an oil well. For example, in some embodiments, the well data includes, but is not limited to: the well inclination, the inner diameter of the tubing 104 (Tubing ID), the inner diameter of the wellbore casing 103 (Casing ID), the depths at which the gas lift valves 114 are installed (GLV depths), and the pressure decrease across the gas lift valves 114 (dp Loss Across Valve). In some embodiments, the reservoir data includes, but is not limited to: reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density. In some embodiments, the historical production data includes, but is not limited to: wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and gas-oil ratio (GOR). The well data, reservoir data, and historical production data are obtained from various sources located at various sites (e.g., in the field, at a remote office or datacenter). By integrating with these data sources, certain embodiments are capable of obtaining the latest information about the properties of interest, which is important for the desired computations and predictions. For example, well data is typically sourced from wellbore diagrams or completion reports, while reservoir data is typically extracted from open-hole log data or petrophysical analysis conducted after drilling. Historical production data (also referred to herein as historical production parameters) are derived from measurements taken at the surface (e.g., by sensors 118, fluid analyzer 120, gas lift compressor 108, gas lift control valve 109). In some embodiments, these data sources are used to retrieve or collect data for nodal analysis to perform gas lift injection optimization, as discussed below.

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 144 of the controller 116 using the well data, the reservoir data, and the historical production data to generate a set of synthetic datasets and to generate a plurality of samples (or combinations of parameter values) for each dataset. For example, in some embodiments, five synthetic datasets are generated, while in other embodiments, additional or fewer datasets may be generated. These datasets portray diverse production scenarios based on the data received from different data sources. In some embodiments, the datasets include (A) datasets illustrating production performance from the same well with different reservoir targets. For example, when a current reservoir layer's production rate is low or no longer economically viable, a workover job (e.g., a perforation job) is typically executed. The workover job is undertaken to isolate the existing reservoir layer and perforate a new reservoir layer in the wellbore, and this shift in reservoir targets will lead to distinct production behaviors compared to the previous layer. In an example embodiment, the datasets generated at step 144 may include datasets illustrating production performance from the same well with different reservoir targets, including dataset 1 (DS-1), dataset 2 (DS-2), dataset 3 (DS-3), and dataset 4 (DS-4), wherein DS-1 and DS-2 belong to the same well but different reservoirs, and wherein DS-3 and DS-4 belong to the same well but different reservoirs. For the example embodiment, the datasets also include (B) datasets illustrating production performance from different wells and reservoirs. In general, these datasets are generated to capture dynamic production behaviors observed across different wells and reservoirs. For the example embodiment, the datasets illustrating production performance from different wells and reservoirs include DS-1, DS-3, and dataset 5 (DS-5).
These two types of datasets (i.e., A and B) represent the dynamics of activities in the field that are faced by engineers and field operators. As discussed below, the predictive machine-learning models are built from these datasets, which possess the capacity to predict targets under diverse dynamic conditions.

The well data used in the example embodiment is shown in Table 1. The well inclination is measured at the reservoir depth. As noted, a gas lift valve (GLV) is a valve that controls the injection of gas into the tubing from the wellbore casing. Each gas lift valve is located at a specific depth in the wellbore. To determine the number of gas lift valves to include when designing or planning a well, several factors need to be considered. These include, but are not limited to: gas lift injection pressure, total well depth, well inclination, number of reservoir layers, reservoir layer depths, reservoir pressures, well productivity, wellbore dimensions and conditions, type of completion fluid, and desired production rate. The value of GLV depth for each dataset is derived from field data for wells with specified GLV depths. For the example embodiment, the GLV depths were determined by considering the previously mentioned factors.

TABLE 1

Well data used to generate synthetic datasets and perform nodal analysis for an example embodiment.

Dataset	Well Inclination (deg)	GLV Depths (ft)	Tubing ID (inch)	Casing ID (inch)	dp Loss Across Valve (psi)
Dataset 1 (DS-1)	19.6	1927, 2900, 3713, 4434, 5153	2.992	6.276	150
Dataset 2 (DS-2)	53.5	1927, 2900, 3713, 4434, 5153			
Dataset 3 (DS-3)	43.2	1881, 3558, 4592, 5350			
Dataset 4 (DS-4)	45.0	1881, 3558, 4592, 5350			
Dataset 5 (DS-5)	32.3	2057, 3528, 4661, 5511, 6144, 6561			

The reservoir data used in the example embodiment is displayed in Table 2. This data is used to generate the inflow performance as part of the nodal analysis. The reservoir pressure is varied to illustrate the reservoir condition at the beginning of production and once the reservoir is depleted. For the example embodiment, the oil density is 36 or 37 °API for each dataset. Despite the various reservoir depths, the reservoir fluid type is similar, which is black oil for this embodiment. The skin value can be derived from a pressure build-up (PBU) test, while the assumed skin value is 5 for this example embodiment. The Petroleum Experts 2 correlation is used as the vertical lift performance (VLP) correlation, specifically to estimate the bottomhole pressure of a well at different depths, which is important for nodal analysis. In the field, the VLP correlation will be matched with flowing bottomhole pressure data to illustrate the tubing performance.

TABLE 2

Reservoir data used to generate synthetic datasets and perform nodal analysis (inflow performance) for an example embodiment. The values in each column vary, indicating that each layer possesses distinct hydrocarbon potential.

Dataset	Reservoir Depth (ft-MD/ft-TVD)	Range of Reservoir Pressure (psi)	Reservoir Thickness (ft)	Oil Density (API)	Permeability (md)
Dataset 1 (DS-1)	7986/6032	2613, 2352, 2090, 1829	56	37	466
Dataset 2 (DS-2)	5990/4532	1963, 1767, 1571, 1374	24	37	201
Dataset 3 (DS-3)	10130/7652	3256, 2930, 2605, 2279	10	36	146
Dataset 4 (DS-4)	8066/6148	2642, 2378, 2114, 1850	13	36	219
Dataset 5 (DS-5)	7745/6129	2659, 2393, 2127, 1861	10	37	307

TABLE 3

Ranges of production parameters for each of the 5 datasets for the example embodiment, outlining the conditions under which hydrocarbon production will occur.

Dataset	Range of Wellhead Pressure (psi)	Range of Gas Lift Injection Pressure (psi)	Range of Gas Lift Injection Rate (MMscf/D)	Range of Water Cut (%)	Range of GOR (SCF/STB)
Dataset 1 (DS-1)	150-400	1000-1600	0.4-2.0	75-99	300-3300
Dataset 2 (DS-2)	150-400	1000-1600	0.4-2.0	75-99	300-2800
Dataset 3 (DS-3)	200-400	1000-1200	0.3-1.5	50-99	400-2200
Dataset 4 (DS-4)	200-400	1000-1200	0.3-1.5	50-99	400-2500
Dataset 5 (DS-5)	170-400	1000-1350	0.3-1.3	80-99	300-3000

Based on the ranges of production parameters in Table 3, 1000 combinations or samples per dataset were generated using the Latin Hypercube Sampling (LHS) method for the example embodiment. The use of LHS ensures that each parameter is sampled across its entire range without redundancy. In other embodiments, additional or fewer samples may be generated for each dataset.
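By way of non-limiting illustration, the following minimal sketch shows how Latin Hypercube Sampling could draw 1000 samples from the DS-1 ranges of Table 3. The function `latin_hypercube` and the dictionary keys are hypothetical names introduced here, and the sketch uses only the Python standard library rather than any particular sampling package. Each parameter range is divided into as many equal-width strata as there are samples, one value is drawn per stratum, and the strata are shuffled independently for each parameter.

```python
import random

# Parameter ranges taken from the DS-1 row of Table 3.
DS1_RANGES = {
    "wellhead_pressure_psi": (150.0, 400.0),
    "injection_pressure_psi": (1000.0, 1600.0),
    "injection_rate_mmscfd": (0.4, 2.0),
    "water_cut_pct": (75.0, 99.0),
    "gor_scf_stb": (300.0, 3300.0),
}


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw n_samples points so that every parameter visits each of its
    n_samples equal-width strata exactly once."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # One random point inside each stratum [i/n, (i+1)/n) of [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so the stratum order differs from parameter to parameter.
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    # Transpose the per-parameter columns into one dict per sample.
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


samples = latin_hypercube(DS1_RANGES, n_samples=1000)
print(len(samples), samples[0])
```

Because every stratum of every parameter is used exactly once, the full range of each production parameter is covered even with a modest number of samples.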

After generating these samples, the embodiment of the method 140 illustrated in FIG. 1B continues with the step 146 of the controller 116 performing nodal analysis on each of these samples. That is, for each sample generated for the datasets in step 144, nodal analysis is performed on the sample to generate a respective gas lift performance curve (GLPC) and calculate target values for the sample, the target values including applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, and a target optimal liquid production rate. For the example embodiment, this resulted in 1000 nodal analyses per dataset due to the varying production parameters. A sensitivity analysis of the liquid production rate to changes in the gas lift injection rate is then conducted, with the sensitivity values of the gas lift injection rate ranging from zero to seven MMscf/D. By plotting the liquid production rate against the gas lift injection rate, a GLPC is produced. From the nodal analysis, the liquid production rate is obtained for each sample.

For example, FIGS. 2A-2C are graphical representations of example GLPCs plotted for three example samples (i.e., cases 2, 492, and 704) as part of the data generation workflow for the example embodiment. The respective targets are computed from analysis of each of the GLPCs. As visually illustrated in the graphical representations of FIGS. 2A-2C, the targets calculated for each GLPC include: minimum gas lift injection rate (Q_gmin) 200, maximum gas lift injection rate (Q_gmax) 202, optimal liquid production rate (Q_Lmin) 204, and the applicability of gas lift injection to the well (a categorical target). Machine-learning models learn to predict these targets by processing an extensive list of informative, relevant, and independent features. The targets Q_gmin 200 and Q_gmax 202 define a range of optimal gas lift injection rates, and Q_Lmin 204 defines the optimal liquid production rate when the gas lift injection rate is set at Q_gmin 200. It is important to determine the optimal range of gas lift injection rates that optimize the production performance. By performing gas lift injection at a rate that is within this optimal range, the well production can achieve the optimal liquid production rate. Understanding the optimal liquid production rate provides insight into the well's potential flow when the gas lift injection rate is within the optimal range.

During analysis of the GLPCs, in addition to Q_gmin and Q_gmax, Q_Lmin is determined as the third target value. This value represents the optimal liquid production rate, which is the liquid production rate at the minimum gas lift injection rate (Q_gmin) within the optimal range. This value is referred to as the optimal liquid production rate because increasing the gas lift injection rate from Q_gmin to Q_gmax does not yield a significant incremental liquid production gain. For example, as illustrated in the GLPCs of FIGS. 2A-2C, the curves are nearly flat between Q_gmin and Q_gmax, indicating that the liquid production rate between these gas lift injection rates does not vary significantly.

Certain criteria may be applied to determine the target values from the GLPCs. For the example embodiment, the criteria include a threshold value of 10 stock tank barrels per day (STB/D) for the liquid production gain, which represents the difference in the well liquid production before and after applying an optimized gas lift injection rate. In the field, a liquid production gain of 10 STB/D or less is difficult to distinguish from a separator test. As such, this liquid production gain threshold value is used in the example embodiment to determine the Q_gmin 200 and Q_Lmin 204 from Q_gmax 202. As illustrated in FIGS. 2A-2C, Q_gmax 202 corresponds to the gas lift injection rate at the highest liquid production rate, while Q_Lmin 204 is the liquid production rate resulting from the highest liquid production rate minus the liquid production gain threshold value. As noted, Q_gmin 200 is the gas lift injection rate when the liquid production rate corresponds to Q_Lmin 204.
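By way of non-limiting illustration, the following minimal sketch extracts Q_gmax, Q_Lmin, and Q_gmin from a GLPC given as sampled points. The function name `glpc_targets` and the example curve are hypothetical, and the use of linear interpolation between sampled points to locate Q_gmin is an assumption, as the manner of locating that point on the curve is not limited to any particular technique.

```python
def glpc_targets(gas_rates, liquid_rates, gain_threshold=10.0):
    """Return (q_gmin, q_gmax, q_lmin) from a sampled GLPC.

    gas_rates:    increasing gas lift injection rates (e.g., MMscf/D).
    liquid_rates: liquid production rate (STB/D) at each injection rate.
    """
    # Q_gmax: injection rate at the highest liquid production rate.
    peak = max(range(len(liquid_rates)), key=liquid_rates.__getitem__)
    q_gmax = gas_rates[peak]
    # Q_Lmin: highest liquid rate minus the liquid production gain threshold.
    q_lmin = liquid_rates[peak] - gain_threshold
    # Q_gmin: walk back from the peak to the segment that crosses Q_Lmin and
    # interpolate linearly within it (interpolation is an assumption).
    for i in range(peak, 0, -1):
        lo_q, hi_q = liquid_rates[i - 1], liquid_rates[i]
        if lo_q <= q_lmin <= hi_q:
            fraction = (q_lmin - lo_q) / (hi_q - lo_q) if hi_q > lo_q else 1.0
            q_gmin = gas_rates[i - 1] + fraction * (gas_rates[i] - gas_rates[i - 1])
            return q_gmin, q_gmax, q_lmin
    # The curve never falls to Q_Lmin to the left of the peak.
    return gas_rates[0], q_gmax, q_lmin


# Hypothetical GLPC: the curve flattens near its peak of 532 STB/D.
gas = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
liquid = [400.0, 480.0, 520.0, 530.0, 532.0, 528.0]
q_gmin, q_gmax, q_lmin = glpc_targets(gas, liquid)
print(q_gmin, q_gmax, q_lmin)
```

For this hypothetical curve, the peak lies at 2.0 MMscf/D, Q_Lmin is 10 STB/D below the peak, and Q_gmin falls on the segment between 1.0 and 1.5 MMscf/D where the curve crosses Q_Lmin.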

Other criteria may also be applied to determine the target values from the GLPCs. For the example embodiment, the criteria also include a threshold value of 10 stock tank barrels per day (STB/D) for the minimum oil production gain. Oil production gain refers to the increase in the amount of oil produced from a well, often influenced by factors like increased production choke, higher pump frequency (if using a pump as artificial lift), or, for present embodiments, optimization of gas lift injection rate. A minimum oil production gain threshold value of 10 STB/D serves as an economic benchmark for employing gas lift injection. In other words, the oil production of the well is compared when operating without gas lift injection and with gas lift injection at a gas lift injection rate of Q_gmin 200. If the oil production gain falls below the minimum oil production gain threshold value of 10 STB/D, Q_gmin is set to zero. This indicates that the gas lift injection fails to produce an economic improvement in oil production, and as such, gas lift is not applicable to the well. Consequently, the well can continue to flow without gas lift injection and any allocated gas lift resources can be redirected to other wells, which will yield greater overall oil production gain. As such, the minimum oil production gain threshold value can be used to extract the final target by determining the applicability of gas lift injection. Depending on surface parameters and subsurface conditions in the collected data, there are generated samples representing situations in which gas lift injection is not applicable. When the increase in oil production achieved by using gas lift injection is less than the minimum oil production gain threshold value, it is recommended not to inject gas lift, and Q_gmin is set to zero. It is presently recognized that the applicability of gas lift injection target requires a different methodology for building the predictive model compared to the other three targets.
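By way of non-limiting illustration, the following minimal sketch applies the minimum oil production gain criterion to label a sample. The function names are hypothetical, and computing the oil rate from the liquid rate and the water cut is an assumption made here so that the example is self-contained.

```python
def oil_rate(liquid_rate_stb_d, water_cut_pct):
    """Oil rate (STB/D) as the non-water fraction of the liquid rate."""
    return liquid_rate_stb_d * (1.0 - water_cut_pct / 100.0)


def apply_oil_gain_criterion(oil_no_lift, oil_at_q_gmin, q_gmin, min_oil_gain=10.0):
    """Return (gas_lift_applicable, q_gmin) after the economic check."""
    oil_gain = oil_at_q_gmin - oil_no_lift
    if oil_gain < min_oil_gain:
        # Gain below the threshold: gas lift is not applicable, Q_gmin is zero.
        return False, 0.0
    return True, q_gmin


# Hypothetical high-water-cut well: 60 STB/D more liquid is only 6 STB/D more oil.
label_a = apply_oil_gain_criterion(oil_rate(500.0, 90.0), oil_rate(560.0, 90.0), q_gmin=0.9)
# Hypothetical lower-water-cut well: the oil gain is 40 STB/D.
label_b = apply_oil_gain_criterion(oil_rate(400.0, 80.0), oil_rate(600.0, 80.0), q_gmin=0.9)
print(label_a, label_b)
```

The Boolean returned by the check corresponds to the categorical applicability target, while the possibly zeroed Q_gmin remains a continuous target.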

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 148 of the controller 116 generating engineered features from features of the well data, the reservoir data, and the historical production data. Those skilled in the art will appreciate that machine-learning models process independent, informative, and relevant features to predict targets, and a machine-learning model can only be as good (e.g., useful, accurate) as the features that are supplied to the model. In some embodiments, in addition to the raw features that are present within the well data, the reservoir data, and the historical production data collected in step 142, additional engineered features may be defined that represent transformations or mathematical combinations of one or more raw features. For the example embodiment, the raw features used in this data-driven workflow (either directly or to define engineered features) include, but are not limited to: gas-liquid ratio (GLR) (scf/STB), gas-oil ratio (GOR) (scf/STB), gas lift injection pressure (psi), gas lift injection rate (Mscf/D), gas production rate (Mscf/D), liquid production rate (STB/D), reservoir pressure (psi), water cut (%), wellhead pressure (psi), measured depth (Depth-MD) (ft), true vertical depth (Depth-TVD) (ft), oil density (° API), point of injection (POI) (ft), permeability (md), reservoir temperature (° F.), reservoir thickness (ft), tubing inner diameter (ID) (inch), well inclination (deg), and wellbore radius (inch).

For the example embodiment, new engineered features are created by transforming the raw features. These engineered features aid in improving the relevance of the features supplied to the machine-learning model, thereby improving the machine-learning performance. Table 4 illustrates the list of newly engineered features with the specific formula used to create the features for the example embodiment. It should be appreciated that the list of engineered features is merely provided as an example, and in other embodiments, different engineered features may be defined, in accordance with the present approach.

TABLE 4

Engineered features used in the example embodiment.

Engineered Feature	Units	Equation
Engineered Feature 1	(scf/STB)²	Gas Liquid Ratio (GLR) × Gas Oil Ratio (GOR)
Engineered Feature 2	scf/STB	q_{g,inj} ÷ q_l
Engineered Feature 3	(Mscf/D)²/psi	q_g × q_{g,inj} ÷ p_inj
Engineered Feature 4	%	100 − Water Cut
Engineered Feature 5	STB/D	q_o × p_inj ÷ p_wh
Engineered Feature 6	STB/STB	q_w ÷ q_o
Engineered Feature 7	STB/D	p_wh × q_l ÷ p_inj
Engineered Feature 8	md/°API	k ÷ ρ_o
Engineered Feature 9	ft-in/deg	h × D_i ÷ Well Inclination
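By way of non-limiting illustration, the following minimal sketch computes the nine engineered features of Table 4 for one sample. The symbol readings are assumptions inferred from the listed units and raw features: q_{g,inj} as gas lift injection rate, q_g as gas production rate, q_l, q_o, and q_w as liquid, oil, and water rates, p_inj and p_wh as injection and wellhead pressures, k as permeability, ρ_o as oil density, h as reservoir thickness, and D_i as tubing inner diameter. The dictionary keys, the derivation of q_o and q_w from water cut, the Mscf-to-scf conversion in Engineered Feature 2, and the production values in the example are likewise hypothetical.

```python
def engineered_features(raw):
    """Compute the Table 4 features from a dict of raw feature values."""
    q_l = raw["liquid_rate_stb_d"]
    q_w = q_l * raw["water_cut_pct"] / 100.0  # assumed: water rate from water cut
    q_o = q_l - q_w                           # assumed: oil rate is the remainder
    q_ginj = raw["inj_rate_mscf_d"]
    p_inj = raw["inj_pressure_psi"]
    p_wh = raw["wellhead_pressure_psi"]
    return {
        "ef1": raw["glr_scf_stb"] * raw["gor_scf_stb"],   # (scf/STB)^2
        "ef2": q_ginj * 1000.0 / q_l,                     # scf/STB (Mscf -> scf assumed)
        "ef3": raw["gas_rate_mscf_d"] * q_ginj / p_inj,   # (Mscf/D)^2/psi
        "ef4": 100.0 - raw["water_cut_pct"],              # %
        "ef5": q_o * p_inj / p_wh,                        # STB/D
        "ef6": q_w / q_o,                                 # STB/STB
        "ef7": p_wh * q_l / p_inj,                        # STB/D
        "ef8": raw["permeability_md"] / raw["oil_density_api"],  # md/API
        "ef9": raw["thickness_ft"] * raw["tubing_id_in"] / raw["inclination_deg"],
    }


# Well and reservoir values from the DS-1 rows of Tables 1 and 2;
# the production values are hypothetical.
sample = {
    "glr_scf_stb": 300.0, "gor_scf_stb": 1500.0,
    "inj_rate_mscf_d": 800.0, "gas_rate_mscf_d": 1200.0,
    "liquid_rate_stb_d": 1000.0, "water_cut_pct": 80.0,
    "inj_pressure_psi": 1200.0, "wellhead_pressure_psi": 200.0,
    "permeability_md": 466.0, "oil_density_api": 37.0,
    "thickness_ft": 56.0, "tubing_id_in": 2.992, "inclination_deg": 19.6,
}
features = engineered_features(sample)
print(features)
```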

For the example embodiment, as noted herein, there are five datasets (DS-1 through DS-5), each containing 1000 samples, resulting in a total dataset of 5000 samples. Each sample represents a reservoir, well, and production scenario resulting in a specific GLPC. For each sample, nodal analysis and GLPCs are created using commercial software. A combination of VBA, or another suitable scripting language, and the commercial software facilitates the automatic and efficient generation of nodal analysis, gas lift performance curves, and the target determinations. For the example embodiment, reservoir properties, fluid properties, and well properties are used for the generation. Additionally, as noted, there are two tasks for the example embodiment. The first task is to predict the applicability of gas lift injection, which entails determining whether or not gas lift injection is recommended. The second task is to predict Q_gmin, Q_gmax, and Q_Lmin. These two tasks will be approached using different methods: classification for the first task and regression for the second task.

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 150 of the controller 116 selecting and scaling features from the well data, the reservoir data, the historical production data, and the engineered features for use in training and evaluating the machine-learning models. For the example embodiment, each dataset contains 28 features, 19 of which are raw features, and the remaining 9 are engineered features that were identified after extensive numerical experimentation. It is presently recognized that these newly formulated and validated engineered features are important to the successful deployment of certain embodiments of the disclosed technique. All 28 features are listed in Table 6 and Table 7. Using a data analytics process, these 28 features are evaluated and filtered down to a smaller number of features to build the most generalizable machine-learning model for the production-optimization task. Since all of these features have different ranges in the example embodiment, scaling is applied to certain features. In feature selection, steps are performed to identify features that have a strong association with the targets and those having strong collinearity with other features. After these steps, for the example embodiment, 18 features were selected for identifying the applicability of gas lift injection, 19 features were selected for predicting the minimum gas lift injection rate (Q_gmin), 18 features were selected for predicting the maximum gas lift injection rate (Q_gmax), and 14 features were selected for predicting the optimal liquid production rate (Q_Lmin). In some embodiments, feature selection may involve removing features having low mutual information (MI) scores or low F scores, removing highly collinear features using Pearson correlation, and removing features that involve or require well intervention jobs. The selected features were identified after extensive numerical experimentation.
It is presently recognized that these selected features are important to the successful deployment of certain embodiments of the disclosed technique. It is further presently recognized that both feature engineering and feature selection help improve the generalizability of the predictions of the machine-learning models.

For the example embodiment, the 28 features were scaled before being analyzed for potential selection. Scaling is a process of standardizing the ranges of various features to ensure the machine-learning models are not unduly biased by certain features. The tool used for the example embodiment was StandardScaler, which standardizes the features by transforming the mean to zero and the standard deviation to a value of 1. After scaling, certain of the 28 features were selected based on the results of association tests, namely ANOVA, which yields F scores, and mutual information (MI) scores. Both indicate the strength of the relationship between the features and the targets. In ANOVA, the F score is the ratio of the variance between groups to the variance within groups. The F score quantifies a linear association between a feature and a target. A high F score indicates that a feature has a strong linear relationship with a target. MI scores indicate the amount of information obtained about one variable from another variable and evaluate the relevance of each feature to the target. MI scores can capture nonlinear associations between features and targets. A high MI value indicates that a feature has a strong, potentially non-linear, relationship with a target.
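By way of non-limiting illustration, the scaling and F score computations described above may be sketched as follows. This is a minimal sketch rather than the implementation of the example embodiment: the function names and the sample values are hypothetical, the standardization mirrors the zero-mean, unit-standard-deviation transform described for StandardScaler, and the F score is the textbook one-way ANOVA ratio for a binary target.

```python
from statistics import fmean, pstdev

def standardize(values):
    """Rescale a feature to zero mean and unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def anova_f_score(values, labels):
    """One-way ANOVA F score: between-group variance over within-group variance."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    grand_mean = fmean(values)
    n, k = len(values), len(groups)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical values, for illustration only (not data from the example embodiment).
gas_rate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gas_lift_applicable = [0, 0, 0, 1, 1, 1]

scaled = standardize(gas_rate)
f_score = anova_f_score(scaled, gas_lift_applicable)
```

Because the F score is a ratio of variances, it is unchanged by the standardization step; scaling matters for the downstream models rather than for this particular test.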

For the example embodiment, the most strongly associated continuous and categorical features were determined for each target using ANOVA and mutual information (MI) values. For the applicability of gas lift injection target, the strongest associated features were determined to be the gas production rate and Engineered Feature 8. For the minimum gas lift injection rate target (Q_gmin), the strongest associated features were determined to be gas production rate and reservoir thickness. For the maximum gas lift injection rate target (Q_gmax), the strongest associated features were determined to be Engineered Feature 5 and reservoir thickness. For the optimal liquid production rate target (Q_Lmin), the strongest associated features were determined to be liquid production rate and Engineered Feature 9. Meanwhile, the least associated continuous and categorical features are the same across all targets: gas lift injection rate and tubing ID, respectively.

For the example embodiment, some of the 28 features were eliminated for having high co-linearity. The Pearson correlation coefficient was used to analyze co-linearity, as it identifies the co-linearity between features that may undesirably lead to model non-uniqueness and exacerbate the curse of dimensionality. Table 5 shows features having high co-linearity with other features. Based on these results, the features water cut, Depth-TVD, and GLR_inj were selected, while oil cut, Depth-MD, and GLR were not selected.

TABLE 5

List of features which have high Pearson correlation coefficient.

Features 1	Features 2	Pearson Correlation Coefficient

Water cut	Oil cut	1.00
Depth-TVD	Depth-MD	0.99
Gas lift injected liquid ratio (GLR_inj)	Gas liquid ratio (GLR)	0.91
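As an illustrative sketch of the co-linearity screening described above, the following computes pairwise Pearson correlation coefficients and keeps only one feature from each highly correlated pair. The threshold of 0.9, the comparison on absolute values, the rule of keeping the first-listed feature, and all sample values are assumptions made for illustration; they are not stated for the example embodiment.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_collinear(features, threshold=0.9):
    """Keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values; oil cut is constructed as 100 minus water cut.
features = {
    "water_cut": [10.0, 35.0, 50.0, 80.0, 95.0],
    "oil_cut": [90.0, 65.0, 50.0, 20.0, 5.0],
    "wellhead_pressure": [220.0, 150.0, 310.0, 180.0, 260.0],
}
selected = drop_collinear(features)
```

In this sketch the order of the dictionary determines which member of a correlated pair survives, so listing water cut before oil cut reproduces the choice reported for the example embodiment.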

It is presently recognized that the association and collinearity tests should be performed separately for categorical and continuous features, depending on whether the target is continuous or categorical. It is presently recognized that, when continuous and categorical features are not separated, the continuous features will overly bias the feature selection as compared to the categorical features, which will mistakenly not be selected because of their low F score and MI value. Also, the types of tests performed for continuous features are different from those performed for categorical features. For the example embodiment, the final list of selected features is presented in FIGS. 3A-3H, along with their respective F score and MI value for each task. It may be appreciated that the index values indicated in FIGS. 3A-3H refer to features as indicated in Table 6 (which presents continuous features) and Table 7 (which presents categorical features). FIGS. 3A, 3C, 3E, and 3G present the selected continuous features, while FIGS. 3B, 3D, 3F, and 3H present the categorical features. Furthermore, FIGS. 3A and 3B correspond to the gas lift injection applicability classification task, FIGS. 3C and 3D correspond to the Q_gmin regression task, FIGS. 3E and 3F correspond to the Q_gmax regression task, and FIGS. 3G and 3H correspond to the Q_Lmin regression task.

For the example embodiment, based on the results illustrated in FIGS. 3A-3H, features were selected for use in training and evaluating the machine-learning models. More specifically, for the applicability of gas lift injection classification task, the selected continuous features indicated in FIG. 3A include indices 1, 2, 7, 8, 9, 11, 13, 14, and 16, and the selected categorical features indicated in FIG. 3B include indices 18, 19, 21, 22, 24, 25, 26, 27, and 28. For the Q_gmin regression task, the selected continuous features indicated in FIG. 3C include indices 3, 4, 7, 8, 9, 11, 13, 14, 15, and 16, and the selected categorical features indicated in FIG. 3D include indices 18, 19, 21, 22, 24, 25, 26, 27, and 28. For the Q_gmax regression task, the selected continuous features indicated in FIG. 3E include indices 3, 7, 8, 9, 11, 13, 14, 15, and 16, and the selected categorical features indicated in FIG. 3F include indices 18, 19, 21, 22, 24, 25, 26, 27, and 28. For the Q_Lmin regression task, the selected continuous features indicated in FIG. 3G include indices 3, 7, 9, 11, and 16, and the selected categorical features indicated in FIG. 3H include indices 18, 19, 21, 22, 24, 25, 26, 27, and 28. It may be appreciated that, for oil wells that are sufficiently similar to the oil well of the example embodiment, the same features may be engineered, selected, and scaled as described above, which may enable steps 148 and 150 of the method 140 to be skipped, further streamlining the method 140.

TABLE 6

Continuous features and corresponding indices
used to identify the features in FIG. 3.

Index	Feature

1	GLR (scf/STB)
2	Engineered Feature 1 ((scf/STB)²)
3	Engineered Feature 2 (scf/STB)
4	GOR (scf/STB)
5	Gas Lift Injection Pressure (psi)
6	Gas Lift Injection Rate (MMscf/D)
7	Gas Production Rate (Mscf/D)
8	Engineered Feature 3 ((Mscf/D)²/psi)
9	Liquid Production Rate (STB/D)
10	Engineered Feature 4 (%)
11	Engineered Feature 5 (STB/D)
12	Reservoir Pressure (psi)
13	Engineered Feature 6 (STB/STB)
14	Water Cut (%)
15	Wellhead Pressure (psi)
16	Engineered Feature 7 (STB/D)

TABLE 7

Categorical features and corresponding indices
used to identify the features in FIG. 3.

Index	Feature

17	Depth-MD (ft)
18	Depth-TVD (ft)
19	Oil Density (°API)
20	Point of Injection (ft)
21	Permeability (md)
22	Engineered Feature 8 (md/°API)
23	Reservoir Temperature (° F.)
24	Reservoir Thickness (ft)
25	Engineered Feature 9 (ft-in/deg)
26	Tubing ID (inch)
27	Well Inclination (deg)
28	Wellbore Radius (inch)

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 152 of the controller 116 constructing classification and regression models having the selected features. For the example embodiment, a logistic regression model was selected for the classification task. Logistic regression is a statistical method used for modeling the probability (between 0 and 1) of a binary target. Logistic regression is similar to linear regression but instead uses a sigmoid function or logistic function to ensure that the predicted probabilities are between 0 and 1. The logistic regression then classifies an instance into one of the two classes based on a predefined decision threshold value. If the predicted probability is greater than the predefined threshold value, the instance is classified into one class, and if it is lower than the threshold, the instance is classified into the other class. Common hyperparameters used for logistic regression models include L1_ratio and C.
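The sigmoid mapping and thresholding described above may be sketched as follows. The weights, bias, feature values, the threshold of 0.5, and the convention that class 1 denotes "gas lift injection applicable" are all hypothetical choices made for illustration; they are not the trained parameters of the example embodiment.

```python
from math import exp

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))

def predict_applicability(features, weights, bias, threshold=0.5):
    """Return the predicted probability and the resulting binary class."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    return probability, int(probability > threshold)

# Hypothetical model parameters and scaled feature values.
weights = [1.2, -0.8]
bias = 0.1
prob, label = predict_applicability([0.5, -1.0], weights, bias)
```

Raising or lowering the threshold trades false positives against false negatives without retraining the model.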

For the regression tasks, five different regression techniques were used and evaluated in the example embodiment, including Elastic Net, K-Nearest Neighbor (KNN), Support Vector regression (SVR), Random Forest, and Gradient Boosting.

Elastic Net regression uses two regularization methods, namely Lasso regression (L1) and Ridge regression (L2), to reduce the dimensionality of the feature space and improve the model generalization by reducing overfitting. Common hyperparameters used for Elastic Net regression models include Alpha and L1_ratio.

K-Nearest Neighbor (KNN) regression predicts the target value for a new sample by considering the “k” nearest samples. The predicted value for the new sample is calculated by taking the average of the target values of its K nearest neighbors in the training set. Common hyperparameters used for KNN regression models include N_neighbors and Weights.

Support Vector regression (SVR) finds the optimal hypertube that best fits the training samples while minimizing errors. Initially, it transforms the input data into a higher-dimensional space using the kernel technique. SVR then identifies a hypertube that best fits the sample points with the smallest margin of error, balancing the trade-off between the margin violations and the model complexity through regularization. Finally, SVR predicts targets by evaluating the distance between the new data points and the hypertube. Common hyperparameters used for SVR models include C, gamma, and degree.

Random Forest regression uses an ensemble of independent decision trees, trained in parallel. Each decision tree is trained independently on the data and created by recursively partitioning the feature space into smaller regions based on the selected features. These decision trees will be used to predict the target by averaging the predictions of all individual trees in the ensemble. Common hyperparameters used for Random Forest regression models include Max_depth, Max_features, and Max_samples.

Gradient Boosting also uses an ensemble of decision trees, but the trees are trained sequentially, with each tree learning to fix the mistakes of its predecessors. Each new tree is trained to correct the errors made by the existing ensemble. This iterative learning process gradually improves the performance of the ensemble. Common hyperparameters used for Gradient Boosting regression models include N_estimators, Learning_rate, and Max_depth.
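To illustrate the sequential error-correcting behavior of gradient boosting described above, the following is a deliberately simplified sketch that uses one-split trees (stumps) on a single feature with a squared-error loss. It is not the regressor of the example embodiment; the function names and sample values are hypothetical, and the sketch omits the Max_depth hyperparameter because every tree has a single split.

```python
from statistics import fmean

def fit_stump(x, residuals):
    """Find the single split on x that minimizes squared error on the residuals."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        split = (lo + hi) / 2
        left = [r for v, r in zip(x, residuals) if v <= split]
        right = [r for v, r in zip(x, residuals) if v > split]
        lm, rm = fmean(left), fmean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    _, split, lm, rm = best
    return lambda v: lm if v <= split else rm

def fit_gradient_boosting(x, y, n_estimators=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = fmean(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_estimators):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump(v) for p, v in zip(pred, x)]
    return lambda v: base + learning_rate * sum(s(v) for s in stumps)

def mse(model, x, y):
    """Mean squared error of a fitted model on the given samples."""
    return fmean([(model(v) - t) ** 2 for v, t in zip(x, y)])

# Hypothetical single-feature data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.2, 5.8, 7.1]
few = fit_gradient_boosting(x, y, n_estimators=5)
many = fit_gradient_boosting(x, y, n_estimators=200)
```

Comparing `mse(few, x, y)` with `mse(many, x, y)` shows the training error shrinking as trees are added, which is why N_estimators and Learning_rate are tuned together.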

For present embodiments, hyperparameters may be defined by the model developer to govern the model training. The learning of the model and the model complexity are determined by the hyperparameters. Generally, grid-search with cross-validation is performed to find the optimal hyperparameters of a model based on the technique used, available dataset, and the predictive task at hand. Hyperparameters govern the parameters learned by the model/estimator. The search for optimal hyperparameters is referred to as hyperparameter tuning. As illustrated in Table 8, each hyperparameter for a given regressor technique is defined, along with its range. The fine-tuning process aims to identify the optimal combination of hyperparameters, as indicated in the “Optimal Value” columns, by searching within the user-defined range for each hyperparameter. These optimal values contribute to achieving the best generalization performance of the regressor or classifier.

TABLE 8

Range of grid search for hyperparameter tuning and optimal value of hyperparameters for the classifier and regressors used in the example embodiment.

Task	Technique	Hyperparameter	Range	Optimal Value (Binary)	Optimal Value (Q_gmin)	Optimal Value (Q_gmax)	Optimal Value (Q_Lmin)

Classification	Logistic Regression	L1_ratio	0.001-0.7	0.5	—	—	—
Classification	Logistic Regression	C	0.005-1	0.1	—	—	—
Regression	Random Forest	Max_depth	10-30	—	30	30	30
Regression	Random Forest	Max_features	0.7-1	—	0.7	0.8	0.7
Regression	Random Forest	Max_samples	0.6-1	—	0.8	0.8	0.8
Regression	Gradient Boosting	N_estimators	50-200	—	200	200	200
Regression	Gradient Boosting	Learning_rate	0.05-0.2	—	0.1	0.2	0.2
Regression	Gradient Boosting	Max_depth	3-5	—	5	4	3
Regression	K-Nearest Neighbors	N_neighbors	3-7	—	7	7	5
Regression	K-Nearest Neighbors	Weights	uniform, distance	—	distance	distance	distance
Regression	SVR (RBF)	C	0.01, 1	—	1	1	1
Regression	SVR (RBF)	Gamma	0.01, 1	—	0.01	0.01	0.01
Regression	SVR (RBF)	Degree	2, 3	—	2	2	2
Regression	Elastic Net	Alpha	0.1-10	—	0.1	0.1	0.1
Regression	Elastic Net	L1_ratio	0.1-0.9	—	0.1	0.1	0.9
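Grid-search with cross-validation, as described above, may be sketched as follows using a small K-Nearest Neighbor regressor on a single feature. The candidate values 3, 5, and 7 for N_neighbors are an assumed sampling of the 3-7 range in Table 8, and the five folds, the MAE scoring, the helper names, and the synthetic data are likewise assumptions made for illustration rather than details of the example embodiment.

```python
from itertools import product
from statistics import fmean

def knn_predict(train_x, train_y, query, n_neighbors, weights):
    """Average the targets of the nearest training samples (single feature)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:n_neighbors]
    if weights == "uniform":
        return fmean([t for _, t in nearest])
    # "distance": closer neighbors count more; an exact match returns its own target.
    weighted = []
    for v, t in nearest:
        d = abs(v - query)
        if d == 0:
            return t
        weighted.append((1.0 / d, t))
    return sum(w * t for w, t in weighted) / sum(w for w, _ in weighted)

def cross_val_mae(x, y, params, n_folds=5):
    """Mean absolute error over held-out folds for one hyperparameter combination."""
    errors = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(x), n_folds))
        train_x = [v for i, v in enumerate(x) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        for i in test_idx:
            pred = knn_predict(train_x, train_y, x[i], **params)
            errors.append(abs(pred - y[i]))
    return fmean(errors)

def grid_search(x, y, grid, n_folds=5):
    """Score every combination in the grid and return the best (score, params)."""
    names = list(grid)
    candidates = [dict(zip(names, combo)) for combo in product(*grid.values())]
    scored = [(cross_val_mae(x, y, p, n_folds), p) for p in candidates]
    return min(scored, key=lambda s: s[0])

grid = {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
x = [i / 2 for i in range(40)]        # synthetic feature values
y = [0.5 * v * v for v in x]          # synthetic target
best_mae, best_params = grid_search(x, y, grid)
```

The same loop structure applies to the other techniques in Table 8; only the estimator and the grid change.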

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 154 of the controller 116 using a training portion of the datasets and the corresponding calculated target values to train the classification and regression models. For the example embodiment, the 5000 samples of the datasets generated in step 144 are divided into a training portion (representing about 85% of the samples) and a testing portion (representing about 15% of the samples). In general, during training, the values of each sample of the training portion of the datasets are provided as inputs to the selected features for a given machine-learning model, and the model is trained to output the corresponding target values (e.g., applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, a target optimal liquid production rate) determined in step 146, with minimal error, in response to the inputs for that sample.
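The division into training and testing portions may be sketched as follows. The random shuffle, the fixed seed, and the function name are assumptions made for illustration; the example embodiment specifies only the approximate 85%/15% proportions.

```python
import random

def train_test_split(samples, test_fraction=0.15, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for the 5000 generated samples (indices only).
samples = list(range(5000))
train, test = train_test_split(samples)
```

With 5000 samples, this yields 4250 training samples and 750 testing samples.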

For the embodiment illustrated in FIG. 1B, the method 140 continues with the step 156 of the controller 116 using a testing portion of the datasets and the corresponding target values to evaluate the trained classification and regression models, and then selecting the best performing trained classification model 158 and the best performing trained regression models 160 for deployment. In general, during testing, the values of each sample of the testing portion of the datasets are provided as inputs to the selected features for a given trained machine-learning model, and the model output is subsequently compared to the corresponding target values (e.g., applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, and a target optimal liquid production rate) calculated in step 146 for each sample to evaluate performance of each trained machine-learning model.

Various metrics may be used to assess the performance of a trained model during testing. For the example embodiment, the metrics included F1 score, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). F1 score measures the harmonic mean of precision and recall, which is used in evaluating the classification task. MAE measures the average absolute difference between predicted and actual values. This metric is used in evaluating the regression task. MAPE measures the average absolute percentage difference between predicted and actual values, and this metric is applied to evaluate regression tasks. R² indicates the effectiveness of the regression model by measuring the percentage of the dependent variable's variance that is explained by the independent variables.
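The four metrics described above follow standard definitions and may be sketched as follows. The function names and sample values are hypothetical, MAPE is expressed as a fraction (consistent with values such as 0.06) rather than as a percentage, and the sketch assumes at least one predicted and one actual positive for F1 and nonzero actual values for MAPE.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mae(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Average absolute difference relative to the actual value, as a fraction."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Share of the variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical labels and values, for illustration only.
f1 = f1_score([1, 1, 0, 1], [1, 0, 0, 1])
actual, predicted = [1.0, 2.0, 3.0], [1.0, 2.0, 4.0]
errors = (mae(actual, predicted), mape(actual, predicted), r_squared(actual, predicted))
```

MAE carries the units of the target (for example, MMscf/D or STB/D), whereas MAPE and R² are dimensionless, which is why the three are reported together for the regression tasks.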

For the example embodiment, the trained logistic regression model was evaluated for its ability to perform the classification task to identify the applicability of gas lift injection to a well based on the surface parameters and well data. From the total dataset, 4250 samples were used for training the logistic regression model, and the remaining 750 samples were used for testing. As discussed herein, optimal hyperparameters for the logistic regression model were used to ensure the model learns the most generalizable patterns from the training data. The generalization performance of this model was evaluated on the 750 test samples. F1 score is used as the metric for the evaluation of the generalization performance. As shown in Table 9, the F1 score of the logistic regression model for the classification task is 0.969, which is very high. With this result, the trained logistic regression model can be used to provide recommendations as to whether or not gas lift injection is applicable to a well. For example, armed with this information, a production engineer or field operation team can allocate gas lift resources to other wells that need gas lift in order to improve the oil production.

For the example embodiment, the trained regression models were evaluated for their ability to perform the regression tasks of determining an optimal range of gas lift injection values, extending from a minimum gas lift injection rate (Q_gmin) to a maximum gas lift injection rate (Q_gmax), as well as an optimal liquid production rate (Q_Lmin). Table 8 presents the optimal hyperparameters of each model for each target. The numbers of training and testing samples used in these tasks were 4209 and 743, respectively. These numbers differ from those of the classification task because the samples that do not require gas lift injection were removed. The best-performing model for each target is identified by comparing the generalization performances of various models based on the MAE, MAPE, and R² values. Table 9 indicates that, for the example embodiment, gradient boosting is the best performing model for predicting Q_gmin and Q_gmax, and random forest is the best performing model for predicting Q_Lmin.

TABLE 9

Summarizing the top-performing model for the classification and regression tasks in the data-driven workflow for the gas lift injection optimization. F1 score is employed for evaluating the classification models, while regression models are assessed based on MAE, MAPE, and R².

Task	Best Model	F1 Score	MAE	MAPE	R²

Applicability of Gas Lift Injection Rate	Logistic Regression	0.969	—	—	—
Minimum Gas Lift Injection Rate	Gradient Boosting	—	0.11 MMscf/D	0.06	0.940
Maximum Gas Lift Injection Rate	Gradient Boosting	—	0.10 MMscf/D	0.04	0.970
Optimal Liquid Production Rate	Random Forest	—	37.57 STB/D	0.02	0.998

For the example embodiment, the minimum gas lift injection rate prediction of the gradient boosting regression model resulted in a MAE of 0.11 MMscf/D and a MAPE of 0.06 with an R² of 0.94. For the maximum gas lift injection rate prediction of the gradient boosting regression model, MAE was 0.10 MMscf/D and MAPE was 0.04 with an R² of 0.97. These two predictive models can provide recommendations to a petroleum engineer or field operation team about how they should adjust the gas lift injection rate of a well to achieve the optimal oil production rate. In addition, the third model, which also performs very well based on those metrics, can give information about the well liquid production rate after the gas lift injection rate is optimized.

The performances of the regression models are presented in FIGS. 4A-4C. More specifically, FIG. 4A illustrates the performance of the trained gradient boosting regression model to predict the minimum gas lift injection rate, FIG. 4B illustrates the performance of the trained gradient boosting regression model to predict the maximum gas lift injection rate, and FIG. 4C illustrates the performance of the trained random forest regression model to predict the optimal liquid production rate. Each of FIGS. 4A-4C includes a cross-plot with the model predictions on the y-axis and the actual data on the x-axis for each target. Each of FIGS. 4A-4C includes a dashed line 400 that illustrates perfect prediction, while each of the testing samples is represented as a dot. From these results, it is observed that almost all dots coincide with the dashed line, which confirms the high generalization performance for each target.

Computational Time

Table 10 compares the time expenditure for traditional gas lift optimization techniques to the time expenditure when using the gas lift injection optimization techniques disclosed herein. In the traditional approach, the most time-consuming step is running the bottom-hole pressure (BHP) survey. This survey involves recording the bottom hole pressure while the well is flowing, then closing the well until the pressure at the bottom hole stabilizes, and finally recording the pressure when the well is shut-in. It is important to stabilize the bottom hole condition before running the survey during well shut-in because the pressure data will reflect the updated reservoir pressure. These flowing and shut-in bottom hole pressures are used to perform nodal analysis, which determines the optimal range of the gas lift injection rate.

In contrast, embodiments of the disclosed data-driven method are substantially faster than the traditional approach. The data generation is the most time-consuming step. For the example embodiment, this step generates 1000 samples with various gas lift injection rates per dataset, resulting in 1000 gas lift performance curves per dataset. Each curve is then evaluated and used as part of the dataset to build the machine-learning model that determines the range of optimal gas lift injection rate. As indicated by Table 10, present embodiments enable a substantial reduction in time expenditure, which substantially reduces costs and production losses as compared to traditional techniques.

TABLE 10

Comparison of the computational times of the traditional approach versus the disclosed data-driven method for gas lift optimization, in one embodiment. Once trained, the data-driven workflow can be applied to rapidly predict the optimal gas lift injection rate for a well.

Task	Traditional Step	Traditional Time	Data-Driven Step	Data-Driven Time

Data	Run bottom-hole pressure survey	30-32 hours for 1 well	Data Generation	6 hours for 1000 well scenarios
Analysis	Analyze flowing and shut-in bottom hole pressure data	10-20 min	Build ML model, hyperparameter tuning, and training time	Approx. 1 hour
	Generate nodal analysis	30-50 min		
Estimation	Determine range of optimal gas lift injection rate	10-20 min per well	Determine range of optimal gas lift injection rate	Approx. 1 second for 1000 wells

Reducing the Training Dataset Size

For the example embodiment discussed above, model evaluation verified that excellent generalization performance was enabled when the dataset included 5000 total samples representing different well conditions and parameters. An additional study was performed to determine the performance of the machine-learning models when a smaller dataset with fewer samples was used to train the models. For this study, the training dataset size was reduced from 5000 to about 500 samples for one embodiment, and then further reduced to 100 samples for another embodiment. The predictive performance of the machine-learning models was evaluated for each of the training dataset sizes. FIG. 5A illustrates the effect, in terms of F1 score and associated uncertainty of the F1 score, of reduced dataset size on the classification task of identifying the applicability of a gas lift injection for a well. Observing the F1 score, the reduction in dataset size results in no significant difference in the predictive performance. This is also supported by the uncertainty of the F1 score, which remains low for each dataset. FIG. 5B illustrates the effect, in terms of MAPE, of reducing the dataset size on the regression tasks of estimating the three targets (Q_gmin, Q_gmax, and Q_Lmin). Compared to the embodiment that utilized a dataset size of 500, the embodiment that utilized a dataset size of 200 still enabled a decent MAPE of 0.07 for target Q_gmin. As indicated by FIGS. 5A and 5B, for all three targets, the training dataset size can be lowered to 200 in some embodiments with only limited reduction in the generalization performance. 
Because reducing the training dataset size has a substantial impact on the total time expenditure of the disclosed embodiments, it may be desirable for certain embodiments to utilize a smaller training dataset with fewer samples to enable an even more substantial reduction in time expenditure compared to traditional gas lift optimization techniques. For example, the disclosed computing apparatus or system is capable of generating the 200 samples per dataset in approximately 1 hour, compared to the 6 hours spent generating the 1000 samples per dataset for the example embodiment.

Assumptions

In the example embodiment, the data-driven workflow illustrated in FIG. 1B assumes only a single VLP correlation. Many VLP correlations are built based on relationships between specific parameters, such as well inclination, reservoir permeability, and GOR. Before determining which VLP correlation is suitable for a well, the correlation should be adjusted to match the flowing bottom hole pressure data. The matched VLP correlation will represent the tubing performance relationship when the well is flowing. In some embodiments, each field, and even each well, can be assumed to have a different VLP correlation.

In the example embodiment, it is assumed that the gas lift injection point is located at the deepest gas lift valve, as indicated in the well and reservoir data. However, in certain situations, the gas lift can enter the tubing not only through the deepest gas lift valve, but also through other gas lift valves at the same time. This can occur due to aging of both wells and the gas lift valves installed in the wellbore. Having multiple gas lift injection points can affect the well's nodal analysis and, consequently, the optimal gas lift injection and liquid production rate. As such, in certain embodiments, the workflow can be modified to accommodate situations in which the gas lift is injected through gas lift valves other than the gas lift valve disposed at the deepest depth or through a combination of multiple gas lift valves.

In the example embodiment, another assumption relates to the application of the inflow performance relationship, which utilizes the Darcy-Vogel equation. This equation is utilized to accommodate the flow of more than one phase within the reservoir. One parameter that must be assumed to remain constant in the equation is the skin. However, in some cases, each well may possess a different skin value, and even each reservoir layer in the same well may possess a different skin value. The different skin values could lead to variations in nodal analysis, consequently affecting the GLPC. Moreover, the presence of sand in the wellbore can influence the nodal analysis. As such, in other embodiments, the workflow may be modified to accommodate different skin values to enhance the inflow performance relationship for such cases.

Controller and Deployment

FIG. 6 is a diagrammatic representation of a control system 600 of an oil well. In some examples, the control system 600 includes at least the controller 116. While described herein as a controller, it may be appreciated by those skilled in the art that, in other embodiments, the controller may be or include any suitable computing apparatus or system, such as a desktop, laptop, or tablet computing device. Additionally, while the control system 600 is illustrated and described as including a single controller 116, in some embodiments, the operation of the controller may instead be implemented through use of a plurality of controllers of the control system 600 in signal communication with one another, e.g., distributed, in series, or supervisory to sub-component controllers, among others, as will be understood by those skilled in the art.

The controller 116 of various examples disclosed herein includes one or more processors, such as processor 602, as well as a memory or machine-readable storage medium, such as memory 604. As used herein, a “machine-readable storage medium” may be, for example, any electronic, magnetic, optical, or other physical storage apparatus to contain or store information such as executable instructions, data, and the like. For example, any machine-readable storage medium described herein may be any of random-access memory (RAM), volatile memory, non-volatile memory, flash memory, a storage drive, a hard drive, a solid-state drive, any type of storage disc, and the like, or a combination thereof. The memory 604 stores or includes instructions executable by the processor 602. As used herein, a “processor” includes, for example, one processor or multiple processors included in a single device or distributed across multiple computing devices. The processor 602 may be at least one of a central processing unit (CPU), a semiconductor-based microprocessor, a graphics processing unit (GPU), a field-programmable gate array (FPGA) to retrieve and execute instructions, a real-time processor (RTP), other electronic circuitry suitable for the retrieval and execution of instructions stored on a machine-readable storage medium, or a combination thereof.

For the embodiment illustrated in FIG. 6, the controller 116 is in signal communication with various data sources and various components of the oil well, including a well data source 606, a reservoir data source 608, the one or more sensors 118, the fluid analyzer 120, the gas lift compressor 108, and the gas lift control valve 109. As used herein, “signal communication” refers to electric communication such as hard wiring two components together or wireless communication, as understood by those skilled in the art. For example, wireless communication may be Wi-Fi®, Bluetooth®, ZigBee, or forms of near field communications, as will be understood by those skilled in the art. In addition, signal communication may include one or more intermediate controllers or relays disposed between elements that are in signal communication with one another. In some embodiments, the well data source 606 and the reservoir data source 608 may be a common data source, and the one or more data sources may be located on-site (e.g., at the oil well) or remote (e.g., at an office or data storage facility).

The memory 604 of the controller 116 stores the well data 610 collected from the well data source 606, such as the well inclination, Tubing ID, Casing ID, GLV depths, and the dp Loss Across Valve. The memory 604 also stores the reservoir data 612 collected from the reservoir data source 608, such as the reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density. The memory 604 further stores the production data 614 collected from the communicatively connected components of the oil well, such as the wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and GOR. In some embodiments, the production data 614 includes historical production data, current production data, or a combination of both. In some embodiments, the memory 604 stores instructions of a user interface 615 that are executed by the processor 602 to provide messages and prompts to, and to receive inputs (e.g., confirmations, data) from, a petroleum engineer user of the controller 116 via suitable input/output devices.

In some embodiments, the memory 604 of the controller 116 includes a model creation module 616 storing instructions executed by the processor 602 to facilitate the performance of the data-driven method 140 of FIG. 1B to produce the generated datasets 618 having the training and testing samples, the trained gas lift injection applicability classification model 158, and the trained regression models, including the trained optimal gas lift injection range model 160A and the trained optimal liquid production model 160B. The memory 604 may store any other supporting tools or modules used in the data collection, synthetic dataset generation, nodal analysis and GLPC generation, model training, and/or model evaluation, such as the VBA scripts and the commercial petroleum engineering software described herein. In other embodiments, the model creation module 616 may be executed on a separate computing device to produce the generated datasets 618 and to train the machine-learning models, and then the resulting trained machine-learning models 158, 160A, and 160B are stored in the memory 604 of the controller 116 to provide gas lift injection predictions to facilitate operation of the oil well. In some embodiments, the memory 604 includes a gas lift injection control module 620 storing instructions executed by the processor 602 to use the predictions provided by the trained machine-learning models 158, 160A, and 160B to control operation of the gas lift injection of the oil well.

For example, FIG. 7 is a diagrammatic representation of an embodiment of a method 700 in which the controller 116 uses the trained machine-learning models 158, 160A, and 160B to predict the applicability of gas lift injection to a well, an optimal gas lift injection range, and an optimal liquid production rate, while controlling operation of the oil well. The method 700 is discussed with reference to elements illustrated in FIG. 6. In some embodiments, the method 700 may be stored as instructions of the gas lift injection control module 620 that are executed by the processor 602 to control operation of the gas lift compressor 108 and/or the gas lift control valve 109.

For the embodiment illustrated in FIG. 7, the method 700 begins with the step 702 of the controller 116 receiving well data 610 and reservoir data 612 from the well data source 606 and the reservoir data source 608. In some embodiments, the well data 610 and reservoir data 612 were previously collected when preparing the machine-learning models (e.g., step 142 of the method 140 of FIG. 1B) and step 702 may be skipped. At step 704, the controller 116 receives the current production data (e.g., wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and GOR) from the components of the oil well (e.g., sensor(s) 118, fluid analyzer 120, gas lift compressor 108, gas lift control valve 109). The optional fine-tuning step 706 may be skipped during a first iteration through the method 700, but may optionally be performed during subsequent iterations, as discussed below.

For the embodiment illustrated in FIG. 7, the method 700 continues with the step 708 of the controller 116 providing at least a portion of the well data, the reservoir data, and the current production data as inputs to features of the trained classification model 158 and the trained regression models 160A and 160B. At step 710, in response to providing the inputs, the controller 116 receives, from the trained classification model 158, a binary output indicating whether or not gas lift injection is applicable (or continues to be applicable) for use in the oil well. Responsive to the controller 116 determining, at decision step 712, that the binary output indicates that gas lift injection is not applicable (or no longer applicable) to the oil well, the controller proceeds to step 714 and outputs an indication (e.g., to the petroleum engineer via the user interface 615) that gas lift injection is not applicable (or no longer applicable) to the oil well.

For the embodiment illustrated in FIG. 7, responsive to determining, at decision step 712, that gas lift injection is applicable to the oil well, the controller 116 proceeds to step 716. At step 716, in response to the inputs provided to the trained regression models, the controller 116 receives, from the trained regression models, a predicted optimal range of gas lift injection values (extending from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate), as well as a predicted optimal liquid production rate when the predicted minimum gas lift injection rate is applied. At step 718, the controller 116 provides control signals to the gas lift compressor 108 and/or the gas lift control valve 109 to automatically set the gas lift injection rate to the predicted minimum gas lift injection rate. In some embodiments, the controller 116 may first prompt a petroleum engineer user via the user interface 615 to confirm any changes to the gas lift injection rate, or changes to the gas injection rate that are greater than a predefined threshold value, before automatically implementing the changes. In other embodiments, the method 700 may be performed on a computing apparatus or system that is distinct from the controller 116, and step 718 may instead entail presenting the predicted gas lift injection targets to a petroleum engineer for analysis or implementation. For the illustrated embodiment, the controller 116 may optionally wait a period of time before returning to step 704, in which the controller 116 receives current production data for the oil well operating at the modified gas lift injection rate.
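The decision flow of steps 708 through 718 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the trained models are stood in for by plain callables, and the names `Recommendation`, `recommend`, `apply_recommendation`, `set_injection_rate`, and `confirm` are hypothetical. The sketch assumes that confirmation is requested only when the change in injection rate exceeds a predefined threshold, which is one of the two options described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Features = Dict[str, float]


@dataclass
class Recommendation:
    applicable: bool
    min_rate: Optional[float] = None             # predicted minimum injection rate
    max_rate: Optional[float] = None             # predicted maximum injection rate
    optimal_liquid_rate: Optional[float] = None  # predicted optimal liquid rate
    needs_confirmation: bool = False


def recommend(features: Features,
              classify: Callable[[Features], bool],
              predict_range: Callable[[Features], Tuple[float, float]],
              predict_liquid: Callable[[Features], float],
              current_rate: float,
              confirm_threshold: float) -> Recommendation:
    """Steps 708-716: classify applicability, then predict the targets."""
    if not classify(features):                    # steps 710/712
        return Recommendation(applicable=False)   # step 714: report "not applicable"
    min_rate, max_rate = predict_range(features)  # step 716
    liquid_rate = predict_liquid(features)
    # Ask for engineer confirmation only for changes larger than the threshold.
    needs_confirmation = abs(min_rate - current_rate) > confirm_threshold
    return Recommendation(True, min_rate, max_rate, liquid_rate, needs_confirmation)


def apply_recommendation(rec: Recommendation,
                         set_injection_rate: Callable[[float], None],
                         confirm: Callable[[Recommendation], bool]) -> bool:
    """Step 718: send the predicted minimum rate to the actuator if permitted."""
    if not rec.applicable:
        return False
    if rec.needs_confirmation and not confirm(rec):
        return False
    set_injection_rate(rec.min_rate)
    return True
```

In this sketch, `set_injection_rate` stands in for whatever control signal is sent to the gas lift compressor 108 and/or the gas lift control valve 109, and `confirm` stands in for a prompt presented through the user interface 615. Keeping the prediction step separate from the actuation step also accommodates the embodiments in which the targets are only presented to a petroleum engineer rather than applied automatically.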

In some embodiments, at step 706, the controller 116 may compare the current liquid production rate to the predicted optimal liquid production rate determined at step 716 of the previous iteration of the method 700. In such embodiments, responsive to the controller 116 determining that the current liquid production rate varies from (e.g., is greater than or is less than) the predicted optimal liquid production rate by more than a predefined threshold value, the controller 116 may optionally conduct fine-tuning of the trained regression models. During fine-tuning, the parameters of the regression models 160A and 160B may be adjusted to ensure that the model predictions align with the actual production data. In this manner, the quality of the predictions of the trained regression models may improve over time as the models continue to learn the particular relationships between the gas lift injection rate and the liquid production rate for the oil well throughout operation. It is presently recognized that this fine-tuning results in creation of a more representative GLPC for the well performance or a more regressed VLP correlation, which may improve prediction performance.
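The comparison performed at step 706 can be illustrated with a short sketch. The disclosure does not state whether the predefined threshold is absolute or relative; the function below assumes a relative (fractional) deviation, and the name `needs_fine_tuning` and its parameters are hypothetical.

```python
def needs_fine_tuning(actual_rate: float, predicted_rate: float,
                      threshold: float) -> bool:
    """Return True when the measured liquid rate deviates from the prediction
    of the previous iteration by more than `threshold`, expressed as a
    fraction of the predicted rate (an assumption made for this sketch)."""
    if predicted_rate <= 0:
        raise ValueError("predicted_rate must be positive")
    deviation = abs(actual_rate - predicted_rate) / predicted_rate
    return deviation > threshold
```

For instance, with a 5% threshold, a measured rate of 1100 STB/D against a prediction of 1000 STB/D would trigger fine-tuning, while a measured rate of 1020 STB/D would not. The check is symmetric, so production above or below the prediction is treated the same way, consistent with the "greater than or less than" wording above.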

Nomenclature and Units

D_i = inner diameter of tubing (in)
h = reservoir thickness (ft)
k = permeability (md)
ρ_o = oil density (°API)
p_inj = gas lift injection pressure (psi)
p_wh = wellhead pressure (psi)
q_g = gas production rate (Mscf/D)
q_g,inj = gas lift injection rate (MMscf/D)
q_l = liquid production rate (STB/D)
q_o = oil production rate (STB/D)
q_w = water production rate (STB/D)

ft = feet
in = inches
md = millidarcy
°API = degrees API gravity
psi = pounds per square inch
Mscf/D = thousand standard cubic feet per day
MMscf/D = million standard cubic feet per day
STB/D = stock tank barrels per day
SCF/STB = standard cubic foot per stock tank barrel

Embodiments presented herein include a data-driven workflow to address challenges in optimizing gas lift injection rates. The disclosed embodiments help to reduce well interventions, avoid placing wells in a shut-in state, accelerate the optimization process, reduce human bias, and increase hydrocarbon production. The workflow outlines the steps to build machine-learning models, using well surface parameters and subsurface data as inputs to predict the applicability of gas lift injection to the oil well, the optimal range of gas lift injection rate, and the optimal liquid production rate. The predictive models provide highly accurate recommendations for gas lift injection optimization. For the example embodiment discussed herein, the applicability of gas lift injection can be predicted with very high accuracy, achieving an F1 score of 0.97. For the example embodiment discussed herein, these models can predict the minimum gas lift injection rate with a mean absolute percentage error (MAPE) of 6%, the maximum gas lift injection rate with a MAPE of 4%, and the optimal liquid production rate with a MAPE of 2%. For the example embodiment discussed herein, a gradient boosting regression model was the best-performing model for estimating both the minimum and maximum gas lift injection rates, while a random forest regression model was the best-performing model for estimating the optimal liquid production rate. For the example embodiment discussed herein, six hours were spent generating and performing nodal analysis of the synthetic dataset, one hour was spent building and tuning the machine-learning model, and less than one second was spent to predict the results for hundreds of cases. This demonstrates the usefulness of the present technique in facilitating accurate and rapid production optimization. Compared with the traditional method, the proposed method is significantly faster. Additionally, it was observed that using a smaller set of samples (e.g., 200 instead of 1000 samples per dataset) can still yield suitable prediction performance. Present techniques enable the aforementioned gas lift injection predictions without a well intervention job, and thus represent a technological improvement in petroleum engineering, with less time consumption, less site disruption, less production disruption, and greater efficiency of such processes.
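For readers unfamiliar with the two evaluation metrics cited above, the following sketch shows how a MAPE and an F1 score are conventionally computed. These are the standard textbook definitions, not code from the disclosure, and the sample values are invented solely to exercise the functions.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.
    Raises ZeroDivisionError if any actual value is zero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def f1_score(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """F1 score for binary labels (1 = gas lift applicable, 0 = not applicable)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example values: two regression targets and four class labels.
example_mape = mape([100.0, 200.0], [110.0, 190.0])
example_f1 = f1_score([1, 1, 0, 0], [1, 0, 0, 1])
```

A MAPE of 6% therefore means that, averaged over the test samples, the predicted rate differs from the target rate by 6% of the target value, while an F1 score close to 1 indicates that the classifier makes few false "applicable" and few false "not applicable" calls.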

It should be understood that although the terms first, second, etc. may be used herein to describe various steps or calculations, these steps or calculations should not be limited by these terms. These terms are only used to distinguish one step or calculation from another. For example, a first calculation could be termed a second calculation, and, similarly, a second step could be termed a first step, without departing from the scope of this disclosure. It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may in fact be executed substantially concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.

With the above embodiments in mind, it should be understood that the embodiments might employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed are often referred to in terms, such as producing, identifying, determining, or comparing. Any of the operations described herein that form part of the embodiments are useful machine operations. The embodiments also relate to a device or an apparatus for performing these operations. The apparatus can be specially constructed for the required purpose, or the apparatus can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines can be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.

A module, an application, a layer, an agent, or other method-operable entity could be implemented as hardware, firmware, or a processor executing software, or combinations thereof. It should be appreciated that, where a software-based embodiment is disclosed herein, the software can be embodied in a physical machine such as a controller. For example, a controller could include a first module and a second module. A controller could be configured to perform various actions, e.g., of a method, an application, a layer, or an agent.

The embodiments can also be embodied as computer readable code on a tangible non-transitory computer readable medium. The computer readable medium is any data storage device that can store data, which can be thereafter read by a computer system. Examples of the computer readable medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices. The computer readable medium can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion. Embodiments described herein may be practiced with various computer system configurations including hand-held devices, tablets, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers and the like. The embodiments can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a wire-based or wireless network.

Although the method operations were described in a specific order, it should be understood that other operations may be performed in between described operations, described operations may be adjusted so that they occur at slightly different times or the described operations may be distributed in a system which allows the occurrence of the processing operations at various intervals associated with the processing.

In various embodiments, one or more portions of the methods and mechanisms described herein may form part of a cloud-computing environment. In such embodiments, resources may be provided over the Internet as services according to one or more various models. Such models may include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). In IaaS, computer infrastructure is delivered as a service. In such a case, the computing equipment is generally owned and operated by the service provider. In the PaaS model, software tools and underlying equipment used by developers to develop software solutions may be provided as a service and hosted by the service provider. SaaS typically includes a service provider licensing software as a service on demand. The service provider may host the software, or may deploy the software to a customer for a given period of time. Numerous combinations of the above models are possible and are contemplated.

Various units, circuits, or other components may be described or claimed as “configured to” perform a task or tasks. In such contexts, the phrase “configured to” is used to connote structure by indicating that the units/circuits/components include structure (e.g., circuitry) that performs the task or tasks during operation. As such, the unit/circuit/component can be said to be configured to perform the task even when the specified unit/circuit/component is not currently operational (e.g., is not on). The units/circuits/components used with the “configured to” language include hardware, for example, circuits, memory storing program instructions executable to implement the operation, etc. Additionally, “configured to” can include generic structure (e.g., generic circuitry) that is manipulated by software and/or firmware (e.g., an FPGA or a general-purpose processor executing software) to operate in a manner that is capable of performing the task(s) at issue. “Configured to” may also include adapting a manufacturing process (e.g., a semiconductor fabrication facility) to fabricate devices (e.g., integrated circuits) that are adapted to implement or perform one or more tasks.

The foregoing description, for the purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the embodiments and their practical applications, to thereby enable others skilled in the art to best utilize the embodiments and various modifications as may be suited to the particular use contemplated. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the embodiments are not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Other objects, features, and advantages of the disclosure will become apparent from the foregoing figures, detailed description, and examples. It should be understood, however, that the figures, detailed description, and examples, while indicating specific embodiments of the disclosure, are given by way of illustration only and are not meant to be limiting. Additionally, it is contemplated that changes and modifications within the spirit and scope of the disclosure will become apparent to those skilled in the art from the detailed description. In further embodiments, features from specific embodiments may be combined with features from other embodiments. For example, features from one embodiment may be combined with features from any of the other embodiments. In further embodiments, additional features may be added to the specific embodiments described herein.

Claims

What is claimed is:

1. A computing apparatus, comprising:

at least one memory;

at least one processor configured to execute instructions stored in the memory to perform actions comprising:

receiving, from one or more data sources, well data, reservoir data, and historical production data related to an oil well;

generating synthetic datasets from the well data, the reservoir data, and the historical production data, each synthetic dataset including a plurality of samples;

for each sample of the plurality of samples of each synthetic dataset, performing nodal analysis of the sample to generate a respective gas lift performance curve and to calculate corresponding target values, the corresponding target values including applicability of gas lift injection, a target minimum gas lift injection rate, a target maximum gas lift injection rate, and a target optimal liquid production rate for the sample;

generating engineered features from the well data, the reservoir data, and the historical production data;

selecting features from the well data, the reservoir data, the historical production data, and the engineered features;

constructing classification and regression models using the selected features as inputs to the models;

using a training portion of the plurality of samples of the synthetic datasets and the corresponding target values to train the classification and regression models to generate a trained classification model and a trained regression model, the trained classification model configured to predict applicability of gas lift injection to the oil well and the trained regression model configured to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.

2. The computing apparatus of claim 1, wherein the at least one processor is further configured to execute instructions stored in the memory to perform actions comprising:

using a testing portion of the plurality of samples of the synthetic datasets and the corresponding target values to evaluate the trained classification model and the trained regression model; and

selecting a best-performing trained classification model to predict applicability of gas lift injection to the oil well, and selecting best-performing trained regression models to predict optimal minimum gas lift injection rates, optimal maximum gas lift injection rates, and optimal liquid production rates for the oil well.

3. The computing apparatus of claim 2, wherein the at least one processor is further configured to execute the instructions stored in the memory to perform actions comprising:

receiving current production data from one or more components of the oil well;

providing portions of the well data, reservoir data, and the current production data as inputs to features of the selected trained classification model and the selected trained regression models;

in response to the inputs, receiving, from the selected trained classification model, a binary output indicating whether or not gas lift injection is applicable for use in the oil well; and

in response to the inputs, receiving, from the selected trained regression models, a predicted minimum gas lift injection rate and a predicted maximum gas lift injection rate for the oil well, as well as a predicted optimal liquid production rate for the oil well when the predicted minimum gas lift injection rate is applied.

4. The computing apparatus of claim 3, wherein the at least one processor is further configured to execute the instructions stored in the memory to perform actions comprising:

providing control signals to a gas lift compressor and/or a gas lift control valve of the oil well to provide gas lift injection at the predicted minimum gas lift injection rate.

5. The computing apparatus of claim 2, wherein constructing the classification and the regression models comprises:

defining a logistic regression model as the classification model; and

defining a gradient boosting regression model and a random forest regression model as the regression models.

6. The computing apparatus of claim 1, wherein each synthetic dataset includes at least 200 samples.

7. The computing apparatus of claim 1, wherein:

the well data is selected from the group consisting of well inclination, the inner diameter of the tubing or tubing identification, the inner diameter of the wellbore casing or casing identification, the depths at which the gas lift valves are installed (GLV depths), and the pressure decrease across the gas lift valves (dp Loss Across Valve);

the reservoir data is selected from the group consisting of reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density; and

the historical production data is selected from the group consisting of wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and gas-oil ratio (GOR).

8. The computing apparatus of claim 1, wherein the at least one processor is further configured to execute the instructions stored in the memory to perform actions comprising:

determining, based on the nodal analysis, a sensitivity of the liquid production rate to changes in the gas lift injection rate for each sample; and

assigning a sensitivity value to each sample based on the determined sensitivity;

wherein the respective gas lift performance curve is generated by plotting the sensitivity of the liquid production to changes in the gas lift injection rate.

9. A method for enhancing oil production in an oil well, the method comprising:

providing a controller in signal communication with a gas lift compressor and/or a gas lift control valve, the controller comprising at least one memory and at least one processor configured to execute instructions stored in the memory, the at least one memory storing a trained optimal gas lift injection range model, the trained optimal gas lift injection range model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising a target minimum gas lift injection rate and a target maximum gas lift injection rate;

determining, using the trained optimal gas lift injection range model, a predicted optimal range of gas lift injection values for the oil well, the predicted optimal range extending from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate; and

providing control signals to the gas lift compressor and/or the gas lift control valve so as to provide gas lift at the predicted minimum gas lift injection rate.

10. The method of claim 9, wherein the at least one memory further stores a trained optimal liquid production model, the trained optimal liquid production model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising a target optimal liquid production rate; the method further comprising:

determining, using the trained optimal liquid production model, a predicted optimal liquid production rate when gas lift is provided to the oil well at the predicted minimum gas lift injection rate.

11. The method of claim 10, wherein the at least one memory further stores a trained gas lift injection applicability classification model, the trained gas lift injection applicability classification model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising applicability of gas lift injection; the method further comprising:

determining, using the trained gas lift injection applicability classification model, whether gas lift injection is applicable for use in the oil well.

12. The method of claim 11, further comprising:

receiving, at the controller, well data, reservoir data, and current production data relating to the oil well; and

fine-tuning, at the controller, the trained optimal gas lift injection range model, the trained optimal liquid production model, and the trained gas lift injection applicability classification model based on an actual production rate of the well and the predicted optimal production rate as a result of providing gas lift at the predicted minimum gas lift injection rate.

13. The method of claim 12, further comprising:

providing at least a portion of the well data, the reservoir data, and the current production data as inputs to features of the trained gas lift injection applicability classification model; and

determining, using the trained gas lift injection applicability classification model, a binary output indicating whether or not gas lift injection continues to be applicable for use in the oil well.

14. The method of claim 9, wherein:

the reservoir data is selected from the group consisting of reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density; and

the historical production data is selected from the group consisting of wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and gas-oil ratio (GOR).

15. An oil production system comprising:

a tubing disposed in a wellbore and fluidly coupling an interior of the wellbore to a wellhead;

a gas lift compressor operable to receive and compress a gas from a gas source to produce a compressed gas;

a gas lift control valve operable to receive the compressed gas produced by the gas lift compressor and direct the compressed gas into the wellbore on the outside of the tubing at a gas lift injection rate; and

a controller in signal communication with the gas lift compressor and/or the gas lift control valve and communicatively connected to one or more data sources, the controller configured to receive well data, reservoir data, and historical production data from the one or more data sources, the controller comprising at least one memory and at least one processor configured to execute instructions stored in the memory, the at least one memory storing a trained optimal gas lift injection range model, the trained optimal gas lift injection range model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising a target minimum gas lift injection rate and a target maximum gas lift injection rate;

the controller further configured to determine, using the trained optimal gas lift injection range model, a predicted optimal range of gas lift injection values for the oil well, the predicted optimal range extending from a predicted minimum gas lift injection rate to a predicted maximum gas lift injection rate, and provide control signals to the gas lift compressor and/or the gas lift control valve to automatically modify the gas lift injection rate to be within the predicted optimal range of gas lift injection rate so as to facilitate an optimal liquid production rate of the produced fluids.

16. The system of claim 15, wherein the one or more data sources is selected from the group consisting of the gas lift compressor and/or the gas lift control valve, a well and reservoir data source, a fluid analyzer, one or more pressure sensors, one or more temperature sensors, and one or more flow rate sensors.

17. The system of claim 16, wherein:

the reservoir data is selected from the group consisting of reservoir depth, reservoir pressure, reservoir thickness, permeability, skin, and oil density; and

the historical production data is selected from the group consisting of wellhead pressure, gas lift injection pressure, gas lift injection rate, water cut, and gas-oil ratio (GOR).

18. The system of claim 17, wherein the at least one memory further stores a trained optimal liquid production model, the trained optimal liquid production model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising a target optimal liquid production rate;

wherein the controller is further configured to determine, using the trained optimal liquid production model, a predicted optimal liquid production rate when gas lift is provided to the oil well at the predicted minimum gas lift injection rate.

19. The system of claim 18, wherein the at least one memory further stores a trained gas lift injection applicability classification model, the trained gas lift injection applicability classification model constructed using selected features as model inputs and trained using a training portion of a plurality of samples of synthetic data sets and corresponding target values, the selected features comprising raw features and engineered features determined based on well data, reservoir data, and historical production data related to the oil well, the synthetic data sets generated from the well data, reservoir data, and historical production data, the corresponding target values calculated based on nodal analysis of each sample of the plurality of samples of each synthetic dataset, the corresponding target values comprising applicability of gas lift injection;

wherein the controller is further configured to determine, using the trained gas lift injection applicability classification model, whether gas lift injection is applicable for use in the oil well.

20. The system of claim 19, wherein the controller is further configured to:

receive current production data from the one or more data sources;

provide at least a portion of the well data, the reservoir data, and the current production data as inputs to features of the trained gas lift injection applicability classification model; and

determine, using the trained gas lift injection applicability classification model, a binary output indicating whether or not gas lift injection continues to be applicable for use in the oil well.
