US20250382872A1
2025-12-18
18/743,148
2024-06-14
Smart Summary: Raw flow rate data is collected from sensors on a pipeline in a well, along with additional information from other sensors. This data is then cleaned up and organized into a more usable format. Two different models are used to refine the data further, creating two separate sets of cleansed flow rate information. A third model combines all the cleaned data to produce high-quality flow rate information. Finally, this reliable flow rate data is sent to the well's control system for monitoring and management.
A method includes obtaining raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well and obtaining auxiliary data from a second set of sensors disposed on the pipeline. The method further includes preprocessing the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset (“preprocessed dataset”), and determining, with a first model processing the preprocessed dataset, a first cleansed flow rate dataset. The method further includes determining, with a second model processing the preprocessed dataset and the auxiliary data, a second cleansed flow rate dataset. The method further includes determining high quality flow rate data with a third model processing the preprocessed dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset. The method further includes transmitting the high quality flow rate data to a control system of the well.
E21B47/10 » CPC main
Survey of boreholes or wells Locating fluid leaks, intrusions or movements
E21B2200/20 » CPC further
Special features related to earth drilling for obtaining oil, gas or water Computer models or simulations, e.g. for reservoirs under production, drill bits
E21B2200/22 » CPC further
Special features related to earth drilling for obtaining oil, gas or water Fuzzy logic, artificial intelligence, neural networks or the like
As oil, gas, and water are produced from a well, they typically flow as a non-homogeneous mixture of phases through a pipeline from the wellhead to a separator. A multiphase flow meter (MPFM) is a device that may be installed on a pipeline to measure the rate at which each phase (oil, gas, water) is flowing. Multiphase flow rate measurements are essential for reservoir monitoring and play a significant role in production optimization from oil and gas fields, especially in an offshore environment.
Virtual Flow Metering (VFM) techniques build models from the time series of sensor data from sensors on the pipeline to infer the flow rates. VFM models may be used as a standalone solution of multiphase flow rate monitoring, or in a combination with a multiphase flow meter (MPFM) as a back-up system such that it can use the information from a MPFM to further improve the flowrate estimates. However, the performance of VFM models heavily relies on the quality of the ground truth labels, i.e., the target measurements, such as oil flow rate, used during training.
Ideally, the model-building phase of a VFM uses a clean dataset with minimum error and no missing data points. However, this is only possible using synthetic datasets, and almost impossible when using real-world datasets from the sensors of the pipeline, due to the imperfection of the sensors and data conditioning. For example, a reference physical flow meter in the pipeline may sometimes malfunction and produce null readings or extreme readings, or even be offline and produce no measurements. In addition, the database for storing the reference physical flow meter readings may miss some readings or record wrong values. In fact, the data preparation and cleansing take significant effort from both domain experts and data scientists to acquire a good dataset for use in VFM model building and training.
Therefore, it is desirable to establish an efficient and robust method and system that can analyze, quality control and recover high quality flow rate data from the noisy meter readings in the raw database for use in VFM model building and training.
This summary is provided to introduce a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in limiting the scope of the claimed subject matter.
In one aspect, embodiments disclosed herein relate to a method. The method includes obtaining raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well and obtaining auxiliary data from a second set of sensors disposed on the pipeline. The method further includes preprocessing the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset, and determining, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset. The method further includes determining, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset, and determining, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data. The method further includes transmitting the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.
In one aspect, embodiments disclosed herein relate to a system. The system includes a first set of sensors disposed on a pipeline of a well, a second set of sensors disposed on the pipeline, a set of models, comprising a first model, a second model and a third model; and a computer. The computer is configured to obtain raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well and obtain auxiliary data from a second set of sensors disposed on the pipeline. The computer is further configured to preprocess the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset, and determine, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset. The computer is further configured to determine, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset, and determine, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data. The computer is further configured to transmit the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.
In one aspect, embodiments herein relate to a non-transitory machine-readable medium comprising a plurality of machine-readable instructions executed by one or more processors. The plurality of machine-readable instructions cause the one or more processors to perform a method. The method includes obtaining raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well and obtaining auxiliary data from a second set of sensors disposed on the pipeline. The method further includes preprocessing the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset, and determining, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset. The method further includes determining, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset, and determining, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data. The method further includes transmitting the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.
Other aspects and advantages of the claimed subject matter will be apparent from the following description and the appended claims.
Specific embodiments of the disclosed technology will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
FIG. 1 depicts a pipeline, in accordance with one or more embodiments.
FIG. 2 depicts a multiphase fluid flowing in a pipe, in accordance with one or more embodiments.
FIG. 3 depicts a multiphase flow meter, in accordance with one or more embodiments.
FIG. 4 depicts a system, in accordance with one or more embodiments.
FIG. 5 depicts a neural network, in accordance with one or more embodiments.
FIG. 6 depicts a flowchart, in accordance with one or more embodiments.
FIG. 7 illustrates an example of inferred flow rates in accordance with one or more embodiments.
FIG. 8 depicts a flowchart, in accordance with one or more embodiments.
FIG. 9 depicts a computing system, in accordance with one or more embodiments.
In the following detailed description of embodiments of the disclosure, numerous specific details are set forth in order to provide a more thorough understanding of the disclosure. However, it will be apparent to one of ordinary skill in the art that the disclosure may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
Throughout the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as using the terms “before,” “after,” “single,” and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.
It is to be understood that the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to an “earth property” can include reference to one or more of such earth properties.
Terms such as “approximately,” “substantially,” etc., mean that the recited characteristic, parameter, or value need not be achieved exactly, but that deviations or variations, including for example, tolerances, measurement error, measurement accuracy limitations and other factors known to those of skill in the art, may occur in amounts that do not preclude the effect the characteristic was intended to provide.
It is to be understood that one or more of the steps shown in the flowchart may be omitted, repeated, and/or performed in a different order than the order shown. Accordingly, the scope disclosed herein should not be considered limited to the specific arrangement of steps shown in the flowchart.
Although multiple dependent claims are not introduced, it would be apparent to one of ordinary skill that the subject matter of the dependent claims of one or more embodiments may be combined with other dependent claims.
In the following description of FIGS. 1-9, any component described with regard to a figure, in various embodiments disclosed herein, may be equivalent to one or more like-named components described with regard to any other figure. For brevity, descriptions of these components will not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments disclosed herein, any description of the components of a figure is to be interpreted as an optional embodiment which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.
Virtual flow metering (VFM) models are an attractive way of monitoring multiphase flow, e.g., the flow of oil, gas and water phases in a pipeline, as they can leverage the sensor measurements from sensors already present on the pipeline without requiring additional parts or maintenance. VFM models can be used as a stand-alone solution, or in a combination with a multiphase flow meter (MPFM) as a back-up system such that it can use the information from the MPFM to further improve the flowrate estimates. In general, embodiments disclosed herein relate to methods and systems to provide high quality data for training Virtual Flow Metering (VFM) models, and for determining flowrates using the trained VFM model.
In accordance with one or more embodiments, FIG. 1 depicts a simplified portion of a pipeline (100) of a multilateral well in an oil and gas field. Herein, an oil and gas field is broadly defined to consist of wells which produce at least some oil and/or gas. Hydrocarbon wells typically produce oil, gas, and water in combination. The relative amounts of oil, gas, and water may differ between wells and vary over any one well's lifetime.
For clarity, the pipeline (100) is divided into three sections; namely, a subsurface (102) section, a tree (104) section, and a flowline (106) section. It is emphasized that pipelines (100) and other components of wells and, more generally, oil and gas fields may be configured in a variety of ways. As such, one with ordinary skill in the art will appreciate that the simplified view of FIG. 1 does not impose a limitation on the scope of the present disclosure. As part of the subsurface (102) section, FIG. 1 shows an inflow control valve (ICV) (101). An ICV (101) is an active component usually installed during well completion. The ICV (101) may partially or completely choke flow into a well. Generally, multiple ICVs (101) are installed along the reservoir section of a wellbore. Each ICV (101) is separated from the next by a packer. Each ICV (101) can be adjusted and controlled to alter flow within the well and, as the reservoir depletes, prevent unwanted fluids from entering the wellbore.
The subsurface (102) section of the pipeline (100) has a subsurface safety valve (SSSV) (103). The SSSV (103) is designed to close and completely stop flow in the event of an emergency. Generally, an SSSV (103) is designed to close on failure. That is, the SSSV (103) requires a signal to stay open and loss of the signal results in the closing of the valve. Also shown as part of the subsurface (102) section is a permanent downhole monitoring system (PDHMS) (124). The PDHMS (124) consists of a plurality of sensors, gauges, and controllers to monitor subsurface flowing and shut-in pressures and temperatures. As such, a PDHMS (124) may indicate, in real-time, the state or operating condition of subsurface equipment and the fluid flow.
Turning to the tree (104) section, FIG. 1 shows a master valve (MV) (105), a surface safety valve (SSV) (107), and a wing valve (WV) (109). The MV (105) controls all flow from the wellbore. Because the MV (105) is safety-critical, two master valves (MVs) (second not shown) are usually used, with one acting as a backup. Like the SSSV (103), the SSV (107) is a valve installed on the upper portions of the wellbore to provide emergency closure and stoppage of flow. Again, SSVs (107) are designed to close on failure. One or more WVs (109) may be located on the side of the tree (104) section, or on temporary surface flow equipment (not shown). WVs (109) may be used to control and isolate production fluids and/or be used for treatment or well-control purposes.
Also shown in FIG. 1 are a control valve (CV) (111) and a pressure gauge (PG) (113). The CV (111) is a valve that controls a process variable, such as pressure, flow, or temperature, by modulating its opening. The PG (113) monitors the fluid pressure at the tree (104) section.
Turning to the flowline (106) section, the flowline (106) transports (108) the fluid from the well to a storage or processing facility (not shown). A choke valve (119) is disposed along the flowline (106). The choke valve (119) is used to control flow rate and reduce pressure for processing the extracted fluid at a downstream processing facility. In particular, effective use of the choke valve (119) prevents damage to downstream equipment and promotes longer periods of production without shut-down or interruptions. The choke valve (119) is bordered by an upstream pressure transducer (115) and a downstream pressure transducer (117) which monitor the pressure of the fluid entering and exiting the choke valve (119), respectively. The flowline (106) shown in FIG. 1 has a block and bleed valve system (121) which acts to isolate or block the flow of fluid such that it does not reach other downstream components. The flowline (106) may also be outfitted with one or more temperature sensors (123).
The various valves, pressure gauges and transducers, and sensors depicted in FIG. 1 may be considered field devices of an oil and gas field. As shown, these field devices may be disposed both above and below the surface of the Earth. These field devices are used to monitor and control components and sub-processes of an oil and gas field. It is emphasized that the oil and gas field devices depicted in FIG. 1 are non-exhaustive. Additional devices, such as electrical submersible pumps (ESPs) (not shown) may be present in an oil and gas field with their associated sensing and control capabilities. For example, an ESP may monitor the temperature and pressure of a fluid local to the ESP and may be controlled through adjustments to ESP speed or frequency.
The field devices may be distributed, local to the sub-processes and associated components, global, connected, etc. The field devices may be of various control types, such as a programmable logic controller (PLC) or a remote terminal unit (RTU). For example, a programmable logic controller (PLC) may control valve states, pipe pressures, warning alarms, and/or pressure releases throughout the oil and gas field. In particular, a programmable logic controller (PLC) may be a ruggedized computer system with functionality to withstand vibrations, extreme temperatures, wet conditions, and/or dusty conditions, for example, around a pipeline (100). With respect to an RTU, an RTU may include hardware and/or software, such as a microprocessor, that connects sensors and/or actuators using network connections to perform various processes in the automation system. As such, a distributed control system may include various autonomous controllers (such as remote terminal units) positioned at different locations throughout the oil and gas field to manage operations and monitor sub-processes. Likewise, a distributed control system may include no single centralized computer for managing control loops and other operations.
In accordance with one or more embodiments, FIG. 1 depicts the virtual flowmeter (VFM) model (125). The VFM model includes functionality for field device monitoring and data collection, and to estimate multiphase flow rates from field device data (i.e., a VFM estimate), so as to continuously provide accurate multiphase flow rate measurements regardless of the state of any MPFM. To emphasize that the VFM model (125) can monitor the various field devices of a well and/or an oil and gas field, dashed lines connecting various field devices to the VFM model (125) are shown in FIG. 1.
FIG. 2 depicts a simplified view of a cross-section of a flowline (106) carrying a multiphase fluid. As seen, the multiphase fluid may have multiple constituents such as gas (202), water (204), and oil (206). The various constituents of the multiphase fluid may be distributed within the flowline (106) in a myriad of ways. As a non-limiting example, gas (202) may be enclosed by liquids (water or oil) forming bubbles (210). Or, in contrast, liquid droplets, such as oil droplets (216) and water droplets (212), may be dispersed in the gas (202) to form a mist. In general, the state of the multiphase fluid may be described using broad classifications. That is, the multiphase fluid may be categorized as “bubbly,” “annular,” “churn,” “mist,” “stratified,” or other designations (flow classes) based on the distribution of the constituents and their relative quantities. The state of the multiphase fluid may be transient such that any assignment of flow class may change with time.
Oil and gas field devices, like those shown in FIG. 1 (and others not shown), monitor and govern the behavior of the components and sub-processes of the well and/or the oil and gas field. Therefore, the productivity of the well and/or the oil and gas field is directly affected, and may be altered, by at least some of the field devices. Generally, complex interactions between oil and gas field components and sub-processes exist such that configuring field devices for optimal production is a difficult and laborious task. Further, the state and behavior of oil and gas fields are transient over the lifetime of the constituent wells, requiring continual changes to the field devices to enhance production.
To inform and optimize the settings of the field devices of a pipeline (100) to maximize hydrocarbon production, it is beneficial, if not critical, to determine the instantaneous state of the multiphase flow. To this end, the pipeline (100) depicted in FIG. 1 is outfitted with a multiphase flow meter (MPFM) (127). A MPFM (127) is a device installed on the flowline (106) to measure the rate at which each phase—oil, gas, water—is flowing. That is, the MPFM (127) may detect the instantaneous amount of gas, oil, and water flowing in the pipeline (100). As such, the MPFM (127) indicates additional quantities such as percent water cut (% WC) and the gas-to-oil ratio (GOR).
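As a hedged illustration of the derived quantities mentioned above, the following sketch computes percent water cut and gas-to-oil ratio from per-phase volumetric rates. The function names and the choice of units are assumptions made for illustration and do not appear in the disclosure; water cut is taken here as the water rate divided by the total liquid rate, which is the conventional definition.

```python
def water_cut_percent(q_oil: float, q_water: float) -> float:
    """Percent water cut: water rate as a share of the total liquid rate.

    Both rates must be in the same volumetric units (assumed, e.g. bbl/day).
    """
    q_liquid = q_oil + q_water
    if q_liquid <= 0.0:
        raise ValueError("total liquid rate must be positive")
    return 100.0 * q_water / q_liquid


def gas_oil_ratio(q_gas: float, q_oil: float) -> float:
    """Gas-to-oil ratio: gas rate divided by oil rate.

    The result carries the units of q_gas over q_oil (assumed, e.g. scf/bbl).
    """
    if q_oil <= 0.0:
        raise ValueError("oil rate must be positive")
    return q_gas / q_oil
```

For example, 750 units of oil and 250 units of water give a water cut of 25%.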
In general, a MPFM (127) cannot directly measure the flow rate of the individual phases in a fluid. Rather, a MPFM (127) is a collection of sensors, transmitters, mechanical devices, flow conduits, and programmed relationships that are used to determine the individual phase flow rates. FIG. 3 depicts a MPFM (127) in accordance with one or more embodiments. A MPFM (127) may be disposed inline with a flowline (106), or proximate to a flowline (106), such that the MPFM (127) can receive the multiphase fluid (301) through a flow inlet (302) and return the fluid through a flow outlet (304). The MPFM (127) of FIG. 3 includes one or more pressure sensors (306) and temperature sensors (308) to measure the pressure (P) and temperature (T), respectively, at various locations within the MPFM (127). The pressure (P) and temperature (T) measurements are used, in part, to determine the thermophysical state and the thermophysical properties of the multiphase fluid (301). For example, the pressure (P) and temperature (T) values may be used with a functional or tabulated equation of state (EoS) to determine other properties of the multiphase fluid (301).
The MPFM (127) of FIG. 3 uses a gamma densitometer (309) to measure the bulk density of the multiphase fluid (301). The gamma densitometer (309) emits a beam of photons from a nuclear source (310). The emitted photons are attenuated by the multiphase fluid (301) and the amount of attenuation is determined using a nuclear detector (312) that measures the number of received photons. The amount of attenuation is greatly affected by the bulk density of the multiphase fluid (301). Further, because gas has a significantly lower density compared to water and oil, the gamma densitometer (309) can be used to accurately determine the liquid (water and oil) and gas fractions of the multiphase fluid (301).
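The relationship between photon attenuation and bulk density can be sketched with the Beer-Lambert law, I = I0 * exp(-mu * rho * d). The code below is a simplified illustration only: it assumes a single effective mass attenuation coefficient for the whole mixture and a linear mixing rule for density, neither of which is stated in the disclosure, and all function and parameter names are hypothetical.

```python
import math


def bulk_density(count_reference: float, count_measured: float,
                 mass_attenuation: float, path_length: float) -> float:
    """Invert the Beer-Lambert law I = I0 * exp(-mu * rho * d) for rho.

    count_reference  : photon count with no attenuating fluid (I0)
    count_measured   : photon count through the fluid (I)
    mass_attenuation : effective mass attenuation coefficient mu, m^2/kg (assumed)
    path_length      : beam path through the fluid d, m
    """
    return math.log(count_reference / count_measured) / (mass_attenuation * path_length)


def gas_volume_fraction(rho_mix: float, rho_liquid: float, rho_gas: float) -> float:
    """Gas fraction from rho_mix = a * rho_gas + (1 - a) * rho_liquid (assumed mixing rule)."""
    return (rho_liquid - rho_mix) / (rho_liquid - rho_gas)
```

Because the gas density is much lower than the liquid density, small changes in gas fraction produce a large change in the measured bulk density, which is why the measurement separates gas from liquid well.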
The MPFM (127) depicted in FIG. 3 further includes a Venturi section. The Venturi section is composed of a Venturi inlet (314), a nozzle (316), a throat section (318), and a diffuser (320). The constriction in flow that occurs in the Venturi section acts to increase the bulk flow velocity of the multiphase fluid (301) and is associated with a decrease in pressure (P). The MPFM (127) is configured with pressure sensors (306) to determine the difference in pressure (DP) (307) of the multiphase fluid (301) between the Venturi inlet (314) and the throat section (318). The Venturi section is used to measure mass flow rates.
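For orientation, the textbook single-phase, incompressible Venturi equation relates the differential pressure to a mass flow rate. The sketch below is that textbook relation, not the programmed relationship of any particular MPFM; multiphase corrections are omitted, and the function name, parameter names, and default discharge coefficient are assumptions.

```python
import math


def venturi_mass_flow(dp_pa: float, rho: float, d_inlet: float, d_throat: float,
                      discharge_coeff: float = 0.98) -> float:
    """Mass flow rate (kg/s) from the differential pressure across a Venturi.

    dp_pa           : inlet-to-throat pressure difference, Pa
    rho             : bulk fluid density, kg/m^3
    d_inlet         : inlet diameter, m
    d_throat        : throat diameter, m
    discharge_coeff : assumed illustrative value, normally set by calibration
    """
    beta = d_throat / d_inlet                      # diameter ratio
    area_throat = math.pi * d_throat ** 2 / 4.0    # throat cross-section, m^2
    return discharge_coeff * area_throat * math.sqrt(
        2.0 * rho * dp_pa / (1.0 - beta ** 4)
    )
```

The flow rate scales with the square root of the differential pressure, so quadrupling the pressure difference doubles the computed mass flow.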
The MPFM (127) of FIG. 3 further includes a blind tee or static mixer (322). The blind tee (322) serves to condition the flow of the multiphase fluid (301) such that it is homogeneous before entering the main body of the MPFM (127). The blind tee (322) is disposed immediately upstream of the Venturi section and creates a mixing effect that stabilizes intermittent flow regimes commonly encountered in multiphase fluids (301) associated with wells. The blind tee (322) also mitigates any flow interaction with field devices located upstream from the MPFM (127) such as the choke valve (119).
The MPFM (127) is outfitted with a flow computer (324). The flow computer (324) receives the readings from the sensors (e.g., temperature sensor (308), pressure sensor (306)) of the MPFM (127). That is, the flow computer (324) acts, in part, as a data acquisition unit. Upon collecting the sensor data, the flow computer (324) calculates the individual flow rates of the oil, gas, and water present in the multiphase fluid (301). The flow computer (324) can transmit the computed flow rates and acquired sensor data to an external system such as the VFM model (125). Generally, the flow computer (324) makes use of programmed relationships to determine the phase flow rates from the acquired sensor data. The programmed relationships may include or make use of analytical or tabulated equations of state (EoS) data, phenomenological models, physics-based relationships, bounded correlations, and governing equations (e.g., conservation of mass). In one or more embodiments, the flow computer (324) may be a computer system as depicted in FIG. 9, which will be described in greater detail later in the disclosure.
It is emphasized that the MPFM (127) depicted in FIG. 3 is provided only as an example. In practice, many different types of MPFMs (127) exist. MPFMs (127) may differ in the types of sensors used, the data they collect, and the programmed relationships used to convert the measured flow properties to phase flow rates. For example, in some instances, a MPFM (127) may further be outfitted with a capacitance sensor to determine the fraction of oil, water, and gas in the multiphase fluid (301). In other instances, a MPFM (127) may use any combination of an X-ray source and detector, electrodes, strain gauges, magnetic resonance, and optical sensors and computer vision-based algorithms. One with ordinary skill in the art will recognize that the above description of a MPFM (127), and of the components that may make up a MPFM (127), is non-exhaustive and should not be construed to impose a limitation on the instant disclosure.
In general, the VFM model (125) disclosed herein can work with any type of MPFM (127). In accordance with one or more embodiments, there may be a VFM model (125) for each MPFM (127). Again, for clarity, the VFM model (125) will be discussed herein in reference to a single MPFM (127). In general, a VFM model (125) operates by using one or more numerical models to estimate the individual phase flow rates of a multiphase fluid using readily available field device data. In accordance with one or more embodiments, the VFM model (125) receives VFM inputs (inputs from field devices) and processes the VFM inputs with a model to determine the individual phase flow rates. When the phase flow rates are estimated using the VFM model (125), they are often referred to as VFM model determined phase flow rates.
Likewise, to train the VFM model (125), modelling data is required which consists of pairs of VFM inputs (inputs from field devices) and the associated phase flow rates as the targets. In accordance with one or more embodiments, the associated phase flow rates are high quality flow rate data obtained by processing the raw multiphase flow rate data obtained from a set of sensors, such as the MPFM, as discussed in detail below. In accordance with one or more embodiments, the VFM inputs are auxiliary data from a set of field devices, where the auxiliary data are data that are not flow rate, and include, but are not limited to: wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, ESP frequency, and ESP motor current. The VFM inputs are collected using field devices appropriately disposed on the pipeline (100).
In other words, in general, the modelling data consists of the expected input and desired output for the machine-learned model. In accordance with one or more embodiments, the modelling data is acquired from one or more existing pipelines or from previously collected historical pipeline data.
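A minimal sketch of assembling such input/target pairs is shown below, assuming that the auxiliary readings and the flow rate targets are both keyed by timestamp. The data layout, the function name, and the feature names are hypothetical and chosen only for illustration.

```python
def build_training_pairs(aux_by_time, target_by_time, feature_names):
    """Pair VFM inputs with flow rate targets recorded at the same timestamp.

    aux_by_time    : {timestamp: {feature_name: value or None}}
    target_by_time : {timestamp: flow rate}
    feature_names  : ordered list of features used as model inputs
    Timestamps missing from either source, or with a missing feature, are skipped.
    """
    pairs = []
    for t in sorted(aux_by_time.keys() & target_by_time.keys()):
        row = aux_by_time[t]
        if any(row.get(name) is None for name in feature_names):
            continue  # incomplete input vector
        pairs.append(([row[name] for name in feature_names], target_by_time[t]))
    return pairs
```

Each returned pair holds an ordered input vector and the flow rate it should map to, which is the shape most supervised learning libraries expect.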
FIG. 4 illustrates a system (400) for providing data suitable for training a VFM model (125), according to one or more embodiments. It is noted that the elements shown in FIG. 4 are abstractions and that, in practice, an element may not be unique or independent from other elements of the system. Further, the functionality of one or more elements may be shared between any number of elements. The system (400) of FIG. 4 acquires raw multiphase flow rate data (402) and produces high quality flow rate data (404), through the use of a first model (406), a second model (408), and a third model (410). As will be further described below, the first model (406), the second model (408) and the third model (410) are not VFM models. The purpose of the first model (406), the second model (408) and the third model (410) is to produce high quality flow rate data (404) from raw multiphase flow rate data (402), so that the high quality flow rate data (404) can be used by VFM model building (424) to build a VFM model (125).
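The data flow among the three models, as set out in the summary above, can be sketched as follows. Only the wiring reflects the disclosure; the three stand-in models (a two-point smoother, a rescaled auxiliary signal, and a pointwise median) are placeholders invented for this illustration and are not the models the disclosure describes.

```python
from statistics import median
from typing import Callable

Series = list[float]


def run_cleansing_pipeline(
    preprocessed: Series,
    auxiliary: Series,
    first_model: Callable[[Series], Series],
    second_model: Callable[[Series, Series], Series],
    third_model: Callable[[Series, Series, Series], Series],
) -> Series:
    """Wire the three models together and return the high quality flow rate data."""
    first_cleansed = first_model(preprocessed)               # flow rate data only
    second_cleansed = second_model(preprocessed, auxiliary)  # flow rate + auxiliary data
    return third_model(preprocessed, first_cleansed, second_cleansed)


# Hypothetical stand-ins, used only to make the sketch runnable.
def smooth(series: Series) -> Series:
    return [(series[max(i - 1, 0)] + series[i]) / 2 for i in range(len(series))]


def rescale_auxiliary(series: Series, aux: Series) -> Series:
    scale = sum(series) / sum(aux)
    return [scale * a for a in aux]


def pointwise_median(a: Series, b: Series, c: Series) -> Series:
    return [median(values) for values in zip(a, b, c)]
```

Passing the models in as callables keeps the wiring independent of the model types, which matches the disclosure's statement that different model types may be used.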
Raw multiphase flow rate data (402) are acquired, for example for a first time period T1, where T1 is a historical time period. FIG. 4 illustrates that the raw multiphase flow rate data (402) is acquired from a database (412). According to one or more embodiments, the raw multiphase flow rate data (402) in database (412) has been acquired from a MPFM (127) installed on a pipeline (100). Alternatively, according to other embodiments, the raw multiphase flow rate data (402) may be acquired directly from a flowmeter, such as the MPFM (127), without being stored in the database (412).
According to one or more embodiments, the raw multiphase flow rate data (402) is pre-processed by at least one pre-processing algorithm (414a, 414b, 414c) to perform a myriad of pre-processing steps. According to one or more embodiments, more than one pre-processing algorithm (414a, 414b, 414c) may be combined together to form a single pre-processing algorithm. According to one or more embodiments, the pre-processing algorithms (414a, 414b, 414c) may be used to preform basic data cleansing or imputation to make the raw multiphase flow rate data (402) suitable for use with the first model (406), the second model (408) and/or the third model (410). These pre-processing steps may include, but are not limited to, outlier detection and replacement of outliers, imputation, and normalization. One with ordinary skill in the art will recognize that many pre-processing (or processing) steps exist for dealing with a raw multiphase flow rate data (402). As such, one with ordinary skill in the art will appreciate that not all pre-processing (or processing) steps can be enumerated herein and that zero or more pre-processing (or processing) steps may be applied with the methods disclosed herein without imposing a limitation on the instant disclosure.
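The three named steps (outlier replacement, imputation, and normalization) could look like the sketch below. The specific techniques are assumptions chosen for brevity: a median-absolute-deviation outlier test, forward-fill imputation, and z-score normalization. The disclosure does not prescribe any of them, and the function names and threshold are hypothetical.

```python
import statistics


def replace_outliers(values, threshold=3.5):
    """Replace readings whose modified z-score exceeds the threshold with None."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    mad = statistics.median([abs(v - med) for v in present])
    if mad == 0:
        return list(values)  # no spread to judge outliers against
    return [
        None if v is not None and 0.6745 * abs(v - med) / mad > threshold else v
        for v in values
    ]


def impute(values):
    """Fill gaps with the last valid reading; leading gaps take the first valid one."""
    first = next((v for v in values if v is not None), None)
    if first is None:
        raise ValueError("no valid readings to impute from")
    filled, last = [], first
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def zscore(values):
    """Normalize to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def preprocess(raw):
    """Chain the three steps: outliers -> imputation -> normalization."""
    return zscore(impute(replace_outliers(raw)))
```

In this sketch, missing readings are represented as `None`, and a flagged outlier is treated as a missing reading so that a single imputation step handles both cases.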
According to one or more embodiments, the pre-processing (or processing) steps are optional depending on a type of model used for at least one of the models (406, 408, 410). Some examples of when a pre-processing algorithm (414a, 414b, 414c) may be included in the system (400) are given here, and are not intended as imposing a limitation on the instant disclosure. Gaussian autoregressive (AR) models usually assume that the errors of the data are normally distributed. Hence, the raw multiphase flow rate data (402) may be normalized prior to being provided as an input to an AR model. As another example, Decline Curve Analysis (DCA) models expect the input data to follow certain patterns, such as exponential decline. Therefore, if a DCA model is being used for one of the models (406, 408, 410), the raw multiphase flow rate data (402) may be pre-processed to meet this requirement. Alternatively, some other modeling techniques (e.g., moving average smoothing) do not have such requirements, and therefore the pre-processing steps are optional.
According to one or more embodiments, the pre-processing algorithms (414a, 414b, 414c) may be used to process the raw multiphase flow rate data (402) using both domain knowledge from subject matter experts and model knowledge from data scientists. The pre-processing algorithms (414a, 414b, 414c) may further include exploratory data analysis (EDA) and data quality and control by a combined effort from subject matter experts and data scientists.
According to one or more embodiments, the pre-processing algorithms (414a, 414b, 414c) output a preprocessed multiphase flow rate dataset (402a) for use by the plurality of models (406, 408, 410).
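The pre-processing steps named above (imputation, outlier detection and replacement, and normalization) can be illustrated with a minimal Python sketch for a single-phase flow rate series. The function names, the forward-fill imputation, the median/MAD outlier rule, and the threshold of 3.5 are illustrative assumptions and are not taken from the disclosure.

```python
import math
import statistics

def impute_missing(series):
    """Forward-fill missing entries (None or NaN); leading gaps take the first valid value."""
    valid = [x for x in series if x is not None and not math.isnan(x)]
    if not valid:
        raise ValueError("series contains no valid values")
    filled, last = [], valid[0]
    for x in series:
        if x is not None and not math.isnan(x):
            last = x
        filled.append(last)
    return filled

def replace_outliers(series, threshold=3.5):
    """Replace points far from the median (measured in MAD units) by the median."""
    med = statistics.median(series)
    mad = statistics.median(abs(x - med) for x in series)
    if mad == 0:
        return list(series)  # no spread, so nothing can be flagged
    return [med if abs(x - med) / mad > threshold else x for x in series]

def normalize(series):
    """Scale to zero mean and unit variance; return the parameters for later reuse."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series) or 1.0
    return [(x - mean) / std for x in series], mean, std

# Hypothetical raw readings with one gap and one spike.
raw = [100.0, 101.0, None, 99.0, 500.0, 100.0, 102.0]
imputed = impute_missing(raw)
cleaned = replace_outliers(imputed)
scaled, mean, std = normalize(cleaned)
```

In this sketch the gap is filled with the preceding reading and the spike is replaced by the series median before scaling; other orderings or rules may be equally suitable.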
The preprocessed multiphase flow rate dataset (402a) is provided as input to a first model (406). The first model (406), and the data used to train it, will be described in greater detail later in the instant disclosure. However, for now, it is stated that the first model (406) is configured to receive the preprocessed multiphase flow rate dataset (402a) and, upon processing, output a first cleansed flow rate dataset (416).
According to one or more embodiments, the first model (406) is a classical model, where a classical model is defined herein as a model that uses past flow rates to predict future flow rates. The first model (406) is described further below.
The preprocessed multiphase flow rate dataset (402a) is also provided as input to a second model (408). The second model (408), and the data used to train it, will be described in greater detail later in the instant disclosure. However, for now, it is stated that the second model (408) is configured to receive the preprocessed multiphase flow rate dataset (402a) and auxiliary data (422) from field devices (418) for the same time period T1 and, upon processing, output a second cleansed flow rate dataset (420). According to one or more embodiments, the field devices (418) are disposed on the pipeline (100), as discussed previously with respect to FIG. 1. According to one or more embodiments, the field devices (418) are any sensors that do not directly return a flow rate. As detailed previously, the field devices (418) can include temperature and pressure sensors (on the flowline or downhole), ESP frequency, or even sensor data (e.g., temperature and pressure) originating from the MPFM, if the MPFM returns those values in addition to the flow rate measurements.
According to one or more embodiments, the second model (408) is a virtual sensing model, as described further below.
The preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416) and the second cleansed flow rate dataset (420) are provided as inputs to a third model (410). The third model (410), and the data used to train it, will be described in greater detail later in the instant disclosure. However, for now, it is stated that the third model (410) is configured to receive the preprocessed multiphase flow rate dataset (402a), and upon processing, infer high quality flow rate data (404) from the preprocessed multiphase flow rate dataset (402a), where the third model (410) is guided by the first cleansed flow rate dataset (416) and the second cleansed flow rate dataset (420).
According to one or more embodiments, the third model (410) is a guided model, as described in detail below. According to one or more embodiments, the third model (410) performs an average or a weighted average of the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420) to determine the high quality flow rate data (404). The third model (410) is discussed further below.
According to one or more embodiments, the high quality flow rate data is transmitted to, at least, a control system of the well, wherein operation of the well is based on the inferred flow rate data. According to one or more embodiments, the control system may tune operation configurations, such as driving-gas injection or choke position. According to one or more embodiments, the high quality flow rates may be used for more accurate production planning.
According to one or more embodiments, the high quality flow rate data is transmitted to a logging while drilling system or a reservoir monitoring system.
According to one or more embodiments, the high quality flow rate data (404) may be provided to a VFM model building module (424), where it is used as training data along with the auxiliary data (422) inputs for the first period T1, so as to train a VFM model (125) to determine predicted flow rate data based on newly obtained auxiliary data, for example for a second time period T2 after the first time period T1.
The trained VFM model (125) differs from the first model (406), the second model (408) and the third model (410) in that it only receives the newly obtained auxiliary data as an input, and by processing the newly obtained auxiliary data, the VFM model (125) determines the predicted flow rate data. The first model (406), by contrast, receives the preprocessed multiphase flow rate dataset (402a) as an input, and generates the first cleansed flow rate dataset (416) as an output. The second model (408), by contrast, receives both the preprocessed multiphase flow rate dataset (402a) and auxiliary data (422) from field devices (418) as inputs, and generates the second cleansed flow rate dataset (420) as an output. The third model (410), by contrast, receives the first cleansed flow rate dataset (416), the second cleansed flow rate dataset (420) and the preprocessed multiphase flow rate dataset (402a) as inputs, and generates the high quality flow rate data (404) as an output. In other words, the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420) each represent a cleaned version of the original or raw multiphase flow rate data (e.g., from a MPFM) where each version of the cleaned data is determined using a different process and is based on, at least, the raw multiphase flow rate data. For example, in one or more embodiments, the preprocessed multiphase flow rate dataset (402a) is determined by applying an imputation method on the raw multiphase flow rate data to replace missing values or values determined to be outliers. The cleaned versions of the multiphase flow rate data, that is, the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420), are then jointly considered by the third model (410) to determine the high quality flow rate data (404).
Thus, in one or more embodiments, the preprocessing method, the first model (406), the second model (408), and the third model (410) are used on historical flow rate data and associated auxiliary data, in the above-described arrangement, to clean or generate high quality flow rate data (404) that can be used as a target to train a VFM model (125). The accuracy of a trained VFM model (125) depends on the quality of the training data. Thus, the use of high quality flow rate data (404) in the training process of the VFM model (125) helps the VFM model (125) accurately predict multiphase flow rates given newly obtained auxiliary data (i.e., non-historical data). In one or more embodiments, a preprocessing method, first model (406), second model (408), and third model (410) are applied to historical data including raw multiphase flow rate data and auxiliary data to clean the raw multiphase flow rate data (output as high quality flow rate data (404)). Further, in one or more embodiments, the high quality flow rate data (404) is used to train a VFM model (125) to predict multiphase flow rates given auxiliary data.
According to one or more embodiments, based on the predicted flow rate determined by the trained VFM model (125), it may be determined if there is a malfunction on the sensors used to provide raw multiphase flow rate data. For example, the predicted flow rate may be compared to a newly acquired raw multiphase flow rate obtained for the time period T2. Based on the comparison, it may be determined that the sensor that provided the raw multiphase flow rate (e.g. MPFM) has provided a faulty reading.
According to one or more embodiments, once it has been determined that there is a malfunction in the sensor that provided the newly acquired raw multiphase flow rate, the predicted flow rate data from the VFM model (125) may be used instead of the newly obtained raw multiphase flow rate data.
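The comparison and substitution described above can be sketched as follows. This is a minimal Python illustration only: the function name `select_flow_rate`, the relative-deviation test, and the 20% tolerance are hypothetical choices, since the disclosure does not specify how the comparison is made.

```python
def select_flow_rate(measured, predicted, rel_tolerance=0.2):
    """Return (value_to_use, is_faulty) for one timestep.

    A reading is flagged as faulty when it is missing or when it deviates
    from the VFM prediction by more than rel_tolerance (relative to the
    prediction). Flagged readings are replaced by the prediction.
    """
    if measured is None:
        return predicted, True
    scale = max(abs(predicted), 1e-9)  # guard against division by zero
    is_faulty = abs(measured - predicted) / scale > rel_tolerance
    return (predicted if is_faulty else measured), is_faulty

# Hypothetical readings for time period T2: the third MPFM value dropped to zero.
measured_t2 = [1000.0, 1010.0, 0.0, 995.0]
predicted_t2 = [1005.0, 1000.0, 990.0, 1000.0]
results = [select_flow_rate(m, p) for m, p in zip(measured_t2, predicted_t2)]
values = [v for v, _ in results]
flags = [f for _, f in results]
```

Here only the third reading is flagged, and the predicted value is used in its place.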
According to one or more embodiments, devices disposed on the pipeline for which the VFM model is used can then be adjusted based on the output of the VFM model (125) (predicted flow rate data) so as to optimize production of the well.
According to one or more embodiments, the first model (406), the second model (408), and the third model (410) are machine-learned models. Therefore, before providing details of these models (406, 408, 410), a cursory introduction to machine-learned models is provided here.
Machine learning (ML), broadly defined, is the extraction of patterns and insights from data. The phrases “artificial intelligence,” “machine learning,” “deep learning,” and “pattern recognition” are often conflated, interchanged, and used synonymously. This ambiguity arises because the field of “extracting patterns and insights from data” was developed simultaneously and disjointedly among a number of classical arts like mathematics, statistics, and computer science. For consistency, the term machine learning, or machine learned, will be adopted herein. However, one skilled in the art will recognize that the concepts and methods detailed hereafter are not limited by this choice of nomenclature.
Machine-learned model types may include, but are not limited to, neural networks, random forests, generalized linear models, and Bayesian regression. Machine-learned model types are usually associated with additional “hyperparameters” which further describe the model. For example, hyperparameters providing further detail about a neural network may include, but are not limited to, the number of layers in the neural network, choice of activation functions, inclusion of batch normalization layers, and regularization strength. Commonly, in the literature, the selection of hyperparameters surrounding a model is referred to as selecting the model “architecture”. Consequently, in many circumstances, a machine-learned model may be specified by indicating its type and associated hyperparameters.
In some embodiments, the ML model may be a neural network (NN). Thus, a cursory introduction to an NN is provided herein. However, note that many variations of an NN exist. Therefore, one of ordinary skill in the art will recognize that any variation of an NN (or any other ML model) may be employed without departing from the scope of this disclosure. Further, it is emphasized that the following discussion of an NN is a basic summary and should not be considered limiting.
A diagram of a neural network is shown in FIG. 5. At a high level, a neural network (500) may be graphically depicted as being composed of nodes (502), where here any circle represents a node, and edges (504), shown here as directed lines. The nodes (502) may be grouped to form layers (505). FIG. 5 displays four layers (508, 510, 512, 514) of nodes (502) where the nodes (502) are grouped into columns, however, the grouping need not be as shown in FIG. 5. The edges (504) connect the nodes (502). Edges (504) may connect, or not connect, to any node(s) (502) regardless of which layer (505) the node(s) (502) is in. That is, the nodes (502) may be sparsely and residually connected. A neural network (500) will have at least two layers (505), where the first layer (508) is considered the “input layer” and the last layer (514) is the “output layer”. Any intermediate layer (510, 512) is usually described as a “hidden layer”. A neural network (500) may have zero or more hidden layers (510, 512) and a neural network (500) with at least one hidden layer (510, 512) may be described as a “deep” neural network or as a “deep learning method.” In general, a neural network (500) may have more than one node (502) in the output layer (514). In this case the neural network (500) may be referred to as a “multi-target” or “multi-output” network.
Nodes (502) and edges (504) carry additional associations. Namely, every edge is associated with a numerical value. The edge numerical values, or even the edges (504) themselves, are often referred to as “weights” or “parameters”. While training a neural network (500), numerical values are assigned to each edge (504). Additionally, every node (502) is associated with a numerical variable and an activation function. Activation functions are not limited to any functional class, but traditionally follow the form
$$A = f\left(\sum_{i \in (\text{incoming})} \left[(\text{node value})_i \, (\text{edge value})_i\right]\right),$$
where i is an index that spans the set of “incoming” nodes (502) and edges (504) and ƒ is a user-defined function. Incoming nodes (502) are those that, when viewed as a graph (as in FIG. 5), have directed arrows that point to the node (502) where the numerical value is being computed. Some functions for ƒ may include the linear function ƒ(x)=x, sigmoid function
$$f(x) = \frac{1}{1 + e^{-x}},$$
and rectified linear unit function ƒ(x)=max(0, x); however, many additional functions are commonly employed. Every node (502) in a neural network (500) may have a different associated activation function. Often, as a shorthand, activation functions are described by the function ƒ of which they are composed. That is, an activation function composed of a linear function ƒ may simply be referred to as a linear activation function without undue ambiguity.
When the neural network (500) receives an input, the input is propagated through the network according to the activation functions and incoming node (502) values and edge (504) values to compute a value for each node (502). That is, the numerical value for each node (502) may change for each received input. Occasionally, nodes (502) are assigned fixed numerical values, such as the value of 1, that are not affected by the input or altered according to edge (504) values and activation functions. Fixed nodes (502) are often referred to as “biases” or “bias nodes” (505), displayed in FIG. 5 with a dashed circle.
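The propagation of an input through the network can be sketched in a few lines of Python. The sketch below assumes a fully connected arrangement and appends a bias node fixed at 1 to every layer; the layer sizes and edge values are arbitrary illustrative numbers, not values from the disclosure.

```python
import math

def linear(x):
    return x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def node_value(incoming_values, edge_values, activation):
    """Apply the activation to the sum of (node value) * (edge value) over incoming edges."""
    return activation(sum(v * w for v, w in zip(incoming_values, edge_values)))

def forward(inputs, layers):
    """Propagate inputs through the layers.

    layers is a list of (weights, activation) pairs; weights[j] holds the edge
    values entering node j, the last entry multiplying a bias node fixed at 1.
    """
    values = list(inputs)
    for weights, activation in layers:
        incoming = values + [1.0]  # append the fixed bias node
        values = [node_value(incoming, w, activation) for w in weights]
    return values

# Two inputs, one hidden layer of two ReLU nodes, one linear output node.
hidden = ([[0.5, -1.0, 0.0], [1.0, 1.0, -1.0]], relu)
output = ([[1.0, 0.5, 0.25]], linear)
prediction = forward([1.0, 2.0], [hidden, output])
```

With these numbers the first hidden node is clipped to 0 by the ReLU, the second evaluates to 2, and the output node evaluates to 1.25.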
In some implementations, the neural network (500) may contain specialized layers (505), such as a normalization layer, or additional connection procedures, like concatenation. One skilled in the art will appreciate that these alterations do not exceed the scope of this disclosure.
As noted, the training procedure for the neural network (500) comprises assigning values to the edges (504). To begin training, the edges (504) are assigned initial values. These values may be assigned randomly, assigned according to a prescribed distribution, assigned manually, or by some other assignment mechanism. Once edge (504) values have been initialized, the neural network (500) may act as a function, such that it may receive inputs and produce an output. As such, at least one input is propagated through the neural network (500) to produce an output. Generally, a training dataset is provided to the neural network for training. The training dataset is composed of inputs and associated target(s), where the target(s) represent the “ground truth”, or the otherwise desired output. The neural network (500) output is compared to the associated input data target(s). The comparison of the neural network (500) output to the target(s) is typically performed by a so-called “loss function”, although other names for this comparison function such as “error function” and “cost function” are commonly employed. Many types of loss functions are available, such as the mean-squared-error function; however, the general characteristic of a loss function is that the loss function provides a numerical evaluation of the similarity between the neural network (500) output and the associated target(s). The loss function may also be constructed to impose additional constraints on the values assumed by the edges (504), for example, by adding a penalty term, which may be physics-based, or a regularization term. Generally, the goal of a training procedure is to alter the edge (504) values to promote similarity between the neural network (500) output and associated target(s) over the data set. Thus, the loss function is used to guide changes made to the edge (504) values, typically through a process called “backpropagation”.
While a full review of the backpropagation process exceeds the scope of this disclosure, a brief summary is provided. Backpropagation consists of computing the gradient of the loss function with respect to the edge (504) values. The gradient indicates the direction of change in the edge (504) values that results in the greatest increase of the loss function. Because the gradient is local to the current edge (504) values, the edge (504) values are typically updated by a “step” in the direction opposite to the gradient, so as to reduce the loss. The step size is often referred to as the “learning rate” and need not remain fixed during the training process. Additionally, the step size and direction may be informed by previously seen edge (504) values or previously computed gradients. Such methods for determining the step direction are usually referred to as “momentum” based methods.
Once the edge (504) values have been updated, or altered from their initial values, through a backpropagation step, the neural network (500) will likely produce different outputs. Thus, the procedure of propagating at least one input through the neural network (500), comparing the neural network (500) output with the associated target(s) with a loss function, computing the gradient of the loss function with respect to the edge (504) values, and updating the edge (504) values with a step guided by the gradient, is repeated until a termination criterion is reached. Common termination criteria are: reaching a fixed number of edge (504) updates, otherwise known as an iteration counter; a diminishing learning rate; noting no appreciable change in the loss function between iterations; reaching a specified performance metric as evaluated on the data or a separate hold-out dataset. Once the termination criterion is satisfied, and the edge (504) values are no longer intended to be altered, the neural network (500) is said to be “trained.”
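The repeated cycle of propagating inputs, evaluating a loss, computing its gradient, and updating the edge values can be sketched for the simplest possible case: a single linear node with one edge value `w` and one bias value `b`, trained with a mean-squared-error loss. The learning rate, iteration limit, and tolerance below are illustrative assumptions, and the gradient is written out by hand rather than obtained by a general backpropagation routine.

```python
def train_linear_node(xs, ys, learning_rate=0.05, max_iters=5000, tol=1e-12):
    """Fit y = w * x + b by gradient descent on the mean-squared error."""
    w, b = 0.0, 0.0  # initial edge and bias values
    n = len(xs)
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_iters):  # termination criterion: iteration counter
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        if abs(prev_loss - loss) < tol:  # termination criterion: no appreciable change
            break
        prev_loss = loss
        # Gradient of the loss with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, loss

# Synthetic data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b, final_loss = train_linear_node(xs, ys)
```

On this noise-free data the fitted values approach w = 2 and b = 1, and the loop exits on the loss-change criterion well before the iteration limit.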
According to one or more embodiments, the first model (406) is a classical model. The classical models are trained (or “fitted”) by sliding a window over the preprocessed multiphase flow rate dataset. The classical model will predict a current flow rate using the last N flow rate measurements, where N is a positive integer specified by a user.
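The sliding-window construction can be sketched as below. The helper names are hypothetical, and the moving-average predictor is used only as the simplest stand-in for a classical model that maps the last N measurements to the current flow rate.

```python
def sliding_windows(flow_rates, n_lags):
    """Build (inputs, targets): each target is preceded by its last n_lags measurements."""
    if n_lags < 1:
        raise ValueError("n_lags must be a positive integer")
    inputs, targets = [], []
    for t in range(n_lags, len(flow_rates)):
        inputs.append(flow_rates[t - n_lags:t])
        targets.append(flow_rates[t])
    return inputs, targets

def moving_average_prediction(window):
    """Predict the current flow rate as the mean of the window."""
    return sum(window) / len(window)

# Hypothetical single-phase flow rates with N = 3.
series = [10.0, 11.0, 12.0, 13.0, 14.0]
inputs, targets = sliding_windows(series, n_lags=3)
predictions = [moving_average_prediction(w) for w in inputs]
```

For a per-phase arrangement, the same construction could be applied to each phase separately or to windows that concatenate the past flow rates of all phases.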
According to one or more embodiments, the first model comprises a plurality of classical models, where each classical model is formed for a particular phase (water, oil, and gas). Each classical model can be informed by the past flow rates of its own phase or the past flow rates of all of the phases.
According to one or more embodiments, the first model (406) is a physics or statistical model, a machine-learned model, as described above, or a hybrid of a physics model and a machine-learned model. According to one or more embodiments, the first model (406) may comprise one or multiple models. Example physics models include exponential smoothing and Decline Curve Analysis (DCA). Example machine-learned models include Auto Regressive Integrated Moving Average (ARIMA) and its variants, and DeepAR.
While the description below gives details of some model architectures for the first model (406), it would be clear to the person of ordinary skill in the art that other model architectures and types could be used to implement the first model (406).
According to one or more embodiments, the first model (406) is an ARIMA model expressed as ARIMA (p,D,q), where the parameters p, D and q denote the structure of the forecasting model. The ARIMA model is a combination of auto-regression AR(p), moving average MA(q) and differencing degree D. In particular, an ARIMA model may include an autoregressive component AR(p) that captures the relationship between the value of a time series at a timestep and the values of the time series at one or more prior timesteps. The ARIMA model may further include differencing of degree D, in which the difference between the value of the time series at a timestep and the value of the time series at a previous timestep is computed D times. Finally, the ARIMA model may include a moving average component MA(q) that captures the relationship between the value of a time series at a timestep and the values of residual errors at prior timesteps, a residual error being defined as the difference between the exact value of the time series and the value predicted by the ARIMA model.
The mathematical formula of the ARIMA model, denoted ARIMA (p,D,q), can be described by the formula below.
$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right)(1 - L)^D \chi_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right)\epsilon_t$$
where L denotes the lag operator, φi are the parameters of the autoregressive part of the model, θi are the parameters of the MA part, χt is the value of the time series at timestep t, and ϵt are residual error terms.
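As a worked illustration, and assuming the lowest non-trivial orders p = 1, D = 1 and q = 1 (a choice made here for illustration only), the ARIMA formula can be expanded using the lag operator property that L raised to the power k shifts the series back by k timesteps:

```latex
\begin{aligned}
(1 - \varphi_1 L)(1 - L)\,\chi_t &= (1 + \theta_1 L)\,\epsilon_t \\
\left(1 - (1 + \varphi_1) L + \varphi_1 L^2\right)\chi_t &= \epsilon_t + \theta_1 \epsilon_{t-1} \\
\chi_t &= (1 + \varphi_1)\,\chi_{t-1} - \varphi_1\,\chi_{t-2} + \epsilon_t + \theta_1\,\epsilon_{t-1}
\end{aligned}
```

In this form, the current value is expressed through the two preceding values of the time series, the current residual error, and the preceding residual error.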
An autoregressive model, such as an ARIMA model, may be implemented within a machine-learned model as described above. According to an embodiment, the ARIMA model may be implemented by a recurrent neural network (RNN).
The first model (406) may be trained on a historical dataset comprising historical flow rate data, for example for a time period T1. Details of the training of models are described below with reference to FIG. 6.
The trained first model (406) predicts the first cleansed flow rate dataset (416) for the time period T1 as a cleaned replacement for abnormal portions of the preprocessed multiphase flow rate dataset (402a), which, for example, may comprise corrupted or missing data. Some modeling techniques (e.g., ARIMA) may provide additional forecasting information, such as prediction intervals at user-defined confidence levels.
According to one or more embodiments, the second model (408) is a virtual sensing model. According to one or more embodiments, the second model (408) may comprise one or more virtual sensing models. The second model (408) is developed to infer second cleansed flow rate dataset (420) from auxiliary data (422) acquired from field devices (418). The second model (408) may be a physics model such as simulation or empirical models, a machine learning model or a hybrid of the two. Example models include random forest regression, XGBoost regression, deep fully connected network (DFCN) regression, or long short-term memory (LSTM) regression.
The second model (408), operating as a virtual sensing model, is trained on historical auxiliary data, for example for time period T1, as input features and historical target flow rate data, for example at time T1, as ground truth labels. According to one or more embodiments, the historical auxiliary data is obtained from the field devices (418) on a pipeline (100). According to one or more embodiments, the historical target flow rate data is obtained from an MPFM (127).
The trained second model (408) represents the correlation between the target flow rates from the MPFM (127) and the auxiliary data from the field devices (418); thus, it reflects the implicit characteristics of the system (400) under observation. The trained second model (408) can then determine the second cleansed flow rate dataset (420) from the auxiliary data (422) and the preprocessed multiphase flow rate dataset (402a).
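The idea of inferring a flow rate from auxiliary data alone can be sketched with a k-nearest-neighbor regressor. This is a deliberately simple stand-in chosen so the example needs only the Python standard library; it is not one of the model types named in the disclosure (random forest, XGBoost, DFCN, LSTM). The feature choice (pressure, temperature) and all numeric values are hypothetical.

```python
import math

def knn_predict(train_features, train_targets, query, k=2):
    """Infer a flow rate as the mean target of the k historical samples
    whose auxiliary readings are closest (Euclidean distance) to the query."""
    distances = sorted(
        (math.dist(features, query), target)
        for features, target in zip(train_features, train_targets)
    )
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Historical auxiliary data for T1: (pressure, temperature) per sample,
# paired with the flow rate reported by the flowmeter at the same time.
aux_history = [(100.0, 60.0), (110.0, 62.0), (120.0, 64.0), (130.0, 66.0)]
flow_history = [1000.0, 1100.0, 1200.0, 1300.0]

# Auxiliary readings at a timestep whose flow measurement is missing or suspect.
inferred = knn_predict(aux_history, flow_history, query=(112.0, 62.0), k=2)
```

In practice the auxiliary features would typically be normalized first, since a raw distance is dominated by whichever sensor has the largest numeric range.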
As aforementioned, the first model (406) and second model (408) can extract information from raw multiphase flow rate data (402) and auxiliary data (422) to infer the first cleansed flow rate dataset (416) and the second cleansed flow rate dataset (420). However, since the data are noisy in practice, the first model (406) and the second model (408) built from this noisy data can only predict with some uncertainty. Meanwhile, the raw multiphase flow rate data (402) is subject to its own variance.
According to one or more embodiments, the third model (410) performs an average or a weighted average of the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420) to determine the high quality flow rate data (404). By performing an average, the output of the third model (410) may have lower variance than the output from the first model (406) or second model (408) only.
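The variance reduction obtained by averaging can be sketched as follows. The disclosure states only that an average or weighted average is used; the inverse-variance weighting below, and the assumption that the three estimates have independent errors with known variances, are illustrative choices.

```python
def inverse_variance_weights(variances):
    """Weights proportional to 1/variance, normalized to sum to 1."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]

def fuse(estimates, weights):
    """Weighted average of the estimates."""
    return sum(w * e for w, e in zip(weights, estimates))

def fused_variance(variances, weights):
    """Variance of the weighted average, valid only for independent errors."""
    return sum(w * w * v for w, v in zip(weights, variances))

# One timestep: preprocessed value, first-model output, second-model output,
# with hypothetical error variances for each.
estimates = [1000.0, 1010.0, 1020.0]
variances = [4.0, 2.0, 4.0]

weights = inverse_variance_weights(variances)
high_quality = fuse(estimates, weights)
combined_variance = fused_variance(variances, weights)
```

With these numbers the weights are 0.25, 0.5 and 0.25, the fused value is 1010, and the combined variance is 1.0, which is below the smallest individual variance of 2.0.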
According to one or more embodiments, the third model (410) is a guided model used to recover the robust estimation of the flow rate data, by incorporating the outputs of the first model (406) and the second model (408) and the preprocessed multiphase flow rate dataset (402a). The third model (410) utilizes the outputs of the first cleansed flow rate dataset (416) and the second cleansed flow rate dataset (420) to guide the estimation of the high quality flow rate data (404) from the preprocessed multiphase flow rate dataset (402a), so it can reduce the uncertainty and improve the robustness of flow rates estimation.
According to one or more embodiments, the third model (410) is a Gaussian mixture model or a neural network, trained to predict the flow rates estimation with optimal variance update.
According to one or more embodiments, a data quality control operation, such as a check of median values and variation, can be performed on the high quality flow rate data (404) before using the high quality flow rate data (404) in the VFM model building (424). This can be achieved by using certain criteria or another classification model, such as another machine-learned model (not shown in FIG. 4).
The high quality flow rate data (404) can be used to train the VFM model (125) during the VFM building stage (424). For example, the high quality flow rate data (404) may be used as target output data and sensor data, such as the auxiliary data (422), may be used as input data, where both the high quality flow rate data (404) and the auxiliary data (422) are for the time period T1. In this way, the VFM model (125) can be trained to infer multiphase flow rate data for a pipeline.
FIG. 6 depicts the general process of selecting and training a model, such as a machine-learned model, in accordance with one or more embodiments. The process shown in FIG. 6 may be applied to train the first model (406), the second model (408), the third model (410) or the VFM model (125).
To start, as shown in Block 602, modelling data is received. The modelling data consists of input and target pairs. For example, to train the first model (406), historical multiphase flow rate data may be provided, where the target is a given flow rate measurement from the historical multiphase flow rate data, and the inputs are a number N of flow measurements from the historical multiphase flow rate data obtained at a time before the target.
Likewise, to train the second model (408), the modelling data consists of pairs of historical auxiliary data inputs, such as those from field devices disposed on the pipeline, and the associated flow rate data, such as those measured by a MPFM, for a time period T1. In accordance with one or more embodiments, the auxiliary data inputs include, but are not limited to: wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, ESP frequency, and ESP motor current.
Likewise, to train the third model (410), the modelling data consists of pairs of historical flow rate data inputs for a time period T1 and the associated target phase flow rates.
Likewise, to train the VFM model (125), the modelling data consists of pairs of sensor data inputs and the associated high quality flow rate data (404) for the time period T1. In accordance with one or more embodiments, the sensor data inputs include, but are not limited to: wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, ESP frequency, and ESP motor current.
According to one or more embodiments, the same historical dataset may be used to train multiple models such as the first model (406), the second model (408), the third model (410) or the VFM model (125).
In other words, in general, the modelling data consists of the expected input and desired output for the machine-learned model. In accordance with one or more embodiments, the modelling data is acquired from one or more sensors, one or more field devices, or one or more MPFMs (127) disposed on a pipeline. Returning to FIG. 6, in one or more embodiments, the modelling data is preprocessed as depicted by Block 604. Preprocessing, at a minimum, comprises altering the modelling data so that it is suitable for use with machine-learned models, for example, by numericalizing categorical data or removing data entries with missing values. Other typical preprocessing methods are normalization and imputation. Information surrounding the preprocessing steps is saved for potential later use. For example, if normalization is performed then a computed mean vector and variance vector are retained. This allows future modelling data to be preprocessed identically. Values computed and retained during preprocessing are referred to herein as preprocessing parameters. One with ordinary skill in the art will recognize that a myriad of preprocessing methods beyond numericalization, removal of modelling data entries with missing values, normalization, and imputation exist. Descriptions of a select few preprocessing methods herein do not impose a limitation on the preprocessing steps encompassed by this disclosure.
As shown in Block 606, the modelling data is split into training, validation, and test sets. In some embodiments, the validation and test set may be the same such that the data is effectively only split into two distinct sets. In some instances, Block 606 may be performed before Block 604. In this case, it is common to determine the preprocessing parameters, if any, using the training set and then to apply these parameters to the validation and test sets.
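The practice of determining preprocessing parameters on the training set and reusing them on the validation and test sets can be sketched as below. The chronological (unshuffled) split and the 60/20/20 proportions are assumptions made for this time-series illustration; the disclosure does not prescribe them.

```python
import statistics

def chronological_split(data, train_frac=0.6, val_frac=0.2):
    """Split a time-ordered list into training, validation, and test sets."""
    n = len(data)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def fit_normalizer(train):
    """Compute the preprocessing parameters (mean, standard deviation) on the training set only."""
    mean = statistics.fmean(train)
    std = statistics.pstdev(train) or 1.0
    return mean, std

def apply_normalizer(values, params):
    """Apply previously computed preprocessing parameters to any set."""
    mean, std = params
    return [(x - mean) / std for x in values]

# Hypothetical modelling data, ordered in time.
data = [float(i) for i in range(10)]
train, val, test = chronological_split(data)

params = fit_normalizer(train)          # parameters come from the training set
train_scaled = apply_normalizer(train, params)
val_scaled = apply_normalizer(val, params)    # same parameters reused
test_scaled = apply_normalizer(test, params)  # same parameters reused
```

Retaining `params` also allows inputs received later, in production, to be preprocessed identically.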
In Block 608, one or more machine-learned model types and associated architectures are selected. For example, in one or more embodiments, the first model (406), the second model (408) and/or the third model (410) are composed of more than one model. In this case, each of the one or more models is selected. For each machine-learned model, once the machine-learned model type and hyperparameters have been selected, the machine-learned model is trained using the training set of the modelling data according to Block 610. Common training techniques, such as early stopping, adaptive or scheduled learning rates, and cross-validation may be used during training without departing from the scope of this disclosure.
During training, or once trained, the performance of the trained machine-learned model is evaluated using the validation set as depicted in Block 612. Recall that, in some instances, the validation set and test set are the same. Generally, performance is measured using a function which compares the predictions of the trained machine-learned model to the given targets. A commonly used comparison function is the mean-squared-error function, which quantifies the difference between the predicted value and the actual value when the predicted value is continuous. However, one with ordinary skill in the art will appreciate that many more comparison functions exist and may be used without limiting the scope of the present disclosure.
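As a minimal sketch of such a comparison function, the mean-squared error between predicted and target flow rates may be computed as shown below. The function name and the example values are hypothetical and serve only to illustrate the calculation.

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# Hypothetical predicted and measured flow rates (arbitrary units).
predicted = [10.0, 12.0, 11.0]
measured = [10.0, 14.0, 10.0]
mse = mean_squared_error(predicted, measured)  # (0 + 4 + 1) / 3
```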
Block 614 represents a decision: if the trained machine-learned model performance, as measured by a comparison function on the validation set (Block 612), is not suitable, the machine-learned model architecture may be altered (i.e., return to Block 608) and the training process is repeated. There are many ways to alter the machine-learned model architecture in search of suitable trained machine-learned model performance. These include, but are not limited to: selecting a new architecture from a previously defined set; randomly perturbing or randomly selecting new hyperparameters; using a grid search over the available hyperparameters; and intelligently altering hyperparameters based on the observed performance of previous models (e.g., a Bayesian hyperparameter search). Once suitable performance is achieved, the training procedure is complete and the generalization error of the trained machine-learned model(s) is estimated according to Block 616.
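One of the listed options, a grid search over the available hyperparameters, may be sketched as follows. The loop mirrors Blocks 608 through 614: a candidate is selected, trained, and evaluated on validation data, and the best candidate is kept. The moving-average predictor, the `window` hyperparameter, and the example series are assumptions chosen only to keep the sketch self-contained; they are not the models of this disclosure.

```python
import itertools


def grid_search(grid, train_fn, validate_fn, target_error=None):
    """Try every hyperparameter combination and keep the best on validation.

    `grid` maps hyperparameter names to lists of candidate values. If
    `target_error` is given, the search stops once a candidate reaches it.
    """
    best = None
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        hyperparameters = dict(zip(names, values))   # Block 608
        model = train_fn(hyperparameters)            # Block 610
        error = validate_fn(model)                   # Block 612
        if best is None or error < best[0]:
            best = (error, hyperparameters, model)
        if target_error is not None and error <= target_error:  # Block 614
            break
    return best


def train_moving_average(hyperparameters):
    """Hypothetical model: predict the mean of the last `window` values."""
    window = hyperparameters["window"]

    def predict(history):
        recent = history[-window:]
        return sum(recent) / len(recent)

    return predict


def validation_error(model, series=(10.0, 10.0, 10.0, 20.0, 20.0, 20.0, 20.0)):
    """Mean-squared error of one-step-ahead predictions on a toy series."""
    errors = [(model(list(series[:t])) - series[t]) ** 2
              for t in range(3, len(series))]
    return sum(errors) / len(errors)


best_error, best_hyperparameters, best_model = grid_search(
    {"window": [1, 2, 3]}, train_moving_average, validation_error
)
```

In this toy case the shortest window tracks the step change most closely and is therefore selected; with real modelling data the outcome would depend on the data and the candidate set.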
Generalization error is an indication of the trained machine-learned model's performance on new, or un-seen data. Typically, the generalization error is estimated using the comparison function, as previously described, using the modelling data that was partitioned into the test set.
As depicted in Block 618, the trained machine-learned model(s) is used “in production”—which means the trained machine-learned model(s) is used to process a received input without having a paired target for comparison. It is emphasized that the inputs received in the production setting, as well as for the validation and test sets, are preprocessed identically to the manner defined in Block 604 as denoted by the connection (622), represented as a dashed line in FIG. 6, between Blocks 618 and 604.
In accordance with one or more embodiments, the performance of the trained VFM model (125) is continuously monitored in the production setting. As an example, the VFM determined flow rates (420), which are the output of the VFM model (125), can be compared to the MPFM (127) determined phase flow rates. That is, the MPFM determined phase flow rates can be compared to the VFM determined phase flow rates from the VFM model (125) using a comparison function to monitor performance. If model performance is suspected to be degrading, the model may be updated. An update may include retraining the model, by reverting to Block 608, with the newly acquired modelling data from the in-production recorded values appended to the training data. An update may also include returning to Block 604 to recalculate any preprocessing parameters, again, after appending the newly acquired modelling data to the existing modelling data.
While the various blocks in FIG. 6 are presented and described sequentially, one of ordinary skill in the art will appreciate that some or all of the blocks may be executed in different orders, may be combined or omitted, and some or all of the blocks may be executed in parallel. Furthermore, the blocks may be performed actively or passively.
FIG. 7 illustrates an example of inferred flow rates in accordance with one or more embodiments. FIG. 7 plots the raw multiphase flow rate data (402), and the corresponding first cleansed flow rate data (416), second cleansed flow rate data (420) and high quality flow rate data (404).
FIG. 8 depicts a flowchart outlining a method, in accordance with one or more embodiments. FIG. 8 illustrates an example of determining high quality flow rate data for raw multiphase flow rate data, which may have been obtained for a first time period. It will be clear to a person of ordinary skill in the art that the method of FIG. 8 may be applied to raw multiphase flow rate data obtained over multiple time periods and for different phases. In Block 802, raw multiphase flow rate data (402) is obtained from a first set of sensors disposed on a pipeline of a well. As such, the first set of sensors are said to be those sensors that return the flow rate of one or more phases (e.g., oil, water, gas) flowing in a pipeline. In some embodiments, the first set of sensors includes a MPFM. As discussed above, the MPFM can include sensors, such as temperature and pressure sensors, that are used when determining the flow rates of one or more phases. Generally, in the context of the first set of sensors, intermediate measurements such as temperature and pressure (if made available by a MPFM) are not considered; only the determined flow rates are considered. That is, in some instances, a MPFM may give a user or operator access to internal sensor measurements in addition to the determined flow rates. In one or more embodiments, the flow rates are considered to be acquired using the first set of sensors (such as the MPFM as a whole) and the internal or intermediate sensor measurements can be considered as acquired from a second set of sensors, described below.
In Block 804, auxiliary data (422) is obtained from a second set of sensors or field devices (418) disposed on the pipeline. Sensors included in the second set of sensors can be disposed at any location on the pipeline. For example, these sensors can be installed at the surface (e.g., on a flowline) or as part of a PDHMS, where the PDHMS as described above consists of a plurality of sensors, gauges, and controllers to monitor subsurface flowing and shut-in pressures and temperatures. In general, the second set of sensors includes any sensor that does not directly return a flow rate. Examples of such sensors can be used to measure wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, electrical submersible pump (ESP) frequency, ESP motor current, and the like.
In Block 806, the raw multiphase flow rate data (402) is preprocessed to form a preprocessed multiphase flow rate dataset (402a). Preprocessing can include, for example, normalization and imputation.
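As a non-limiting illustration of imputation, the following sketch fills missing samples in a flow rate series by linear interpolation between the nearest valid neighbors. The choice of linear interpolation, the use of `None` to mark missing samples, and the example values are assumptions; this disclosure does not prescribe a particular imputation technique.

```python
def impute_linear(series):
    """Fill None gaps by linear interpolation.

    Gaps at the start or end of the series take the nearest valid value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid samples")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


# Hypothetical raw flow rate samples with dropouts marked as None.
raw = [None, 100.0, None, None, 130.0, None]
clean = impute_linear(raw)
```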
In Block 808, with a first model (406) processing the preprocessed multiphase flow rate dataset (402a), a first cleansed flow rate dataset (416) is determined. In one or more embodiments, the first model is a “classical model.” In general, a classical model is defined herein as a model that uses past flow rates to predict future flow rates. For example, the first model can be an ARIMA model that uses a predefined number of past flow rates to predict one or more future flow rates.
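To illustrate the idea of predicting future flow rates from past flow rates, the sketch below fits a first-order autoregressive model, x_t = c + phi * x_(t-1), by ordinary least squares and then forecasts ahead. This is a deliberately simplified stand-in (an AR(1) model corresponds to ARIMA(1,0,0)) and is not a full ARIMA implementation; the function names and the synthetic series are assumptions.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_(t-1); returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("at least three samples are required")
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0.0:
        # Constant history: fall back to predicting the mean.
        return mean_y, 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, c, phi, steps=1):
    """Iterate the fitted recursion forward from the last observed value."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Synthetic flow rates generated by x_t = 2 + 0.5 * x_(t-1), for checking only.
flow = [10.0]
for _ in range(9):
    flow.append(2.0 + 0.5 * flow[-1])

c, phi = fit_ar1(flow)
next_rates = forecast(flow, c, phi, steps=2)
```

Because the synthetic series follows the recursion exactly, the fit recovers the generating coefficients; measured flow rates would generally yield only approximate fits.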
In Block 810, with a second model (408) processing the preprocessed multiphase flow rate dataset (402a) and the auxiliary data (422), a second cleansed flow rate dataset (420) is determined. In one or more embodiments, the second model is a virtual sensing model. In general, a virtual sensing model infers the second cleansed flow rate dataset (420) from auxiliary data (422) acquired from field devices (418). The second model (408) may be a physics model, such as a simulation or empirical model, a machine learning model, or a hybrid of the two.
In Block 812, with a third model (410) processing the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420), high quality flow rate data (404) is determined. In one or more embodiments, the third model is an aggregation function (e.g., a weighted average, a median, etc.), where the preprocessed multiphase flow rate dataset (402a), the first cleansed flow rate dataset (416), and the second cleansed flow rate dataset (420) are aggregated (e.g., averaged) to form the high quality flow rate data (404). In other embodiments, the third model is a Gaussian mixture model.
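A minimal sketch of the aggregation-function embodiment is given below, combining the three datasets sample by sample with either a weighted average or a median. The function signature, the default equal weights, and the example values are hypothetical; the Gaussian mixture model embodiment is not shown.

```python
from statistics import median


def aggregate(preprocessed, first_cleansed, second_cleansed,
              method="weighted_average", weights=(1.0, 1.0, 1.0)):
    """Combine three aligned flow rate series into one series."""
    if not (len(preprocessed) == len(first_cleansed) == len(second_cleansed)):
        raise ValueError("all three series must have the same length")
    if method not in ("weighted_average", "median"):
        raise ValueError(f"unknown method: {method}")
    total = sum(weights)
    if method == "weighted_average" and total <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = []
    for values in zip(preprocessed, first_cleansed, second_cleansed):
        if method == "median":
            fused.append(median(values))
        else:
            fused.append(sum(w * v for w, v in zip(weights, values)) / total)
    return fused


# Hypothetical aligned samples; the second preprocessed sample is a spike.
preprocessed = [10.0, 50.0, 10.5]
first_cleansed = [10.2, 10.4, 10.6]
second_cleansed = [9.8, 10.0, 10.4]

averaged = aggregate(preprocessed, first_cleansed, second_cleansed)
robust = aggregate(preprocessed, first_cleansed, second_cleansed, method="median")
```

In this example the median is unaffected by the spike in the preprocessed series, whereas the equally weighted average is pulled toward it; which behavior is preferable depends on the application.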
In Block 814, the high quality flow rate data (404) is transmitted to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data (404). For example, based on the high quality flow rate data (404), a choke valve of a pipeline can be adjusted to optimize production from the well.
As another example, the high quality flow rate data (404) can be used as a target to train, using a supervised learning process, a virtual flow meter (VFM) model (125), where a paired input and target consists of an input data point comprised by the auxiliary data and an associated high quality flow rate datapoint comprised by the high quality flow rate data (404). Once trained, the VFM model (125) can be used to predict flow rate data based on newly obtained auxiliary data. Additionally, the trained VFM model (125) can be used to determine whether there is a malfunction in the first set of sensors (e.g., MPFM) based on the predicted flow rate data. Then the predicted flow rate data (from the VFM model (125)) can be used instead of newly obtained raw multiphase flow rate data (from the first set of sensors) in response to the determination that there is a malfunction in the first set of sensors. Such a malfunction or fault can be determined by comparing one or more flow rates determined using the first set of sensors (e.g., MPFM) to the predicted flow rate data produced by the trained VFM model (125). In instances where there is a significant difference (e.g., absolute error) between the predicted flow rate data from the VFM model (125) and the multiphase flow rate data from the first set of sensors, in one or more embodiments, the first set of sensors can be said to contain a malfunction or fault. In one or more embodiments, the absolute error is compared to a predefined threshold, where an absolute error exceeding the predefined threshold indicates a malfunction or fault in the first set of sensors (e.g., MPFM).
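The threshold comparison described above may be sketched as follows. For each sample, the absolute error between the measured flow rate and the predicted flow rate is compared to a predefined threshold, and the predicted value is substituted when a fault is indicated. The function name, the threshold value, and the example data are hypothetical.

```python
def check_and_select(measured, predicted, threshold):
    """Return (selected_value, fault_flag) pairs for aligned samples.

    A fault is flagged when the absolute error exceeds `threshold`; in that
    case the predicted value replaces the measured one.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    results = []
    for m, p in zip(measured, predicted):
        fault = abs(m - p) > threshold
        results.append((p if fault else m, fault))
    return results


# Hypothetical MPFM readings and VFM predictions; the third reading drops out.
mpfm_rates = [100.0, 102.0, 0.0, 101.0]
vfm_rates = [101.0, 101.0, 100.0, 100.0]
selected = check_and_select(mpfm_rates, vfm_rates, threshold=10.0)
```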
FIG. 9 further depicts a block diagram of a computer system (902) (e.g., the pressure control system) used to provide computational functionalities associated with the methods, functions, processes, flows, and procedures as described in this disclosure, according to one or more embodiments. The illustrated computer (902) is intended to encompass any computing device such as a server, desktop computer, laptop/notebook computer, wireless data port, smart phone, personal data assistant (PDA), tablet computing device, one or more processors within these devices, or any other suitable processing device, including both physical or virtual instances (or both) of the computing device. Additionally, the computer (902) may include a computer that includes an input device, such as a keypad, keyboard, touch screen, or other device that can accept user information, and an output device that conveys information associated with the operation of the computer (902), including digital data, visual, or audio information (or a combination of information), or a GUI.
The computer (902) can serve in a role as a client, network component, a server, a database or other persistency, or any other component (or a combination of roles) of a computer system for performing the subject matter described in the instant disclosure. In some implementations, one or more components of the computer (902) may be configured to operate within environments, including cloud-computing-based, local, global, or other environment (or a combination of environments).
At a high level, the computer (902) is an electronic computing device operable to receive, transmit, process, store, or manage data and information associated with the described subject matter. According to some implementations, the computer (902) may also include or be communicably coupled with an application server, e-mail server, web server, caching server, streaming data server, business intelligence (BI) server, or other server (or a combination of servers).
The computer (902) can receive requests over network (930) from a client application (for example, executing on another computer (902)) and respond to the received requests by processing the said requests in an appropriate software application. In addition, requests may also be sent to the computer (902) from internal users (for example, from a command console or by other appropriate access method), external or third-parties, other automated applications, as well as any other appropriate entities, individuals, systems, or computers.
Each of the components of the computer (902) can communicate using a system bus (903). In some implementations, any or all of the components of the computer (902), both hardware or software (or a combination of hardware and software), may interface with each other or the interface (904) (or a combination of both) over the system bus (903) using an application programming interface (API) (912) or a service layer (913) (or a combination of the API (912) and service layer (913)). The API (912) may include specifications for routines, data structures, and object classes. The API (912) may be either computer-language independent or dependent and refer to a complete interface, a single function, or even a set of APIs. The service layer (913) provides software services to the computer (902) or other components (whether or not illustrated) that are communicably coupled to the computer (902). The functionality of the computer (902) may be accessible for all service consumers using this service layer. Software services, such as those provided by the service layer (913), provide reusable, defined business functionalities through a defined interface. For example, the interface may be software written in JAVA, C++, or other suitable language providing data in extensible markup language (XML) format or another suitable format. While illustrated as an integrated component of the computer (902), alternative implementations may illustrate the API (912) or the service layer (913) as stand-alone components in relation to other components of the computer (902) or other components (whether or not illustrated) that are communicably coupled to the computer (902). Moreover, any or all parts of the API (912) or the service layer (913) may be implemented as child or sub-modules of another software module, enterprise application, or hardware module without departing from the scope of this disclosure.
The computer (902) includes an interface (904). Although illustrated as a single interface (904) in FIG. 9, two or more interfaces (904) may be used according to particular needs, desires, or particular implementations of the computer (902). The interface (904) is used by the computer (902) for communicating with other systems in a distributed environment that are connected to the network (930). Generally, the interface (904) includes logic encoded in software or hardware (or a combination of software and hardware) and operable to communicate with the network (930). More specifically, the interface (904) may include software supporting one or more communication protocols associated with communications such that the network (930) or interface's hardware is operable to communicate physical signals within and outside of the illustrated computer (902).
The computer (902) includes at least one computer processor (905). Although illustrated as a single computer processor (905) in FIG. 9, two or more processors may be used according to particular needs, desires, or particular implementations of the computer (902). Generally, the computer processor (905) executes instructions and manipulates data to perform the operations of the computer (902) and any algorithms, methods, functions, processes, flows, and procedures as described in the instant disclosure.
The computer (902) also includes a memory (906) that holds data for the computer (902) or other components (or a combination of both) that can be connected to the network (930). The memory may be a non-transitory computer readable medium. For example, memory (906) can be a database storing data consistent with this disclosure. Although illustrated as a single memory (906) in FIG. 9, two or more memories may be used according to particular needs, desires, or particular implementations of the computer (902) and the described functionality. While memory (906) is illustrated as an integral component of the computer (902), in alternative implementations, memory (906) can be external to the computer (902).
The application (907) is an algorithmic software engine providing functionality according to particular needs, desires, or particular implementations of the computer (902), particularly with respect to functionality described in this disclosure. For example, application (907) can serve as one or more components, modules, applications, etc. Further, although illustrated as a single application (907), the application (907) may be implemented as multiple applications (907) on the computer (902). In addition, although illustrated as integral to the computer (902), in alternative implementations, the application (907) can be external to the computer (902).
There may be any number of computers (902) associated with, or external to, a computer system containing computer (902), wherein each computer (902) communicates over network (930). Further, the term “client,” “user,” and other appropriate terminology may be used interchangeably as appropriate without departing from the scope of this disclosure. Moreover, this disclosure contemplates that many users may use one computer (902), or that one user may use multiple computers (902).
Although only a few example embodiments have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the example embodiments without materially departing from this invention. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the following claims.
1. A method, comprising:
obtaining raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well;
obtaining auxiliary data from a second set of sensors disposed on the pipeline;
preprocessing the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset;
determining, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset;
determining, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset;
determining, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data; and
transmitting the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.
2. The method of claim 1, further comprising:
training, using a supervised learning process with a plurality of paired inputs and targets, a virtual flow meter (VFM) model, wherein a paired input and target consists of an input data point comprised by the auxiliary data and an associated high quality flow rate datapoint comprised by the high quality flow rate data.
3. The method of claim 2, further comprising:
determining, with the trained VFM model, predicted flow rate data based on newly obtained auxiliary data;
determining whether there is a malfunction in the first set of sensors based on the predicted flow rate data; and
using the predicted flow rate data instead of newly obtained raw multiphase flow rate data in response to the determination that there is a malfunction in the first set of sensors.
4. The method of claim 2, further comprising:
determining, with the trained VFM model, predicted flow rate data based on newly obtained auxiliary data; and
adjusting one or more devices disposed on the pipeline based on the predicted flow rate data so as to optimize production of the well.
5. The method of claim 1, wherein operation of the well is controlled by adjusting one or more devices disposed on the pipeline.
6. The method of claim 1,
wherein the auxiliary data comprises one of wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, electrical submersible pump (ESP) frequency, and ESP motor current.
7. The method of claim 1, wherein at least one of the first model, the second model and the third model is a machine-learned model.
8. The method of claim 1, wherein the first model is an auto regressive integrated moving average (ARIMA) model.
9. The method of claim 1, wherein the second model is a physics model, a machine-learned model, or a hybrid of a physics model and a machine-learned model.
10. The method of claim 1, wherein the third model is a Gaussian mixture model.
11. A system comprising:
a first set of sensors disposed on a pipeline of a well;
a second set of sensors disposed on the pipeline;
a set of models, comprising a first model, a second model and a third model; and
a computer configured to:
obtain raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well;
obtain auxiliary data from a second set of sensors disposed on the pipeline;
preprocess the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset;
determine, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset;
determine, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset;
determine, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data; and
transmit the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.
12. The system of claim 11, the computer further configured to:
train, using a supervised learning process with a plurality of paired inputs and targets, a virtual flow meter (VFM) model, wherein a paired input and target consists of an input data point comprised by the auxiliary data and an associated high quality flow rate datapoint comprised by the high quality flow rate data.
13. The system of claim 12, the computer further configured to:
determine, with the trained VFM model, predicted flow rate data based on newly obtained auxiliary data;
determine whether there is a malfunction in the first set of sensors based on the predicted flow rate data; and
use the predicted flow rate data instead of newly obtained raw multiphase flow rate data in response to the determination that there is a malfunction in the first set of sensors.
14. The system of claim 12, the computer further configured to:
determine, with the trained VFM model, predicted flow rate data based on newly obtained auxiliary data; and
adjust one or more devices disposed on the pipeline based on the predicted flow rate data so as to optimize production of the well.
15. The system of claim 11, wherein operation of the well is controlled by adjusting one or more devices disposed on the pipeline.
16. The system of claim 11,
wherein the auxiliary data comprises one of wellhead pressure, upstream wellhead temperature, downstream wellhead pressure, Venturi differential pressure, choke valve position, electrical submersible pump (ESP) frequency, and ESP motor current.
17. The system of claim 11, wherein the first model is an auto regressive integrated moving average (ARIMA) model.
18. The system of claim 11, wherein the second model is a physics model, a machine-learned model, or a hybrid of a physics model and a machine-learned model.
19. The system of claim 11, wherein the third model is a Gaussian mixture model.
20. A non-transitory machine-readable medium comprising a plurality of machine-readable instructions executed by one or more processors, the plurality of machine-readable instructions causing the one or more processors to perform a method comprising:
obtaining raw multiphase flow rate data from a first set of sensors disposed on a pipeline of a well;
obtaining auxiliary data from a second set of sensors disposed on the pipeline;
preprocessing the raw multiphase flow rate data to form a preprocessed multiphase flow rate dataset;
determining, with a first model processing the preprocessed multiphase flow rate dataset, a first cleansed flow rate dataset;
determining, with a second model processing the preprocessed multiphase flow rate dataset and the auxiliary data, a second cleansed flow rate dataset;
determining, with a third model processing the preprocessed multiphase flow rate dataset, the first cleansed flow rate dataset, and the second cleansed flow rate dataset, high quality flow rate data; and
transmitting the high quality flow rate data to, at least, a control system of the well, wherein operation of the well is based on the high quality flow rate data.