US20260009329A1
2026-01-08
18/765,034
2024-07-05
Methods and systems for predicting total organic carbon (TOC) throughout a well. The method includes deploying a logging tool in the well and obtaining logging data from the well using the logging tool, where the logging data has one or more data values at each depth in a set of depths. The method further includes determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data and determining, with a second machine learning model, TOC at each depth in the set of depths based on the logging data and elemental data. The method further includes determining a hydrocarbon content of the well based on the determined TOC and executing a well operation plan based on the hydrocarbon content.
E21B49/0875 » CPC main
Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells; Obtaining fluid samples or testing fluids, in boreholes or wells; Well testing, e.g. testing for reservoir productivity or formation parameters determining specific fluid parameters
E21B2200/20 » CPC further
Special features related to earth drilling for obtaining oil, gas or water Computer models or simulations, e.g. for reservoirs under production, drill bits
E21B2200/22 » CPC further
Special features related to earth drilling for obtaining oil, gas or water Fuzzy logic, artificial intelligence, neural networks or the like
E21B49/08 IPC
Testing the nature of borehole walls; Formation testing; Methods or apparatus for obtaining samples of soil or well fluids, specially adapted to earth drilling or wells Obtaining fluid samples or testing fluids, in boreholes or wells
The present disclosure relates to systems and methods for predicting hydrocarbon well characteristics. Particularly, the disclosure relates to determining total organic carbon (TOC) using a nested machine learning-based workflow using both wireline data and elemental data. The TOC is used to estimate source rock richness, net thickness, and hydrocarbon capacity and production from one or more wells of an oil and gas field.
Petroleum source rock may be any rock with sufficient organic matter content to generate and release enough hydrocarbons to form a commercial accumulation of oil or gas. Source rocks commonly include shales and limestones/mudstones. Estimation of source rock thickness, net thickness, and hydrocarbon capacity and production is important for petroleum exploration. However, calculation of these quantities is often based on a measurement of total organic carbon (TOC) obtained through laboratory analysis of a rock sample (e.g., core), where such rock samples are taken from a borehole infrequently and/or only in select well intervals. Further, rock samples are sometimes contaminated in that prior processes, such as the use of drilling fluid in the wellbore, alter the measured TOC to an incorrect value. The use of infrequent and possibly erroneous TOC measurements results in inaccurate estimates of source rock thickness, net thickness, and hydrocarbon generation, negatively affecting hydrocarbon exploration, well planning, and oil and gas field production forecasts.
This summary is provided to introduce a selection of concepts that are further described below in the detailed description. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in limiting the scope of the claimed subject matter.
In one aspect, embodiments disclosed herein relate to a method for predicting total organic carbon (TOC) throughout a well. The method includes deploying a logging tool in the well and obtaining logging data from the well using the logging tool, where the logging data has one or more data values at each depth in a set of depths. The method further includes determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data and determining, with a second machine learning model, TOC at each depth in the set of depths based on the logging data and elemental data. The method further includes determining a hydrocarbon content of the well based on the determined TOC and executing a well operation plan based on the hydrocarbon content.
In one aspect, embodiments disclosed herein relate to a non-transitory computer-readable memory with computer-executable instructions stored thereon that, when executed on a processor, cause the processor to perform various steps. The steps include obtaining logging data from a well using a logging tool deployed in the well, where the logging data has one or more data values at each depth in a set of depths. The steps further include determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data and determining, with a second machine learning model, total organic carbon (TOC) at each depth in the set of depths based on the logging data and elemental data. The steps further include determining a hydrocarbon content of the well based on the determined TOC.
In one aspect, embodiments disclosed herein relate to a system including a logging tool configured to obtain logging data from a well, a computer processor, and a non-transitory computer readable medium. The non-transitory computer readable medium stores instructions that when executed by the computer processor cause the processor to perform various operations. The operations include obtaining logging data from the well using the logging tool, where the logging data has one or more data values at each depth in a set of depths. The operations further include determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data and determining, with a second machine learning model, total organic carbon (TOC) at each depth in the set of depths based on the logging data and elemental data. The operations further include determining a hydrocarbon content of the well based on the determined TOC.
Other aspects and advantages of the claimed subject matter will be apparent from the following description and the appended claims.
Specific embodiments of the disclosed technology will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
FIG. 1 depicts a drilling operation in accordance with one or more embodiments.
FIG. 2 shows an example of logging data, elemental data, and total organic carbon (TOC) data in accordance with one or more embodiments.
FIG. 3 depicts a system in accordance with one or more embodiments.
FIG. 4 depicts a flowchart in accordance with one or more embodiments.
FIG. 5 depicts an example of identifying and removing contaminated samples from TOC data in accordance with one or more embodiments.
FIG. 6 depicts a neural network in accordance with one or more embodiments.
FIG. 7 depicts a flowchart in accordance with one or more embodiments.
FIG. 8 depicts an example of predicted elemental data in accordance with one or more embodiments.
FIG. 9 shows a computer system in accordance with one or more embodiments.
In the following detailed description of embodiments of the disclosure, numerous specific details are set forth in order to provide a more thorough understanding of the disclosure. However, it will be apparent to one of ordinary skill in the art that the disclosure may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
Throughout the application, ordinal numbers (e.g., first, second, third, etc.) may be used as an adjective for an element (i.e., any noun in the application). The use of ordinal numbers is not to imply or create any particular ordering of the elements nor to limit any element to being only a single element unless expressly disclosed, such as using the terms “before”, “after”, “single”, and other such terminology. Rather, the use of ordinal numbers is to distinguish between the elements. By way of an example, a first element is distinct from a second element, and the first element may encompass more than one element and succeed (or precede) the second element in an ordering of elements.
In the following description of FIGS. 1-9, any component described with regard to a figure, in various embodiments disclosed herein, may be equivalent to one or more like-named components described with regard to any other figure. For brevity, descriptions of these components will not be repeated with regard to each figure. Thus, each and every embodiment of the components of each figure is incorporated by reference and assumed to be optionally present within every other figure having one or more like-named components. Additionally, in accordance with various embodiments disclosed herein, any description of the components of a figure is to be interpreted as an optional embodiment which may be implemented in addition to, in conjunction with, or in place of the embodiments described with regard to a corresponding like-named component in any other figure.
It is to be understood that the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a well” includes reference to one or more such wells.
Terms such as “approximately,” “substantially,” etc., mean that the recited characteristic, parameter, or value need not be achieved exactly, but that deviations or variations, including for example, tolerances, measurement error, measurement accuracy limitations and other factors known to those of skill in the art, may occur in amounts that do not preclude the effect the characteristic was intended to provide.
It is to be understood that one or more of the steps shown in the flowcharts may be omitted, repeated, and/or performed in a different order than the order shown. Accordingly, the scope disclosed herein should not be considered limited to the specific arrangement of steps shown in the flowcharts.
The subject matter of the dependent claims of one or more embodiments may be combined with other dependent claims except where otherwise contradictory.
Embodiments disclosed herein relate to a system including machine learning models, and a method of its use, to determine total organic carbon (TOC) associated with the subsurface of the Earth or rock proximate a well.
Traditional measurements of TOC require a laboratory analysis of a core (e.g., rock sample) extracted from the well. Thus, the laboratory analysis provides a TOC measurement for each extracted core and relates each TOC measurement to the depth at which its corresponding core was extracted. Additionally, extracted cores are sometimes contaminated by prior processes, such as the use of drilling fluid in the wellbore, such that the measured TOC is erroneous. The use of infrequent (i.e., few cores, low spatial resolution) and possibly erroneous TOC measurements results in inaccurate estimates of source rock thickness, net thickness, and hydrocarbon generation. Inaccurate estimates negatively affect hydrocarbon exploration, well planning, and oil and gas field production forecasts.
As disclosed herein, TOC data is determined using logging data acquired with a logging tool such as a wireline tool and/or logging-while-drilling tool. The TOC data is determined at the same spatial sampling as the logging data. That is, logging data is acquired at a set of depths and for each depth in the set of depths the TOC is determined.
Logging data is acquired in greater abundance (i.e., at more depths) than cored samples from a well. Thus, embodiments disclosed herein provide, as a benefit, the determination of TOC at more depths (or locations in a well) compared to TOC measurements using extracted cores. In other words, TOC data determined as disclosed herein have a high spatial resolution compared to TOC measurements acquired through conventional coring and subsequent laboratory analysis. The TOC data, having a high spatial resolution, is used to accurately determine source rock thickness, net thickness, and hydrocarbon capacity and production.
A general overview of the subsurface activities associated with a drilling process is provided in FIG. 1. For brevity, above-surface equipment, or other offshore rig platform and equipment, used in a drilling operation is not depicted, as well sites may be configured in many ways. However, exclusion of well site configurations is not intended to be limiting. As seen, a drilling operation at a well site may include drilling a wellbore (102) into a subsurface region (106) including various formations. To drill a new section of wellbore (102), typically, a drill bit (110) with a drilling fluid nozzle is connected to the down-hole end of a drill string (108), which is a series of drill pipes connected to form a conduit, and is rotated from the surface (104) while pushing the drill bit (110) against the rock, forming a wellbore (102) in the ground and through the subsurface (106). In some implementations, the drill bit (110) may be rotated by a combined effect of surface rotation and a down-hole drilling motor (not shown).
While cutting rock with a drill bit (110), typically, a drilling fluid (112) is circulated (with a pump) through the drill string (108), out of the drilling fluid nozzle of the drill bit (110), and back to the surface (104) through the presumably annular space between the wellbore (102) and the drill string (108). Moreover, the drill string (108) may contain a bottom hole assembly (BHA) (114) disposed at the distal end, or down-hole portion, of the conduit. To guide the drill bit (110), monitor the drilling process, and collect data about the subsurface (106) formations, among other objectives, the BHA (114) of the drill string (108) may be outfitted with “logging-while-drilling” (LWD) tools, “measurement-while-drilling” (MWD) tools, and a telemetry module. An MWD or LWD tool is generally a sensor, or measuring device, which collects information in an associated log during the drilling process. The measurements and/or logs may be transmitted to the surface (104) using any suitable telemetry system known in the art. The BHA (114) and the drill string (108) may contain other drilling tools known in the art but not specifically stated. By way of example, common logs, or information collected by LWD tools, may include, but are not limited to, the density of the subsurface (106) formation, the effective porosity of the subsurface (106) formation, and temperature.
Depending on the depth of hydrocarbon bearing formation and other geological complexes, a well can have several hole sizes before it reaches its target depth. A steel pipe, or casing (109), may be lowered in each hole and a cement slurry may be pumped from the bottom up through the presumably annular space between the casing (109) and the wellbore (102) to fix the casing (109), seal the wellbore (102) from the surrounding subsurface (106) formations, and ensure proper well integrity throughout the lifecycle of the well.
Upon finishing drilling the wellbore (102), the well may undergo a “completions” process to facilitate accessibility to the well and access the desired hydrocarbons. In some implementations, the final wellbore (102) can be completed using either cased and cemented pipe, which is later perforated to access the hydrocarbon, or it may be completed using a multi-stage open-hole packers assembly.
The open-hole space between sets of packers can be stimulated through an injection port in each stage allowing the hydrocarbons to flow to the surface. In the case of poor cementation, or improper sealing of packers with surrounding subsurface (106) formations, during completions, the well integrity can become compromised leading to the loss of the well or negative environmental impacts.
After drilling, a wireline tool may be deployed in the wellbore to collect data about the subsurface (106) formations similar to an MWD or LWD tool. That is, a wireline tool comprises one or more sensors, or measuring devices, that collect information about a surrounding subsurface (106).
LWD and MWD tools can acquire data continuously during the drilling process that can be used to inform real-time or near real-time drilling decisions (e.g., geosteering). However, the LWD and MWD tools may be bandwidth restricted such that a wireline tool can provide data with greater density (i.e., more data at each depth or data sampled with higher spatial resolution).
As discussed below, embodiments disclosed herein use “logging data.” Logging data can be acquired using any combination of LWD, MWD, and wireline tools. That is, in one or more embodiments, the logging data is acquired using a LWD tool. In other embodiments, the logging data is acquired using a wireline tool.
In one aspect, embodiments disclosed herein relate to a method for determining TOC data using logging data. More specifically, one or more embodiments disclosed herein predict TOC at a given depth in the wellbore by first predicting elemental data at the given depth using a first machine learning model operating on the logging data at the given depth and then predicting TOC at the given depth using both the logging data and the predicted elemental data.
FIG. 2 provides an example of some logs that may be included in logging data (215). Some logs, such as a gamma ray log (202), a directional photoelectric log (direction right shown) (204), effective porosity log (206), and bulk density log (208), are shown; however, many more logs may be acquired using one or more of a LWD, MWD, and wireline tool and included in the logging data (215). Additional logs may include directional density logs and sonic logs, such as compressional and shear sonic logs. Each log is a record of log values (210) at an associated well depth (212). Here, it is noted that the term well depth (212), or more simply the depth of the wellbore (102), refers to the distance along the wellbore (102) and does not necessarily correspond with the orthogonal distance from the surface (104), where the orthogonal distance is measured along an axis oriented perpendicular to the surface (104) and is also known as the true vertical depth. By way of example, a portion of a wellbore (102) may be oriented horizontally, or parallel to the surface (104), such that its orthogonal distance remains fixed over the horizontal portion; however, the well depth (212) measures the distance along the wellbore (102) and does not remain constant over any horizontal portion of the wellbore (102). Additionally, the well depth (212) is continuous and strictly monotonically increasing as directed from the surface (104) to the most down-hole portion of the wellbore (102) even if the orthogonal distance, or true vertical depth, decreases.
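The distinction between well depth (212) and true vertical depth can be illustrated with a minimal sketch. The sketch is not part of the disclosed method. It assumes a hypothetical directional survey given as (well depth, inclination from vertical in degrees) stations, assumes the first station lies at the surface, and uses a simple tangential approximation in which each segment takes the inclination of its lower station. The function name `true_vertical_depth` and the survey values are illustrative only.

```python
import math


def true_vertical_depth(stations):
    """Approximate true vertical depth (TVD) at each survey station.

    stations: list of (well_depth, inclination_deg) pairs sorted by well
    depth. Assumption: the first station is at the surface (TVD = 0) and
    each segment is a straight line at the inclination of its lower station.
    """
    tvd = [0.0]
    for (md_prev, _), (md, inc_deg) in zip(stations, stations[1:]):
        segment_length = md - md_prev
        tvd.append(tvd[-1] + segment_length * math.cos(math.radians(inc_deg)))
    return tvd


# Hypothetical survey: vertical, then horizontal, then climbing slightly.
survey = [(0.0, 0.0), (1000.0, 0.0), (2000.0, 90.0), (3000.0, 95.0)]
tvd = true_vertical_depth(survey)
well_depths = [md for md, _ in survey]
```

In this example the well depth increases at every station, while the approximated true vertical depth stays essentially fixed over the horizontal segment and decreases over the final segment, where the inclination exceeds 90 degrees.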
FIG. 2 also depicts an example of elemental data values (220), specifically observed or laboratory-measured values (represented with solid diamond markers) and determined elemental data (“elemental data”) (240) using the system and method disclosed herein. Some elements are known to have high affinity to source rock depositional environment (e.g., anoxic/dysoxic). Examples of such elements are: Al, Fe, Ti, Ca, Mn, Si, Mg, K, P, Mo, Ni, U, Th, and V. In accordance with one or more embodiments, extracted cores from a wellbore are analyzed to determine the concentration, relative concentration, or presence of one or more elements such as those listed above. Thus, observed or laboratory-measured elemental data includes a measurement of concentration, relative concentration, or presence of one or more elements at one or more depths in a wellbore (corresponding to the depth/location of the extracted core). In the example of FIG. 2, data points (solid diamond markers) depict the quantity (measured from “low” (L) to “high” (H) using a given scale for each element) of Ni and Zn at three depths (212). Additional elements can be included without departing from the scope of this disclosure. As seen, the observed or laboratory-measured elemental data is sparse compared to the logging data (215) because the measurements are performed on extracted cores that are sampled infrequently compared to the spatial sampling resolution of the logging data (215). In one or more embodiments, observed elemental data is determined for a core using inductively coupled plasma-mass spectrometry (ICP-MS), x-ray fluorescence (XRF), or the like. One illustrative machine for performing ICP-MS is the PE SCIEX ELAN 6000 ICP-MS system. To obtain XRF data, a rock sample may be pressed using a hydraulic pressing machine, and x-ray data obtained. One illustrative machine for XRF processing is the BRUKER S8 TIGER.
FIG. 2 also depicts TOC values (230). Specifically, TOC measurements (solid diamond markers) and determined TOC data (“TOC data”) (245). TOC data points (solid diamond markers) depict measured TOC using cores or rock samples extracted at various depths. Similar to the observed or laboratory-measured elemental data, TOC data points are sparse compared to the logging data (215) because the measurements are performed on extracted cores that are sampled infrequently compared to the spatial sampling resolution of the logging data (215). Laboratory analysis to measure TOC using an extracted core can include a pyrolysis method. This can be done, for example, using the Delsi-Nermag Rock Eval II Plus TOC module. A pyrolysis method may include heating a sample in an inert atmosphere (such as helium) to determine the free hydrocarbons and hydrocarbon- and oxygen-containing compounds (such as CO2) that are volatilized during the cracking of the kerogen.
The term “sampled well intervals” or “sampled intervals of a well” refers to well intervals from which rock samples, such as, for example, conventional core chips/plugs, side wall cores, and ditch cuttings, have been taken to measure, e.g., through laboratory analysis, the elemental data and/or TOC for the sample/interval. Thus, as seen in FIG. 2, a well may have “unsampled well intervals” or large segments (in terms of depth) where no laboratory-based measurement of elemental data and/or TOC exists. As discussed in greater detail below, the depths at which one or more cores are extracted to measure elemental data and TOC need not be the same nor align with a depth corresponding to values of the logging data (215). As such, a depth alignment between one or more of logging data, elemental data, and TOC may need to be performed to associate these quantities at a common depth (212).
As discussed below, and in accordance with one or more embodiments, the logging data (215) is processed using a series of machine learning models to produce elemental data (240) and TOC data (245). As seen in FIG. 2, the elemental data (240) and TOC data (245) have the same spatial sampling as the logging data (215). That is, given logging data (215), elemental data (240) and TOC data (245) are determined. Notably, laboratory measurements of elemental data and TOC need not be present. Thus, TOC data can be determined for an unsampled well interval, including an unsampled well interval that spans the entire depth of the well.
In one or more embodiments, the determination of TOC data (310) from logging data (302) using a series of machine learning models is effectuated as a machine learning (ML)-based TOC determination system (300). The ML-based TOC determination system (300) is depicted in FIG. 3. In one or more embodiments, the ML-based TOC determination system (300) includes, or is, a computer system as described with respect to FIG. 9. As seen in FIG. 3, the ML-based TOC determination system (300) includes two machine learning models, namely, a first machine learning model (304) and a second machine learning model (308) connected in series. The first machine learning model (304) receives, as input, logging data (302). The logging data (302) can be as that depicted in FIG. 2, for example, containing logs such as a gamma ray log (202), a directional photoelectric log (direction right shown) (204), effective porosity log (206), and bulk density log (208), sonic logs, etc., where each log is a record of log values (210) at an associated well depth (212). The logging data (302) can be acquired using one or more of a MWD, LWD, and wireline tool.
The first machine learning model (304) receives, as input, the logging data (302) and returns, as output, elemental data (306), where the elemental data includes a prediction of quantity, concentration, and/or presence of one or more elements at each depth associated with the logging data (302). Then, the second machine learning model (308) receives, as input, the logging data (302) and the elemental data (306) and returns, as output, TOC data (310), where there is a determined TOC value for each depth associated with the logging data (302). Thus, the ML-based TOC determination system (300) obtains logging data (302) including one or more log values at each depth in a set of depths and determines TOC data including TOC for each depth in the set of depths.
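The series connection of the two models can be sketched as follows. This is a minimal illustration of the data flow only, not the disclosed models: the names `predict_elemental_and_toc`, `toy_first_model`, and `toy_second_model` are hypothetical, and the toy models are arbitrary fixed formulas standing in for trained machine learning models.

```python
from typing import Callable, List, Sequence, Tuple


def predict_elemental_and_toc(
    logging_rows: Sequence[Sequence[float]],
    first_model: Callable[[Sequence[float]], Sequence[float]],
    second_model: Callable[[Sequence[float]], float],
) -> Tuple[List[List[float]], List[float]]:
    """Run two models in series, one prediction per depth.

    logging_rows holds the log values at each depth. The first model maps
    log values to elemental data; the second model maps the log values
    concatenated with the predicted elemental data to a TOC value.
    """
    elemental_data, toc_data = [], []
    for row in logging_rows:
        elemental = list(first_model(row))
        elemental_data.append(elemental)
        toc_data.append(second_model(list(row) + elemental))
    return elemental_data, toc_data


# Toy stand-ins for trained models (arbitrary formulas, assumptions only).
def toy_first_model(row):
    gamma, density = row
    return [0.01 * gamma, 0.5 * density]


def toy_second_model(features):
    gamma, density, element_a, element_b = features
    return 0.02 * gamma + element_a + element_b - density


# Two depths, each with a gamma ray value and a bulk density value.
rows = [(100.0, 2.5), (50.0, 2.6)]
elemental_data, toc_data = predict_elemental_and_toc(
    rows, toy_first_model, toy_second_model
)
```

Because both outputs are produced row by row, the elemental data and the TOC data have exactly one entry per depth in the logging data, mirroring the sampling relationship described above.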
In one or more embodiments, the ML-based TOC determination system (300) is in electrical communication with, or otherwise has access to, a database of modelling data (320). As discussed below with reference to FIG. 4, the modelling data (320) can be used to train the first and second machine learning models (304, 308). The modelling data (320) includes modelling logging data (322), modelling elemental data (324), and modelling TOC data (326). The modelling data (320) is collected from wells within the same geological setting as the well associated with the logging data (302) processed by the ML-based TOC determination system (300). For example, an oil and gas field may include a plurality of wells (e.g., 10 wells) with certain wells being in relative proximity to one another (e.g., within a radius of 200-300 meters). Wells immediately proximate to another well (e.g., those within the specified radius) may be considered within the same geological setting. Proximate wells may also be known as offset wells. The modelling data (320) can further include logging data, observed or laboratory-measured elemental data, and TOC measurements for sampled portions of the well considered by the ML-based TOC determination system (300), if available.
In accordance with one or more embodiments, the ML-based TOC determination system (300) issues a command (e.g., Command Y (350)) to an external system based on the determined TOC data (310). The external system may be, for example, a wellbore planning system, a geosteering system, a production forecast system, or the like. The command can execute or instruct the external system to perform a task such as altering the trajectory of a wellbore during drilling or the placement of one or more valves in a completed well.
In accordance with one or more embodiments, the general process for developing and using the first and second machine learning models (304, 308) to determine elemental data (306) and TOC data (310) using logging data (302) is provided in the flowchart of FIG. 4. As shown in Block 402, the process starts by collecting modelling logging data (322), modelling elemental data (324), and modelling TOC data (326) from at least one well. The modelling logging data (322), modelling elemental data (324), and modelling TOC data (326) are referred to collectively as modelling data (320).
The modelling data (320) are pre-processed as shown in Block 404. Pre-processing, at a minimum, comprises altering the data so that it is suitable for use with machine-learned models. For example, pre-processing may include numericalizing categorical data or removing data entries with missing values. Other typical pre-processing methods are normalization and imputation. Normalization is the process of transforming data, e.g., modelling logging data, with an intention to aid the machine learning models. An example normalization process is to determine the mean (μ) and standard deviation (σ) of the log values (210) for each log in the modelling logging data (322). That is, if a gamma ray log (202) and a directional photoelectric log (204) are used throughout the wells, these logs are appended across wells and the mean (μ) and standard deviation (σ) of both the gamma ray log (202) and the directional photoelectric log (204) are calculated. Using the mean (μ) and standard deviation (σ) of each log, the mean (μ) is subtracted from every log value (210) in said log and the result is divided by the standard deviation (σ) of said log as shown in the following equation:
$$NV_{i,\text{log}} = \frac{\text{value}_{i,\text{log}} - \mu_{\text{log}}}{\sigma_{\text{log}}} \tag{1}$$
In EQ. 1, $NV_{i,\text{log}}$ represents the normalized value of a single value, indexed by $i$, for a specific log, log. For example, log could be the gamma ray log (202) such that $\mu_{\text{log}}$ and $\sigma_{\text{log}}$ represent the mean (μ) and standard deviation (σ) of all the log values (210) in the gamma ray log (202), respectively, over all the wells for which modelling data was collected. $\text{value}_{i,\text{log}}$ is a single value from the log values (210) of the selected log; for example, the value of the gamma ray log (202) at a certain depth for a single well. The index $i$ is unique to each well depth (212) and each well, in the case of multiple wells. As such, EQ. 1 transforms $\text{value}_{i,\text{log}}$ to a normalized value, $NV_{i,\text{log}}$. One with ordinary skill in the art will appreciate that there are many normalization processes available, and the inclusion of a single example, namely, that shown in EQ. 1, does not limit the scope of this disclosure.
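The normalization of EQ. 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `fit_log_statistics` and `normalize_log` and the sample values are hypothetical, and the population standard deviation is used because the disclosure does not specify whether a population or sample standard deviation is intended.

```python
from statistics import fmean, pstdev


def fit_log_statistics(logs_per_well):
    """Append one log across wells and return its mean and standard deviation.

    logs_per_well: list of lists, one list of log values per well.
    Assumption: population standard deviation (pstdev) is used.
    """
    appended = [value for well_log in logs_per_well for value in well_log]
    return fmean(appended), pstdev(appended)


def normalize_log(values, mu, sigma):
    """Apply EQ. 1: subtract the mean, then divide by the standard deviation."""
    if sigma == 0:
        raise ValueError("standard deviation is zero; log values are constant")
    return [(value - mu) / sigma for value in values]


# Hypothetical values of a single log collected from two wells.
well_a = [2.0, 4.0, 4.0, 4.0]
well_b = [5.0, 5.0, 7.0, 9.0]
mu, sigma = fit_log_statistics([well_a, well_b])
normalized_a = normalize_log(well_a, mu, sigma)
```

The statistics are computed once over the appended log and then applied to each well, so values from different wells are placed on a common scale.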
Imputation is the process of replacing missing values, corrupted values, or outlier values in a set of data with a substitute value so that the data may be used in a machine-learned model. One imputation strategy may be to replace values with the nearest acceptable value in the data set. Here, “nearest” is taken with respect to well depth (212) with an additional note that acceptable substitute values are limited to the well for which the value is being replaced, in the case of multiple wells. As a concrete example, consider gamma ray logs (202) collected from two wells. A portion of the gamma ray log (202) of the first well may look like {(depth: 7020 ft, gamma: 70 gAPI), (depth: 7030 ft, gamma: 72 gAPI), (depth: 7050 ft, gamma: 74 gAPI)} and a portion of the gamma ray log (202) from the second well may look like {(depth: 7020 ft, gamma: 52 gAPI), (depth: 7030 ft, gamma: NAN gAPI), (depth: 7050 ft, gamma: 45 gAPI)}, where “NAN” stands for “not a number” and indicates a missing or corrupted value. Using the nearest value imputation strategy described above, the missing gamma ray log (202) value found in the second well would be replaced by the value of 52 gAPI. This is because this substitute value is at the nearest well depth (212) to the missing value within the same well.
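The nearest-value imputation strategy described above can be sketched as follows. This is a minimal illustration, assuming missing values are encoded as floating-point NaN and that each call receives the log of a single well; the function name `impute_nearest` is hypothetical. When two acceptable values are equally near, this sketch picks the shallower one, which is a choice the disclosure does not specify.

```python
import math


def impute_nearest(depths, values):
    """Replace NaN entries with the value at the nearest depth in the same well.

    depths and values are parallel lists for one log of one well.
    """
    valid = [(d, v) for d, v in zip(depths, values) if not math.isnan(v)]
    if not valid:
        raise ValueError("log has no acceptable values to impute from")
    imputed = []
    for depth, value in zip(depths, values):
        if math.isnan(value):
            # min() returns the first minimum, so ties go to the shallower depth.
            _, value = min(valid, key=lambda pair: abs(pair[0] - depth))
        imputed.append(value)
    return imputed


# Gamma ray log of the second well from the example above (depths in ft).
depths = [7020.0, 7030.0, 7050.0]
gamma = [52.0, float("nan"), 45.0]
gamma_imputed = impute_nearest(depths, gamma)
```

Here the missing value at 7030 ft is 10 ft from the value at 7020 ft and 20 ft from the value at 7050 ft, so it is replaced with 52 gAPI, in agreement with the example above.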
Note that the modelling elemental data (324) and the modelling TOC data (326) may undergo a normalization and imputation process. The normalization and imputation methods applied to the modelling elemental data (324) and the modelling TOC data (326) may be independent of, or different from, those applied to the modelling logging data (322). Likewise, different and independent normalization and imputation processes may be applied to the individual logs of the modelling logging data (322).
Information surrounding the pre-processing steps is saved for potential later use. For example, if the normalization is performed according to EQ. 1 for each log in the modelling logging data (322), then the mean (μ) and standard deviation (σ) of each log are saved or stored, likely on a computer-readable medium. This allows future logs to be pre-processed identically.
One with ordinary skill in the art will recognize that a myriad of pre-processing methods beyond numericalization, removal of modeling data entries with missing values, normalization, and imputation exist. Descriptions of a select few pre-processing methods herein do not impose a limitation on the pre-processing steps encompassed by this disclosure.
In accordance with one or more embodiments, pre-processing includes aligning instances of modelling logging data (322), modelling elemental data (324), and modelling TOC data (326) according to depth. As noted, for a given well included in the modelling data (320), the logging data (215), observed or laboratory-measured elemental data, and measured TOC values need not occur at the same depth(s). Returning to FIG. 2, FIG. 2 depicts a first interval of depths (250) where the observed elemental data, TOC measurement, and logging data occur, or are measured, at the same depth value. FIG. 2 also depicts a second interval of depths (260) and a third interval of depths (270) where at least one of the observed elemental data, measured TOC, and log values does not occur at the same depth as the others. In one or more embodiments, a depth alignment procedure is applied to assign a common depth to a set of logging values, observed elemental data, and measured TOC. In one or more embodiments, the depth alignment procedure consists of determining a difference between the depths associated with the logging values, observed elemental data, and measured TOC, comparing the difference to a predefined threshold, and assigning a common depth if the difference is less than the predefined threshold. For example, the second interval of depths (260) may have a difference of depths less than the predefined threshold such that the logging values, observed elemental data, and measured TOC within this depth interval can be considered to occur at the same depth and thus be associated as a data instance in the modelling data (320). In one or more embodiments, the depth alignment procedure generates one or more of interpolated elemental data and interpolated TOC data using interpolation at one or more depths with logging values. Such a procedure may be applicable to, for example, the third interval of depths (270) where there is poor alignment between the observed elemental data and measured TOC (e.g., having a difference of depths greater than the predefined threshold).
In other words, one or more of logging data (215), observed elemental data, and measured TOC values can be resampled for depth alignment forming data instances in the modelling data (320). For example, two data points in the modelling logging data (322) may have depths of 5200 ft and 5200.50 ft while the closest depths of data points in the modelling elemental data (324) are 5198.25 ft and 5201.20 ft. In this case, the modelling elemental data (324) is resampled at the rate of the modelling logging data (322) to obtain approximate corresponding values required for the training process.
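A minimal sketch of such a depth alignment is given below. It combines, for illustration only, the threshold-based association and the interpolation-based resampling described above into a single routine. The function names, the 0.25 ft threshold, the choice of linear interpolation, and the elemental values are hypothetical. The depths are those of the example above.

```python
def interp_at(depth, src_depths, src_values):
    """Linearly interpolate src_values at depth; src_depths must be strictly ascending."""
    pairs = list(zip(src_depths, src_values))
    for (d0, v0), (d1, v1) in zip(pairs, pairs[1:]):
        if d0 <= depth <= d1:
            weight = (depth - d0) / (d1 - d0)
            return v0 + weight * (v1 - v0)
    return None  # depth lies outside the sampled range; no extrapolation here

def align_to_logging_depths(log_depths, src_depths, src_values, threshold):
    """Give every logging depth a value: reuse a sample closer than threshold, else interpolate."""
    aligned = []
    for depth in log_depths:
        gap, value = min((abs(sd - depth), sv) for sd, sv in zip(src_depths, src_values))
        if gap >= threshold:
            value = interp_at(depth, src_depths, src_values)
        aligned.append(value)
    return aligned

log_depths = [5200.0, 5200.5]        # modelling logging data depths (ft)
elem_depths = [5198.25, 5201.20]     # modelling elemental data depths (ft)
elem_values = [0.40, 0.46]           # hypothetical elemental values
aligned = align_to_logging_depths(log_depths, elem_depths, elem_values, threshold=0.25)
```

Both logging depths in this example are farther than the threshold from any elemental sample, so both aligned values are obtained by interpolation.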
Keeping with Block 404 of FIG. 4, contaminated TOC values are identified and the associated data instances in the modelling data (320) are removed. It is known that some TOC measurements (through laboratory analysis) are affected by drilling fluids and additives, irrespective of whether the drilling fluid is water- or oil-based. Affected samples return erroneous TOC measurements and are thus considered “contaminated.” To improve the accuracy of the first and second machine-learning models (304, 308), contaminated TOC measurements should not be used in training. Thus, embodiments of this disclosure further relate to a method for identifying contaminated TOC measurements and removing the associated data instances from the modelling data (320).
The modelling TOC data is quality checked by filtering the TOC data to remove values from contaminated samples by applying an outlier detection method or correlation analysis in view of the depth-aligned modelling elemental data (324). That is, correlations and/or patterns are identified between the modelling elemental data (324) and modelling TOC data (326) once depth aligned and grouped into data instances. Thus, erroneous, or contaminated, TOC values are identified as outliers or pattern disruptors.
In one or more embodiments, contaminated TOC samples are identified by comparison of the associated elemental data to one or more predefined elemental thresholds. According to one example, TOC values for which the associated Molybdenum (Mo), Nickel (Ni), Strontium (Sr), Zinc (Zn), Tantalum (Ta), Uranium (U), Vanadium (V), Sulphur (S), and Zirconium (Zr) values exceed a first elemental threshold value, as well as TOC values for which the associated Manganese (Mn), Aluminum (Al), and Titanium (Ti) values fall below a second elemental threshold value, can be maintained in the dataset. Values falling outside of these elemental thresholds can then be removed. The first and second elemental threshold values may be determined as a function of acquired data related to TOC information using, for example, an outlier detection method.
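The threshold-based filter may be sketched as follows. The text does not state whether the individual elemental conditions are combined with a logical AND or OR; AND is assumed here. The function name, the threshold values, and the data instances are hypothetical, and only Mo and Mn are shown for brevity.

```python
def keep_instance(elements, first_thresholds, second_thresholds):
    """Return True if every thresholded element satisfies its bound.

    first_thresholds: elements whose value must exceed the bound (e.g. Mo).
    second_thresholds: elements whose value must fall below the bound (e.g. Mn).
    Combining the conditions with AND is an assumption of this sketch.
    """
    high_ok = all(elements[e] > t for e, t in first_thresholds.items())
    low_ok = all(elements[e] < t for e, t in second_thresholds.items())
    return high_ok and low_ok

# Hypothetical depth-aligned data instances (units arbitrary).
instances = [
    {"depth": 7020.0, "toc": 4.1, "elements": {"Mo": 35.0, "Mn": 0.02}},
    {"depth": 7030.0, "toc": 5.0, "elements": {"Mo": 2.0, "Mn": 0.30}},  # breaks the pattern
    {"depth": 7050.0, "toc": 3.6, "elements": {"Mo": 28.0, "Mn": 0.03}},
]
first_thresholds = {"Mo": 10.0}
second_thresholds = {"Mn": 0.10}

clean = [inst for inst in instances
         if keep_instance(inst["elements"], first_thresholds, second_thresholds)]
```

The instance at 7030 ft is removed because its elemental data do not support the reported TOC value.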
FIG. 5 shows a schematic diagram of the effect of quality checking of TOC values in view of associated elemental data. From FIG. 5, it can be seen that some TOC values have been removed based on the values of the associated elemental data. In other words, FIG. 5 depicts an example where elemental data are used to quality-check the actual TOC measurements to ensure that contaminated samples (with respect to the laboratory-measured TOC values) are removed by utilizing the elements that are sensitive to the organic presence or anoxic depositional environment of the source rocks. This is done to increase the confidence of having accurate modelling TOC data (326) for use in training one or more of the first and second machine learning models (304, 308). This also helps in avoiding unreliable false flags in the pyrolysis data in terms of quality. FIG. 5 depicts an example of actual TOC data plotted along with associated elemental data. An example of a correlation observable in FIG. 5 is that MnO has a negative correlation with TOC. Thus, low values of MnO correspond to high values of TOC. As such, an elemental threshold can be defined such that only TOC values with sufficient negative correlation to MnO are accepted as valid while those not honoring the pattern are removed as invalid (i.e., contaminated).
Returning to FIG. 4, in Block 406 the modeling data is split into training, validation, and test datasets. In some embodiments, the validation and test dataset may be the same such that the data is effectively only split into two distinct datasets. Block 406 may be performed before Block 404. In this case, it is common to define the pre-processing parameters, such as the mean (μ) and standard deviation (σ), with the training set and then apply these parameters to the validation and test datasets.
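As a sketch of the case where the split precedes pre-processing, the snippet below derives the normalization parameters from the training dataset only and then applies them to the validation and test datasets. The split fractions, the random shuffling of individual data instances, the function name, and the log values are assumptions made for illustration.

```python
import random
from statistics import fmean, pstdev

def split(instances, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle and split into training, validation, and test lists."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

values = [float(v) for v in range(40, 140)]  # hypothetical log values
train, val, test = split(values)

# Pre-processing parameters are defined with the training set only ...
mu, sigma = fmean(train), pstdev(train)

# ... and then applied unchanged to the other datasets.
val_norm = [(v - mu) / sigma for v in val]
test_norm = [(v - mu) / sigma for v in test]
```

Defining the parameters on the training dataset alone keeps information from the validation and test datasets out of the training process.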
In accordance with one or more embodiments, and as depicted in Block 408, a machine-learned model type and architecture are selected for the first and second machine learning models (304, 308). Machine learning, broadly defined, is the extraction of patterns and insights from data. The phrases “artificial intelligence,” “machine learning,” “deep learning,” and “pattern recognition” are often conflated, interchanged, and used synonymously throughout the literature. This ambiguity arises because the field of “extracting patterns and insights from data” was developed simultaneously and disjointedly among a number of classical arts like mathematics, statistics, and computer science. For consistency, the term machine learning, or machine-learned, will be adopted herein; however, one skilled in the art will recognize that the concepts and methods detailed hereafter are not limited by this choice of nomenclature.
Machine-learned model types may include, but are not limited to, neural networks, random forests, generalized linear models, and Bayesian regression. Machine-learned model types are usually associated with additional “hyperparameters” which further describe the model. For example, hyperparameters providing further detail about a neural network may include, but are not limited to, the number of layers in the neural network, choice of activation functions, inclusion of batch normalization layers, and regularization strength. Commonly, in the literature, the selection of hyperparameters surrounding a model is referred to as selecting the model “architecture”. In short, Block 408 references selecting a machine-learned model type and a set of governing hyperparameters for the first and second machine learning models (304, 308).
Once a machine-learned model type and hyperparameters have been selected, the first and second machine learning models (304, 308) are trained using the training dataset of the modeling data according to Block 410. Common training techniques, such as early stopping, adaptive or scheduled learning rates, and cross-validation may be used during training without departing from the scope of this disclosure.
In one or more embodiments, the first and second machine learning models (304, 308) are trained independently. That is, the first machine learning model (304) is trained to predict elemental data given modelling logging data (322) from the training dataset and the training is informed or guided by a comparison of the predicted elemental data and the modelling elemental data (324). Similarly, the second machine learning model (308) is trained to predict TOC data given modelling logging data (322) and modelling elemental data (324) from the training dataset and the training is informed or guided by a comparison of the predicted TOC data and the modelling TOC data (326).
In one or more embodiments, the first and second machine learning models (304, 308) are trained jointly. In joint training, the first machine learning model (304) receives, as input, the modelling logging data (322) of the training dataset and returns, as output, predicted elemental data, and the second machine learning model (308) receives, as input, the modelling logging data (322) of the training dataset and the predicted elemental data and returns, as output, predicted TOC data. The joint training is guided by a first comparison of the modelling elemental data (324) of the training dataset and the predicted elemental data and a second comparison of the modelling TOC data (326) of the training dataset and the predicted TOC data. That is, in joint training, the first and second comparisons can be used to guide the training of the first and second machine learning models (304, 308). As an example, the second comparison, which makes use of the output of the second machine learning model (308), can be used to inform or guide the training of the first machine learning model (304).
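One possible way to express the joint training objective, offered only as a sketch, is a weighted sum of the first and second comparisons. The symbols $g_\theta$ (first model), $h_\phi$ (second model), the weight $\lambda$, and the choice of a squared-error comparison are assumptions introduced for illustration and are not specified in this disclosure:

$$\mathcal{L}(\theta,\phi) = \frac{1}{N}\sum_{n=1}^{N}\left\lVert g_\theta(x_n) - e_n \right\rVert^2 + \lambda\,\frac{1}{N}\sum_{n=1}^{N}\left( h_\phi\big(x_n,\, g_\theta(x_n)\big) - t_n \right)^2$$

Here $x_n$ denotes the modelling logging data, $e_n$ the modelling elemental data, and $t_n$ the modelling TOC data of the $n$-th data instance in the training dataset. Because the second term depends on $\theta$ through $g_\theta(x_n)$, the second comparison also influences the parameters of the first model.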
During training, or once trained, the performance of the trained machine learning models (304, 308) is evaluated using the validation dataset as depicted in Block 412. Recall that in some instances, the validation set and test set are the same. In one or more embodiments, the validation dataset is formed from interpolated elemental data and interpolated TOC data. That is, in one or more embodiments, observed or laboratory-measured elemental data and actual TOC measurements (after filtering or identification and removal of contaminated samples) are reserved for the training dataset and a validation dataset is formed, as needed, by interpolating elemental data and TOC data to obtain data instances at depths where logging data (and/or modelling logging data) exists without observed elemental data and measured TOC.
Generally, performance is measured using a function which compares the predictions of the trained machine learning models (304, 308) to the values on record. A commonly used comparison function is the mean-squared-error function, which quantifies the difference between the predicted value and the actual value, however, one with ordinary skill in the art will appreciate that many more comparison functions exist and may be used without limiting the scope of the present disclosure.
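For illustration, the mean-squared-error comparison function may be written as follows. The function name and the TOC values are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and values on record."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

predicted_toc = [2.0, 3.5, 4.0]  # hypothetical model output (wt.%)
actual_toc = [2.5, 3.5, 3.0]     # hypothetical values on record (wt.%)
mse = mean_squared_error(predicted_toc, actual_toc)
```

A lower value indicates closer agreement between the predictions and the values on record.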
Block 414 represents a decision: if the performance of the trained machine learning models, as measured by a comparison function, is not suitable, the machine-learned model type and architecture of one or more of the machine learning models are altered, as shown in Block 408, and the training process is repeated. There are many ways to alter the machine-learned model type and architecture in search of suitable performance of the trained machine learning models. These include, but are not limited to: selecting a new model type from a previously defined set of model types; randomly perturbing or randomly selecting new hyperparameters; using a grid search over the available hyperparameters; and intelligently altering the model type or hyperparameters based on the observed performance of previous models (e.g., a Bayesian hyperparameter search). Once suitable performance is achieved, the training procedure is complete and the generalization error of the trained machine learning models (304, 308) is estimated according to Block 416.
Generalization error is an indication of the trained machine learning models' performance on new, or unseen, data. Typically, the generalization error is estimated using the comparison function, as previously described, to compare the predicted TOC data to the actual TOC values using the test dataset.
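The loop between Blocks 408 and 414 can be sketched as a grid search. To keep the sketch self-contained, a one-parameter ridge regression stands in for the machine learning models, and its regularization strength is the only hyperparameter searched. The model, the data, and the grid are hypothetical.

```python
from itertools import product

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form slope of y = w * x with L2 regularization strength alpha."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def validation_mse(w, xs, ys):
    """Mean-squared error of the fitted slope on the validation dataset."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

train_x, train_y = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
val_x, val_y = [4.0, 5.0], [8.1, 9.8]

grid = {"alpha": [0.0, 0.1, 1.0, 10.0]}  # hypothetical hyperparameter grid
results = []
for (alpha,) in product(*grid.values()):
    w = fit_ridge_1d(train_x, train_y, alpha)                  # train (Block 410)
    error = validation_mse(w, val_x, val_y)                    # evaluate (Block 412)
    results.append((error, {"alpha": alpha}))

# Keep the hyperparameters with the lowest validation error (Block 414).
best_error, best_hparams = min(results, key=lambda r: r[0])
```

In practice the search would stop, or the grid would be refined, depending on whether the best validation error is deemed suitable.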
As depicted in Block 418, the trained machine learning models (304, 308) are used “in production,” which means that the trained machine learning models (304, 308) are used, as in the ML-based TOC determination system (300) of FIG. 3, to determine TOC data (310) from logging data (302). It is emphasized that the logging data (302) used to determine TOC data (310) in the production setting, as well as for the validation and test datasets, are pre-processed identically to the manner defined in Block 404 as denoted by the connection (422), represented as a dashed line in FIG. 4, between Blocks 418 and 404.
As shown in Block 420, the performance of the trained machine learning models (304, 308) is continuously monitored in the production setting. Performance monitoring includes statistical comparisons of the logging data (302) of new wells to the modelling logging data (322) of the training dataset to identify data drift. If model performance is suspected to be degrading, as observed through data drift or newly acquired performance metrics, the model may be updated. An update may include retraining one or more of the first and second machine learning models (304, 308), by reverting to Block 408, with the newly acquired modelling data (320) appended to the training dataset. An update may also include returning to Block 404 to recalculate any pre-processing parameters, again, after appending the newly acquired modeling data to the existing modeling data.
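One possible statistical comparison for identifying data drift is the two-sample Kolmogorov-Smirnov statistic, sketched below. This particular statistic is not named in this disclosure and is used here only as an example. The function name, the drift threshold, and the gamma ray values are hypothetical.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the largest gap is attained at one of the sample points
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

training_gamma = [50.0, 55.0, 60.0, 65.0, 70.0]   # from the training dataset
new_well_gamma = [150.0, 155.0, 160.0, 165.0, 170.0]  # from a new well

DRIFT_THRESHOLD = 0.3  # hypothetical alert level
drift = ks_statistic(training_gamma, new_well_gamma)
drift_suspected = drift > DRIFT_THRESHOLD
```

A statistic near zero indicates that the new logging data resemble the training data, while a value near one indicates that the two distributions barely overlap.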
While the various blocks in FIG. 4 are presented and described sequentially, one of ordinary skill in the art will appreciate that some or all of the blocks may be executed in different orders, may be combined or omitted, and some or all of the blocks may be executed in parallel. Furthermore, the blocks may be performed actively or passively.
In some embodiments, the selected machine-learned model type for one or more of the first and second machine learning models (304, 308) is a neural network.
A diagram of a neural network (NN) (600) is shown in FIG. 6. At a high level, an NN (600) may be graphically depicted as being composed of nodes (602), where here any circle represents a node, and edges (604), shown here as directed lines. The nodes (602) may be grouped to form layers (605). FIG. 6 displays four layers (608, 610, 612, 614) of nodes (602) where the nodes (602) are grouped into columns; however, the grouping need not be as shown in FIG. 6. The edges (604) connect the nodes (602). Edges (604) may connect, or not connect, to any node(s) (602) regardless of which layer (605) the node(s) (602) is in. That is, the nodes (602) may be sparsely and residually connected. A neural network (600) will have at least two layers (605), where the first layer (608) is considered the “input layer” and the last layer (614) is the “output layer.” Any intermediate layer (610, 612) is usually described as a “hidden layer.” A neural network (600) may have zero or more hidden layers (610, 612) and a neural network (600) with at least one hidden layer (610, 612) may be described as a “deep” neural network or a “deep learning method.” In general, a neural network (600) may have more than one node (602) in the output layer (614). In this case the neural network (600) may be referred to as a “multi-target” or “multi-output” network.
Nodes (602) and edges (604) carry additional associations. Namely, every edge (604) is associated with a numerical value. The edge numerical values, or even the edges (604) themselves, are often referred to as “weights” or “parameters.” While training a neural network (600), numerical values are assigned to each edge (604). Additionally, every node (602) is associated with a numerical variable and an activation function. Activation functions are not limited to any functional class, but traditionally follow the form
$$A = f\left(\sum_{i \in (\text{incoming})} \left[(\text{node value})_i \, (\text{edge value})_i\right]\right), \tag{2}$$
where i is an index that spans the set of “incoming” nodes (602) and edges (604) and ƒ is a user-defined function. Incoming nodes (602) are those that, when viewed as a graph (as in FIG. 6), have directed arrows that point to the node (602) where the numerical value is being computed. Some functions for ƒ may include the linear function ƒ(x)=x, sigmoid function
$$f(x) = \frac{1}{1 + e^{-x}},$$
and rectified linear unit (ReLU) function ƒ(x)=max(0,x); however, many additional functions are commonly employed. Every node (602) in a neural network (600) may have a different associated activation function. Often, as a shorthand, activation functions are described by the function ƒ of which they are composed. That is, an activation function composed of a linear function ƒ may simply be referred to as a linear activation function without undue ambiguity.
When the neural network (600) receives a network input, the network input is propagated through the network according to the activation functions and incoming node (602) values and edge (604) values to compute a value for each node (602) according to EQ. 2. That is, the numerical value for each node (602) may change for each received input. Occasionally, nodes (602) are assigned fixed numerical values, such as the value of 1, that are not affected by the input or altered according to edge (604) values and activation functions. Fixed nodes (602) are often referred to as “biases” or “bias nodes” (606), displayed in FIG. 6 with a dashed circle.
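The propagation of a network input according to EQ. 2 may be illustrated with the small sketch below, in which a bias node of fixed value 1 is appended to the incoming node values of each layer. The network size, the edge values, and the function names are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(incoming, edge_values, activation):
    """Apply EQ. 2 to each node of a layer.

    incoming: values of the incoming nodes, ending with the bias node value 1.0.
    edge_values: one list of edge values per node in the layer.
    """
    return [activation(sum(n * e for n, e in zip(incoming, node_edges)))
            for node_edges in edge_values]

# Hypothetical network: two inputs, two ReLU hidden nodes, one linear output node.
x = [0.5, 1.0]
hidden_edges = [[1.0, 0.5, 0.0], [-1.0, 2.0, 0.5]]  # last entry multiplies the bias node
output_edges = [[1.0, -1.0, 0.25]]

hidden = layer_forward(x + [1.0], hidden_edges, relu)
output = layer_forward(hidden + [1.0], output_edges, lambda z: z)
```

Changing the input x changes the value computed at every non-bias node, while the bias node keeps its fixed value.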
In some implementations, the neural network (600) may contain specialized layers (605), such as a normalization layer, a regularization layer (e.g. dropout layer), and a concatenation layer. One skilled in the art will appreciate that these alterations do not exceed the scope of this disclosure.
As noted, the training procedure for the neural network (600) comprises assigning values to the edges (604). To begin training, the edges (604) are assigned initial values. These values may be assigned randomly, assigned according to a prescribed distribution, assigned manually, or by some other assignment mechanism. Once edge (604) values have been initialized, the neural network (600) may act as a function, such that it may receive inputs and produce an output. As such, at least one input is propagated through the neural network (600) to produce an output. Generally, a dataset, known as a training dataset, is provided to the neural network (600) in order for the network to learn edge (604) values (i.e., learn the network parameters). The training dataset is composed of inputs and associated target(s), where the target(s) represent the “ground truth”, or the otherwise desired output. The neural network (600) output is compared to the associated input data target(s). The comparison of the neural network (600) output to the target(s) is typically performed by a so-called “loss function”; although other names for this comparison function such as “error function,” “misfit function,” and “cost function” are commonly employed. Many types of loss functions are available, such as the mean-squared-error function, however, the general characteristic of a loss function is that the loss function provides a numerical evaluation of the similarity between the neural network (600) output and the associated target(s). The loss function may also be constructed to impose additional constraints on the values assumed by the edges (604), for example, by adding a penalty term, which may be physics-based, or a regularization term. Generally, the goal of a training procedure is to alter the edge (604) values to promote similarity between the neural network (600) output and associated target(s) over the training dataset. 
Thus, the loss function is used to guide changes made to the edge (604) values, typically through a process called “backpropagation.”
While a full review of the backpropagation process exceeds the scope of this disclosure, a brief summary is provided. Backpropagation consists of computing the gradient of the loss function with respect to the edge (604) values. The gradient indicates the direction of change in the edge (604) values that results in the greatest increase of the loss function. Because the gradient is local to the current edge (604) values, the edge (604) values are typically updated by a “step” in the direction opposite to the gradient, so as to reduce the loss. The step size is often referred to as the “learning rate” and need not remain fixed during the training process. Additionally, the step size and direction may be informed by previously seen edge (604) values or previously computed gradients. Such methods for determining the step direction are usually referred to as “momentum” based methods.
Once the edge (604) values have been updated, or altered from their initial values, through a backpropagation step, the neural network (600) will likely produce different outputs. Thus, the procedure of propagating at least one input through the neural network (600), comparing the neural network (600) output with the associated target(s) with a loss function, computing the gradient of the loss function with respect to the edge (604) values, and updating the edge (604) values with a step guided by the gradient, is repeated until a termination criterion is reached. Common termination criteria are: reaching a fixed number of edge (604) updates, otherwise known as an iteration counter; a diminishing learning rate; noting no appreciable change in the loss function between iterations; and reaching a specified performance metric as evaluated on the data or a separate hold-out data set. Once the termination criterion is satisfied, and the edge (604) values are no longer intended to be altered, the neural network (600) is said to be “trained.”
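The iterative procedure above can be sketched for the simplest possible case, a model with a single edge value w and a mean-squared-error loss whose gradient is written out by hand rather than obtained through backpropagation. The learning rate, momentum coefficient, tolerance, and data are hypothetical.

```python
def loss_and_grad(w, xs, ys):
    """Mean-squared-error loss of y = w * x and its derivative with respect to w."""
    n = len(xs)
    residuals = [w * x - y for x, y in zip(xs, ys)]
    loss = sum(r * r for r in residuals) / n
    grad = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
    return loss, grad

def train(xs, ys, w=0.0, learning_rate=0.05, momentum=0.5, max_iters=1000, tol=1e-12):
    """Gradient descent with momentum and two termination criteria."""
    velocity = 0.0
    prev_loss, _ = loss_and_grad(w, xs, ys)
    for iteration in range(1, max_iters + 1):       # criterion 1: iteration counter
        _, grad = loss_and_grad(w, xs, ys)
        velocity = momentum * velocity - learning_rate * grad  # step against the gradient
        w += velocity
        loss, _ = loss_and_grad(w, xs, ys)
        if abs(prev_loss - loss) < tol:             # criterion 2: no appreciable change
            break
        prev_loss = loss
    return w, loss, iteration

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # targets follow y = 2 * x
w, final_loss, n_iters = train(xs, ys)
```

For this data the edge value approaches 2, the slope that generated the targets, well before the iteration limit is reached.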
FIG. 7 depicts a method in accordance with one or more embodiments. One or more steps of the method of FIG. 7 can be performed using the ML-based TOC determination system (300) as previously described. In Step 702, a logging tool is deployed in a well. As an example, the logging tool can be a wireline tool. As another example, the logging tool can be an LWD tool. The logging tool includes one or more sensors or measuring devices to measure one or more characteristics of a surrounding subsurface formation. The measured or sensed characteristics are recorded according to a depth of the logging tool as logs. In Step 704, the logging data is obtained from the well using the logging tool. The logging data includes data values at various depths included in a set of depths.
In Step 706, the first machine learning model (304) is used to determine elemental data (306) for the well at each depth in the set of depths based on the logging data (302). That is, the first machine learning model (304) receives, as input, the logging data (302) and returns, as output, elemental data (306) where the elemental data (306) has the same spatial sampling as the logging data (302). In Step 708, the second machine learning model (308) is used to determine the TOC at each depth in the set of depths based on the logging data and the elemental data. That is, the second machine learning model (308) receives, as inputs, the logging data and the elemental data previously returned by the first machine learning model (304) and returns, as output, the TOC data.
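Steps 706 and 708 can be sketched as two models applied in series at every depth. The two callables below are simple stand-ins so that the sketch runs; trained models would take their place. All names, coefficients, and log values are hypothetical.

```python
def predict_toc(logging_rows, elemental_model, toc_model):
    """Run the first and second models in series at each depth of the logging data."""
    results = []
    for row in logging_rows:
        logs = row["logs"]
        elements = elemental_model(logs)   # Step 706: logs -> elemental data
        toc = toc_model(logs + elements)   # Step 708: logs and elemental data -> TOC
        results.append({"depth": row["depth"], "elements": elements, "toc": toc})
    return results

# Stand-in models (not trained; for illustration only).
def toy_elemental_model(logs):
    return [0.01 * logs[0]]

def toy_toc_model(features):
    return 0.05 * features[0] + 2.0 * features[-1]

logging_rows = [
    {"depth": 7020.0, "logs": [70.0, 2.45]},  # e.g. gamma ray and density values
    {"depth": 7020.5, "logs": [72.0, 2.44]},
]
predictions = predict_toc(logging_rows, toy_elemental_model, toy_toc_model)
```

The output contains one elemental prediction and one TOC prediction per logging depth, so both share the spatial sampling of the logging data.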
In Step 710, the hydrocarbon content of the well is determined based on the determined TOC data. For example, a volume of hydrocarbon generated and expelled from a source rock upon complete maturation can be calculated using the following equation:
$$\text{mass (kg)} = 10000 \cdot \text{Area}\,(\text{m}^2) \cdot \text{Source Rock Net Thickness (m)} \cdot \rho \cdot \text{TOC (wt.\%)} \cdot \text{HI}\left(\frac{\text{mgHC}}{\text{gTOC}}\right), \tag{3}$$
where ρ is the bulk density of the source rock under consideration.
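For illustration, EQ. 3 can be evaluated term by term as written. The function and parameter names are hypothetical, and the unit of ρ is not stated in the text, so the sketch does not attempt any unit conversion beyond the constant given in EQ. 3.

```python
def generated_hydrocarbon_mass(area_m2, net_thickness_m, rho, toc_wt_pct, hi_mg_per_g):
    """Evaluate EQ. 3 exactly as written; the unit of rho is not stated in the text."""
    return 10000 * area_m2 * net_thickness_m * rho * toc_wt_pct * hi_mg_per_g

# Hypothetical inputs chosen only to show the call.
mass = generated_hydrocarbon_mass(
    area_m2=2.0, net_thickness_m=3.0, rho=2.5, toc_wt_pct=4.0, hi_mg_per_g=1.0
)
```

The result scales linearly with each input, so doubling the source rock net thickness doubles the computed mass.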
In Step 712, a well operation plan is executed based on the hydrocarbon content. For example, the well operation plan can define a set of operation parameters of the well (e.g., choke valve settings, pump pressures and frequencies, etc.) to maximize hydrocarbon production from the well. The well operation plan can define the set of operation parameters over the life of the well.
In one or more embodiments, the logging data (302) used in the steps of FIG. 7 is for unsampled intervals of a well, and the first and second machine learning models (304, 308) are trained with one or more of: logging data, observed elemental data, and measured TOC data for one or more sampled intervals of the well; and modelling data acquired from one or more wells in a similar geological setting as the well. That is, in one or more embodiments, TOC data is determined for unsampled intervals of a well and the TOC data is further determined, or regularized, according to predicted elemental data for the unsampled intervals. Additionally, the modelling TOC data (whether from other wells or from one or more sampled intervals of the well) used to train one or more of the first and second machine learning models (304, 308) is filtered, or quality controlled, using the modelling elemental data. That is, only TOC data that has been validated in view of the corresponding elemental data is used to train the first and second machine-learning models (304, 308).
In accordance with one or more embodiments, the TOC data, determined as discussed above, are used to determine the source rock richness, since not all rocks are hydrocarbon source rock. High values of some elemental data such as Mo, and low values of others such as Mn, reflect the organic matter deposition environment, confirming an anoxic environment and indicating well-preserved organic matter. Thus, the integrated elemental data and the TOC data confirm the interval of source rock.
Further, it is noted that the logging data is used to evaluate the source rock thickness only after the quality control process previously described (see FIG. 5). The equivalent top to base of logging data corresponding to the similar values of TOC in the sampled intervals is used to detect the source rock net thickness in the unsampled intervals. This also corresponds to the equivalent source rock richness. Both the organic matter information and net thickness information are used to calculate the volume of hydrocarbon generated and expelled.
FIG. 8 depicts an example of elemental data (306) determined using the first machine learning model (304) based on logging data (302). In particular, FIG. 8 depicts “actual” and “predicted” values for the element Ti at various depths of a well. The “predicted” values are determined using the first machine learning model (304) operating on the logging data (302) and the “actual” values at corresponding depths are formed as a linear interpolation of measurements of Ti determined using a laboratory analysis on extracted cores. As seen, the predicted values accurately mimic the actual values.
In summary, embodiments disclosed herein relate to a workflow based on machine learning (in particular, a nested use or series of machine learning models) to predict rock total organic carbon (TOC) by integrating logging data and elemental data. The elemental data is: 1) used to identify and remove (i.e., filter out) erroneous or contaminated TOC values measured using a laboratory-based method on extracted cores (i.e., rock samples) before training the machine learning models using the TOC values; and 2) determined using a first machine learning model from the logging data and used with the logging data to determine TOC data.
Petroleum source rock is any piece of rock with sufficient organic matter content to generate and release enough hydrocarbons to form a commercial accumulation of oil or gas. Source rocks are commonly shales and limestones/mudstones. The evaluation of petroleum source rock richness, net thickness, and hydrocarbon generation is a very challenging, but crucial, process in petroleum exploration. TOC is traditionally measured on core samples by a pyrolysis laboratory process. In general, only certain well intervals can be sampled due to logistic, time, and effort requirements. Hence, TOC measurements are limited.
Embodiments disclosed herein determine TOC data for a well using logging data that are regularized with elemental data that have high affinity to the source rock anoxic/dysoxic depositional environment. In other words, TOC measurements are, in a manner, quality-checked with elemental data that have high affinity to the source rock depositional environment to confirm the potential source rock interval. There are 14 such elements: Al, Fe, Ti, Ca, Mn, Si, Mg, K, P, Mo, Ni, U, Th, and V. In general, the elemental data are also limited for the same reasons as the TOC. For example, element measurements are typically made in the laboratory on rock samples using mass spectrometry/spectroscopy and X-ray fluorescence equipment, requiring a great deal of cost, time, and effort.
As such, an ML-based TOC determination system is proposed that first predicts 14 elemental measurements for the entire well using logging data. This produces borehole-scale elemental data (i.e., elemental data with the same high-resolution spatial sampling as the logging data) from the original core scale (i.e., length of intervals between extracted cores). The predicted elemental values are then integrated with the logging data to predict TOC at the borehole scale. With the predicted elemental data serving as a quality control filter and additional discriminating information for the TOC measurements, this workflow guarantees a more accurate TOC prediction for the uncored and unsampled intervals of a well. The more accurate TOC predictions are then used to calculate the source rock richness and net thickness for the entire well (including sampled and unsampled intervals).
The described techniques may be integrated with other technologies such as, for example, software for exploration data analysis (e.g., Techlog), for further analysis of the results. For example, the TOC data and/or elemental data could be used to determine the extent of source rock in the area to validate source rock gross depositional environment (GDE) maps.
FIG. 9 further depicts a block diagram of a computer system (902) used to provide computational functionalities associated with the algorithms, methods, functions, processes, flows, and procedures as described in this disclosure, according to one or more embodiments. The illustrated computer (902) is intended to encompass any computing device such as a server, desktop computer, laptop/notebook computer, wireless data port, smart phone, personal data assistant (PDA), tablet computing device, one or more processors within these devices, or any other suitable processing device, including both physical or virtual instances (or both) of the computing device. Additionally, the computer (902) may include a computer that includes an input device, such as a keypad, keyboard, touch screen, or other device that can accept user information, and an output device that conveys information associated with the operation of the computer (902), including digital data, visual, or audio information (or a combination of information), or a GUI.
The computer (902) can serve in a role as a client, network component, a server, a database or other persistency, or any other component (or a combination of roles) of a computer system for performing the subject matter described in the instant disclosure. In some implementations, one or more components of the computer (902) may be configured to operate within environments, including cloud-computing-based, local, global, or other environment (or a combination of environments).
At a high level, the computer (902) is an electronic computing device operable to receive, transmit, process, store, or manage data and information associated with the described subject matter. According to some implementations, the computer (902) may also include or be communicably coupled with an application server, e-mail server, web server, caching server, streaming data server, business intelligence (BI) server, or other server (or a combination of servers).
The computer (902) can receive requests over the network (930) from a client application (for example, an application executing on another computer (902)) and respond to the received requests by processing them in an appropriate software application. In addition, requests may also be sent to the computer (902) from internal users (for example, from a command console or by another appropriate access method), external or third parties, other automated applications, as well as any other appropriate entities, individuals, systems, or computers.
Each of the components of the computer (902) can communicate using a system bus (903). In some implementations, any or all of the components of the computer (902), whether hardware or software (or a combination of hardware and software), may interface with each other or the interface (904) (or a combination of both) over the system bus (903) using an application programming interface (API) (912) or a service layer (913) (or a combination of the API (912) and service layer (913)). The API (912) may include specifications for routines, data structures, and object classes. The API (912) may be either computer-language independent or dependent and may refer to a complete interface, a single function, or even a set of APIs. The service layer (913) provides software services to the computer (902) or other components (whether or not illustrated) that are communicably coupled to the computer (902). The functionality of the computer (902) may be accessible to all service consumers using this service layer. Software services, such as those provided by the service layer (913), provide reusable, defined business functionalities through a defined interface. For example, the interface may be software written in JAVA, C++, or another suitable language providing data in extensible markup language (XML) format or another suitable format. While illustrated as an integrated component of the computer (902), alternative implementations may illustrate the API (912) or the service layer (913) as stand-alone components in relation to other components of the computer (902) or other components (whether or not illustrated) that are communicably coupled to the computer (902). Moreover, any or all parts of the API (912) or the service layer (913) may be implemented as child or sub-modules of another software module, enterprise application, or hardware module without departing from the scope of this disclosure.
The computer (902) includes an interface (904). Although illustrated as a single interface (904) in FIG. 9, two or more interfaces (904) may be used according to particular needs, desires, or particular implementations of the computer (902). The interface (904) is used by the computer (902) for communicating with other systems in a distributed environment that are connected to the network (930). Generally, the interface (904) includes logic encoded in software or hardware (or a combination of software and hardware) and operable to communicate with the network (930). More specifically, the interface (904) may include software supporting one or more communication protocols associated with communications such that the network (930) or interface's hardware is operable to communicate physical signals within and outside of the illustrated computer (902).
The computer (902) includes at least one computer processor (905). Although illustrated as a single computer processor (905) in FIG. 9, two or more processors may be used according to particular needs, desires, or particular implementations of the computer (902). Generally, the computer processor (905) executes instructions and manipulates data to perform the operations of the computer (902) and any algorithms, methods, functions, processes, flows, and procedures as described in the instant disclosure.
The computer (902) also includes a memory (906) that holds data for the computer (902) or other components (or a combination of both) that can be connected to the network (930). The memory may be a non-transitory computer readable medium. For example, memory (906) can be a database storing data consistent with this disclosure. Although illustrated as a single memory (906) in FIG. 9, two or more memories may be used according to particular needs, desires, or particular implementations of the computer (902) and the described functionality. While memory (906) is illustrated as an integral component of the computer (902), in alternative implementations, memory (906) can be external to the computer (902).
The application (907) is an algorithmic software engine providing functionality according to particular needs, desires, or particular implementations of the computer (902), particularly with respect to functionality described in this disclosure. For example, application (907) can serve as one or more components, modules, applications, etc. Further, although illustrated as a single application (907), the application (907) may be implemented as multiple applications (907) on the computer (902). In addition, although illustrated as integral to the computer (902), in alternative implementations, the application (907) can be external to the computer (902).
There may be any number of computers (902) associated with, or external to, a computer system containing computer (902), wherein each computer (902) communicates over network (930). Further, the terms “client,” “user,” and other appropriate terminology may be used interchangeably as appropriate without departing from the scope of this disclosure. Moreover, this disclosure contemplates that many users may use one computer (902), or that one user may use multiple computers (902).
Although only a few example embodiments have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the example embodiments without materially departing from this invention. Accordingly, all such modifications are intended to be included within the scope of this disclosure as defined in the following claims.
1. A method for predicting total organic carbon (TOC) throughout a well, the method comprising:
deploying a logging tool in the well;
obtaining logging data from the well using the logging tool, wherein the logging data comprises one or more data values at each depth in a set of depths;
determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data;
determining, with a second machine learning model, TOC at each depth in the set of depths based on the logging data and elemental data;
determining a hydrocarbon content of the well based on the determined TOC; and
executing a well operation plan based on the hydrocarbon content.
2. The method according to claim 1, wherein the first machine learning model comprises a neural network.
3. The method according to claim 1, wherein the logging tool is a wireline tool.
4. The method according to claim 1, further comprising training the first and second machine learning models, wherein training the first and second machine learning models comprises:
obtaining modelling data comprising modelling logging data, modelling elemental data, and modelling TOC data from at least one of a sampled interval of the well and one or more offset wells;
identifying contaminated samples in the modelling TOC data using an outlier detection or correlation method in view of the modelling elemental data;
removing identified contaminated samples from the modelling data;
forming a training dataset comprising depth aligned examples from the modelling data; and
jointly training the first and second machine learning models, wherein:
the first machine learning model receives, as input, the modelling logging data of the training dataset and returns, as output, predicted elemental data,
the second machine learning model receives, as input, the modelling logging data of the training dataset and the predicted elemental data and returns, as output, predicted TOC data, and
the joint training is guided by a first comparison of the modelling elemental data of the training dataset and the predicted elemental data and a second comparison of the modelling TOC data of the training dataset and the predicted TOC data.
5. The method according to claim 4, wherein the modelling elemental data are obtained from one or more of inductively coupled plasma-mass spectrometry (ICP-MS) and x-ray fluorescence (XRF).
6. The method according to claim 4, wherein the modelling TOC data are obtained using a pyrolysis technique on one or more core samples collected from one or more of the well and offset wells within the same geological setting.
7. The method according to claim 4, wherein training the first and second machine learning models further comprises:
forming a validation dataset comprising depth aligned examples from the modelling data; and
validating the first and second machine learning models using the validation dataset.
8. A non-transitory computer-readable memory comprising computer-executable instructions stored thereon that, when executed on a processor, cause the processor to perform steps comprising:
obtaining logging data from a well using a logging tool deployed in the well, wherein the logging data comprises one or more data values at each depth in a set of depths;
determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data;
determining, with a second machine learning model, total organic carbon (TOC) at each depth in the set of depths based on the logging data and elemental data; and
determining a hydrocarbon content of the well based on the determined TOC.
9. The non-transitory computer-readable memory according to claim 8, wherein the first machine learning model comprises a neural network.
10. The non-transitory computer-readable memory according to claim 8, the steps further comprising training the first and second machine learning models, wherein training the first and second machine learning models comprises:
obtaining modelling data comprising modelling logging data, modelling elemental data, and modelling TOC data from at least one of a sampled interval of the well and one or more offset wells;
identifying contaminated samples in the modelling TOC data using an outlier detection or correlation method in view of the modelling elemental data;
removing identified contaminated samples from the modelling data;
forming a training dataset comprising depth aligned examples from the modelling data; and
jointly training the first and second machine learning models, wherein:
the first machine learning model receives, as input, the modelling logging data of the training dataset and returns, as output, predicted elemental data,
the second machine learning model receives, as input, the modelling logging data of the training dataset and the predicted elemental data and returns, as output, predicted TOC data, and
the joint training is guided by a first comparison of the modelling elemental data of the training dataset and the predicted elemental data and a second comparison of the modelling TOC data of the training dataset and the predicted TOC data.
11. The non-transitory computer-readable memory according to claim 10, wherein the modelling elemental data are obtained from one or more of inductively coupled plasma-mass spectrometry (ICP-MS) and x-ray fluorescence (XRF).
12. The non-transitory computer-readable memory according to claim 10, wherein the modelling TOC data are obtained using a pyrolysis technique on one or more core samples collected from one or more of the well and offset wells within the same geological setting.
13. The non-transitory computer-readable memory according to claim 10, wherein training the first and second machine learning models further comprises:
forming a validation dataset comprising depth aligned examples from the modelling data; and
validating the first and second machine learning models using the validation dataset.
14. A system, comprising:
a logging tool configured to obtain logging data from a well;
a computer processor; and
a non-transitory computer readable medium storing instructions that when executed by the computer processor cause the processor to perform operations comprising:
obtaining logging data from the well using the logging tool, wherein the logging data comprises one or more data values at each depth in a set of depths;
determining, with a first machine learning model, elemental data for the well at each depth in the set of depths based on the logging data;
determining, with a second machine learning model, total organic carbon (TOC) at each depth in the set of depths based on the logging data and elemental data; and
determining a hydrocarbon content of the well based on the determined TOC.
15. The system according to claim 14, wherein the first machine learning model comprises a neural network.
16. The system according to claim 14, wherein the logging tool is a wireline tool.
17. The system according to claim 14, wherein the operations further comprise training the first and second machine learning models, wherein training the first and second machine learning models comprises:
obtaining modelling data comprising modelling logging data, modelling elemental data, and modelling TOC data from at least one of a sampled interval of the well and one or more offset wells;
identifying contaminated samples in the modelling TOC data using an outlier detection or correlation method in view of the modelling elemental data;
removing identified contaminated samples from the modelling data;
forming a training dataset comprising depth aligned examples from the modelling data; and
jointly training the first and second machine learning models, wherein:
the first machine learning model receives, as input, the modelling logging data of the training dataset and returns, as output, predicted elemental data,
the second machine learning model receives, as input, the modelling logging data of the training dataset and the predicted elemental data and returns, as output, predicted TOC data, and
the joint training is guided by a first comparison of the modelling elemental data of the training dataset and the predicted elemental data and a second comparison of the modelling TOC data of the training dataset and the predicted TOC data.
18. The system according to claim 17, wherein the modelling elemental data are obtained from one or more of inductively coupled plasma-mass spectrometry (ICP-MS) and x-ray fluorescence (XRF).
19. The system according to claim 17, wherein the modelling TOC data are obtained using a pyrolysis technique on one or more core samples collected from one or more of the well and offset wells within the same geological setting.
20. The system according to claim 17, wherein training the first and second machine learning models further comprises:
forming a validation dataset comprising depth aligned examples from the modelling data; and
validating the first and second machine learning models using the validation dataset.
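By way of non-limiting illustration only, the joint training recited in claims 4, 10, and 17 might be sketched as a single loss that sums the two comparisons, so that the TOC error also updates the first model through the predicted elemental data. The linear stand-in models, the mean squared error comparisons, the weighting factor `lam`, the learning rate, and the synthetic data are all assumptions made for this sketch; the claims do not prescribe them.

```python
import numpy as np

def joint_loss_and_grads(X, E, T, W1, W2, lam=1.0):
    """Joint loss and gradients for two chained linear stand-in models.

    X: (n, d) logging features (with a column of ones for the intercept)
    E: (n, e) measured elemental data; T: (n,) measured TOC
    W1: (d, e) weights of model 1; W2: (d + e,) weights of model 2
    """
    n, e = E.shape
    E_hat = X @ W1                      # model 1: logs -> predicted elemental data
    F = np.hstack([X, E_hat])           # model 2 input: logs + predicted elemental data
    T_hat = F @ W2                      # model 2: -> predicted TOC
    r_E, r_T = E_hat - E, T_hat - T     # first and second comparisons
    loss = np.mean(r_E ** 2) + lam * np.mean(r_T ** 2)
    g_T = 2.0 * lam * r_T / n           # d(loss)/d(T_hat)
    g_W2 = F.T @ g_T
    # The TOC error flows back into model 1 through the predicted elemental data.
    g_E = 2.0 * r_E / (n * e) + np.outer(g_T, W2[X.shape[1]:])
    g_W1 = X.T @ g_E
    return loss, g_W1, g_W2

def train_jointly(X, E, T, lam=1.0, lr=0.005, steps=500):
    """Plain gradient descent on the joint loss, starting from zero weights."""
    d, e = X.shape[1], E.shape[1]
    W1, W2 = np.zeros((d, e)), np.zeros(d + e)
    history = []
    for _ in range(steps):
        loss, g_W1, g_W2 = joint_loss_and_grads(X, E, T, W1, W2, lam)
        history.append(loss)
        W1 -= lr * g_W1
        W2 -= lr * g_W2
    return W1, W2, history

# Synthetic, standardized training examples (no real well data is implied).
rng = np.random.default_rng(0)
n, n_logs, n_elem = 200, 3, 14
logs = rng.standard_normal((n, n_logs))
X = np.hstack([logs, np.ones((n, 1))])
E = logs @ rng.standard_normal((n_logs, n_elem)) + 0.1 * rng.standard_normal((n, n_elem))
E = (E - E.mean(axis=0)) / E.std(axis=0)
T = E @ rng.standard_normal(n_elem) + 0.1 * rng.standard_normal(n)
T = (T - T.mean()) / T.std()
W1, W2, history = train_jointly(X, E, T)
```

In practice the linear stand-ins would be replaced by the chosen model types (for example, neural networks), with the same two-term loss driving both sets of parameters.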