Computers & Geosciences 28 (2002) 223–229
Identification of lithofacies using Kohonen self-organizing maps
Hsien-Cheng Chang (a,*), David C. Kopaska-Merkel (b), Hui-Chuan Chen (c)
(a) Department of Computer Science, University of Alabama, Tuscaloosa, AL 35487, USA
(b) Geological Survey of Alabama, P.O. Box 869999, Tuscaloosa, AL 35486, USA
(c) Department of Computer Science, University of Alabama, Tuscaloosa, AL 35487, USA
Received 30 October 2000; received in revised form 5 June 2001; accepted 10 June 2001
Abstract
Lithofacies identification is a primary task in reservoir characterization. Traditional techniques of lithofacies
identification from core data are costly, and the results are difficult to extrapolate to non-cored wells. We present a low-cost
automated technique that uses Kohonen self-organizing maps (SOMs) to identify lithofacies systematically and objectively
from well log data. SOMs are unsupervised artificial neural networks that map the input space into clusters in a
topological form whose organization is related to trends in the input data. A case study used five wells located in
Appleton Field, Escambia County, Alabama (Smackover Formation, limestone and dolomite, Oxfordian, Jurassic). A
five-input, one-dimensional output approach is employed, assuming the lithofacies are in ascending/descending order
with respect to paleoenvironmental energy levels. To consider the possible appearance of new logfacies not seen in
training mode, which may potentially appear in test wells, the maximum number of outputs is set to 20 instead of four,
the designated number of lithofacies in the study area.
This study found eleven major clusters. The clusters were compared to depositional lithofacies identified by
manual core examination. The clusters were ordered by the SOM in a pattern consistent with environmental gradients
inferred from core examination: bind/boundstone, grainstone, packstone, and wackestone. This new approach
predicted lithofacies identity from well log data with 78.8% accuracy, which is more accurate than a
backpropagation neural network (57.3%). The clusters produced by the SOM are ordered with respect to
paleoenvironmental energy levels. This energy-related clustering provides geologists and petroleum engineers with
valuable geologic information about the logfacies and their interrelationships. This advantage is not obtained in
backpropagation neural networks and adaptive resonance theory neural networks. © 2002 Elsevier Science Ltd. All
rights reserved.
Keywords: Neural networks; Carbonate rocks; Paleoenvironmental energy; Well log
1. Introduction
Lithofacies identification is important for many
geological and engineering disciplines. Lithofacies, rock
or sediment units characterized by certain textures or
other features, can be used to correlate important
characteristics of a reservoir, such as permeability and
porosity. Identifying various lithofacies of the reservoir
rocks is a primary task for petroleum reservoir
characterization. The purpose of this paper is to describe
an automated method of predicting reservoir rock
characteristics from frequently available data and expert
geological knowledge.
Traditionally, lithofacies are identified from cores.
Core data provide direct observations of lithofacies;
however, cores are costly to collect and core recovery is
often less than 100%. Moreover, core description can be
time-consuming and dependent on geologists’ experience.
Therefore, a lower-cost method not requiring cores
but providing similar or improved accuracy and resolution
is desirable.
*Corresponding author. Fax: +205-3480219.
PII: S0098-3004(01)00067-X
For this paper we use a suite of well logs, which
provide indirect information about the subsurface and
are less expensive to acquire than cores. Well
log measurements can be classified into logfacies. Logfacies,
which reflect both rock and fluid properties, allow
discrimination among beds or sedimentary units. Logfacies
often correspond to lithofacies when they are
calibrated with core descriptions. Thus, logfacies may be
constructed as surrogates for lithofacies. Classifications
of identified logfacies can then be used to predict
lithofacies in non-cored wells or non-cored intervals in
cored wells.
Associating well log data with lithofacies can be
difficult due to the heterogeneous nature of rocks,
especially carbonate rocks. Lithofacies can be defined
using any set of rock properties. However, only
lithofacies defined by variations in properties that affect
well log response can be identified using well log data.
Moreover, some useful rock properties such as porosity
and permeability affect well log response.
Conventional computing algorithms or statistical
methods have been shown to be inadequate for certain
geological problems (Moline and Bahr, 1995), especially
in carbonate reservoir characterization. Some researchers
in the field of geology and petroleum engineering
have recently employed artificial neural networks
(ANNs) to improve on past performance in solving
such problems (Baldwin et al., 1990; Raiche, 1991;
Rogers et al., 1992; Chang et al., 1998, 2000). ANNs are
classified into two major types on the basis of learning
modes: supervised and unsupervised. For supervised
networks, back-propagation neural networks (BPNNs)
are the most widely used; for unsupervised networks,
Kohonen self-organizing maps (SOMs) and Adaptive
Resonance Theory (ART) networks are the two most
frequently applied (Doveton, 1994; Chang et al., 1998,
2000).
BPNNs have several significant drawbacks. For
example, outputs are confined by a predetermined
number of nodes; it is difficult to interpret relationships
among output nodes, and it is difficult to incorporate
geological knowledge into networks. Among these
problems, the restriction of output to predetermined
clusters is a major concern. When test data are located
outside the training data range, a BPNN cannot classify
them; thus, the discriminating ability is not assured.
BPNNs deal adequately with well-bounded and stable
problems, because training sets may cover the entire
expected input space. Unfortunately, in reservoir characterization
problems, variables frequently are neither
well-bounded nor stable. New lithofacies and new values
of important rock properties are often encountered. This
is particularly true of carbonate reservoirs such as the
Smackover Formation, because carbonate reservoirs
exhibit patchy heterogeneity at a variety of spatial scales
(e.g., Kopaska-Merkel and Mann, 1992).
By combining unsupervised networks with pattern-recognition
principles, researchers have overcome some
disadvantages of BPNNs and achieved some promising
results (Baldwin et al., 1990; Chang et al., 2000). These
principles involve extracting significant features from
inputs in training mode and, in production mode,
clustering inputs based on the extracted features (Looney,
1996). SOMs (Baldwin et al., 1990) and ART networks
(Chang et al., 2000) have their own strengths. SOMs
cluster inputs into ordered features, using one, two, or
more dimensions, providing an intuitive or explicit
explanation of the output clusters. Further, the
distance between two different clusters can tell the user
how “close” these two clusters may be in terms of
certain physical or chemical properties such as environmental
energy levels. For example, in the cluster pairs 2
and 10, and 2 and 4, clusters 2 and 4 are more closely
related than 2 and 10. ART networks let the user control
the degree of similarity between the prototype clusters
stored in the networks and the inputs. However, the order of
clusters in ART networks does not provide any useful
geologic information for geologists or petroleum engineers.
Further, the ordering feature of SOMs provides
transition information between neighboring clusters;
for example, where grainstone grades into packstone.
This property can alleviate the “hard” boundary effect.
Using traditional hard-boundary clustering methods, a
datum can belong to only one cluster, even if its
characteristics are intermediate. Using this ordering
feature, geologists and petroleum engineers can more
easily understand the relationships among clusters.
Thus, we employ SOMs in this paper.
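As a rough illustration of the ordering argument above, the following Python sketch treats cluster indices on a one-dimensional map as positions on a chain and flags a sample as transitional when its two nearest nodes are neighbors and almost equally distant. The function names, the 0.8 ratio threshold, and the toy weights are assumptions made for illustration; they are not taken from the study.

```python
def chain_distance(cluster_a, cluster_b):
    """Separation of two clusters along a one-dimensional SOM output chain."""
    return abs(cluster_a - cluster_b)


def nearest_two(weights, x):
    """Return (distance, index) pairs for the two nodes closest to sample x."""
    ranked = sorted(
        (sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)), j)
        for j, w in enumerate(weights)
    )
    return ranked[0], ranked[1]


def is_transitional(weights, x, ratio=0.8):
    """Hypothetical rule: the best and second-best nodes are adjacent on the
    chain and the best distance is at least `ratio` times the second best."""
    (d1, j1), (d2, j2) = nearest_two(weights, x)
    return chain_distance(j1, j2) == 1 and d2 > 0 and d1 / d2 >= ratio


# Cluster pair (2, 4) is closer on the chain than pair (2, 10).
closer_pair = chain_distance(2, 4) < chain_distance(2, 10)

# Toy one-input map with three nodes (made-up weights).
toy_weights = [[0.0], [1.0], [2.0]]
midway = is_transitional(toy_weights, [0.5])      # halfway between nodes 0 and 1
clear_cut = is_transitional(toy_weights, [0.1])   # clearly belongs to node 0
```

A hard-boundary method would simply assign the midway sample to one node; a flag of this kind is one possible way to keep the information that it sits between two neighboring clusters.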
2. Kohonen self-organizing maps
Kohonen self-organizing maps are unsupervised
artificial neural networks developed by Kohonen
(1982), who intended to provide ordered feature maps
of input data after clustering (Freeman and Skapura,
1991; Ripley, 1996). That is, SOMs are capable of
mapping high-dimensional similar input data into
clusters close to each other. SOMs are two-layer, fully
connected networks with a weight matrix. SOMs are
also called “topology-preserving maps”, assuming a
topological structure among the cluster units. This
property is observed in the brain, but is not found in
other ANNs, such as BPNN and ART neural networks.
The resulting maps provide users an intuitive and
familiar way of correlating and illustrating input data
sets.
Combining this topological concept and geologists’
knowledge, we proposed a five-input and
one-dimensional output SOM. One of the best-known
applications of multiple-input and one-dimensional
output SOM is the solution of the traveling salesman
problem (TSP) (Angéniol et al., 1988). The TSP
is a difficult constrained optimization problem that
is often solved using heuristic methods (Ritter et al.,
1992).
The architecture and algorithm of SOMs implemented
in this work are detailed in the Appendix (Fausett,
1994). Each weight vector represents the exemplar of the
input patterns assigned to one cluster, and the number of
weight vectors sets the maximum number of clusters to
be formed. After several experiments, our maximum
number of output clusters was set to a large number, 20,
instead of the designated number (4 in this study) of
lithofacies. One advantage of this maximum number of
output clusters is that it can hold all possible distinct
logfacies in the studied area, yet the number is not so
large as to separate similar inputs. This also allows for the
possible appearance of new clusters not seen in the
training mode but that may potentially appear in test
wells. If test-data clusters are located within the
same cluster range as training data, the training data
possibly cover the entire data range. If the test-data
clusters are located on end-clusters (the first or the
last clusters), the training data may not contain the
proper range and other sets of training data may be
needed. Output clusters may contain empty elements
when a large number of clusters are used in training
mode. If these empty clusters appear between two
occupied clusters, there may be another potential
lithofacies not observed in the data used. If empty
clusters lie at one end of the range of clusters and span a
large number of clusters, a smaller number of clusters
may be sufficient.
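The occupancy checks described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the example assignment list only mimics the occupancy pattern reported later in Table 1 (nodes 0 to 10, 17, and 19 occupied), not the actual data.

```python
def summarize_occupancy(assignments, n_clusters=20):
    """Count samples per output node and classify the empty nodes.

    interior_empty: empty nodes lying between two occupied nodes
                    (possible lithofacies not observed in the data).
    leading_empty / trailing_empty: empty runs at either end of the chain
                    (a smaller number of clusters may be sufficient).
    """
    counts = [0] * n_clusters
    for j in assignments:
        counts[j] += 1
    occupied = [j for j, c in enumerate(counts) if c > 0]
    if not occupied:
        raise ValueError("no samples were assigned to any cluster")
    first, last = occupied[0], occupied[-1]
    return {
        "counts": counts,
        "interior_empty": [j for j in range(first, last + 1) if counts[j] == 0],
        "leading_empty": list(range(0, first)),
        "trailing_empty": list(range(last + 1, n_clusters)),
    }


def reaches_end_clusters(assignments, n_clusters=20):
    """True if any sample falls on the first or last output node, which the
    text treats as a hint that the training data may not span the full range."""
    return any(j in (0, n_clusters - 1) for j in assignments)


# Illustrative assignments: one sample on each of nodes 0-10, plus 17 and 19.
example = list(range(11)) + [17, 19, 19]
summary = summarize_occupancy(example)
```

For this example, `summary["interior_empty"]` lists nodes 11 to 16 and 18, and both end lists are empty because nodes 0 and 19 are occupied.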
Baldwin et al. (1990) implemented an eight-dimensional
SOM to classify eight inputs into eight large-scale
(coarse) lithofacies (e.g., limestone, sand, and shale).
Their output lithofacies are assumed to be interrelated.
In this paper, we present a SOM architecture with five
inputs and a one-dimensional output, based on the
geological knowledge that the environmental energy
level is related to lithofacies (outputs) in ascending/descending
order. In other words, the output nodes
located closer to each other in a one-dimensional form
have closer energy levels. However, in an eight-dimensional
form with eight output nodes, there is no
distinction of the closeness of each output; each node
has the same effect on its seven neighboring nodes.
Furthermore, our SOM is used to distinguish small-scale
(finer) lithofacies: mudstone (MS), wackestone (WS),
packstone (PS), and grainstone (GS), which are textural
classes of carbonate rocks.
3. Source of data
The cores and well logs used in this paper were
from wells (Alabama State Oil and Gas Board
Permit Numbers 3986, 3854, 4633, 4835, and 6247) in
Appleton Field, located in north-central Escambia
County, Alabama (Markland, 1992; Kopaska-Merkel
and Hall, 1993). The field produces oil from variable
carbonate strata of the Smackover Formation at a
subsea depth of approximately 3,900 m. Cores were
sampled at 0.3 m intervals, except the core from well
#6247, which is continuous. There are 157 core data
points available in the training well and 241 in the test
wells.
Core examination revealed four major lithofacies in
the training well (#3986) using the Dunham classification
(Dunham, 1962). These are wackestone, packstone,
grainstone, and bind/boundstone. In Appleton Field,
bind/boundstone is stratigraphically associated with
grainstone in reef-shoal complexes. Mudstone is
not found in the training well but appears in test well
#6247.
For each data point, there are five input variables:
depth, neutron porosity, density porosity, sonic,
and “velocity-deviation” logs (Anselmetti and Eberli,
1999). The data preprocessing stage incorporates
prior knowledge of the Smackover Formation: the depths
of the top and bottom of the formation and the
approximate boundaries between its upper and middle
sections and between its middle and lower sections.
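The text does not say how the five inputs were scaled before training. Because the weights in the Appendix are initialized between 0 and 1, one plausible choice is min-max scaling of each input using ranges taken from the training well. The sketch below illustrates that assumption only; the function name and the two sample rows (depth, neutron porosity, density porosity, sonic, velocity deviation) are invented for the example.

```python
def min_max_scale(rows, ranges=None):
    """Scale each column of `rows` to [0, 1].

    ranges: optional list of (low, high) per column, typically taken from the
            training well so that test wells are scaled the same way.
    Returns (scaled_rows, ranges).
    """
    n_cols = len(rows[0])
    if ranges is None:
        ranges = [
            (min(r[i] for r in rows), max(r[i] for r in rows))
            for i in range(n_cols)
        ]
    scaled = [
        [(r[i] - lo) / (hi - lo) if hi > lo else 0.0
         for i, (lo, hi) in enumerate(ranges)]
        for r in rows
    ]
    return scaled, ranges


# Hypothetical training rows; values and units are made up for illustration.
train_rows = [
    [3900.0, 0.10, 0.12, 55.0, -200.0],
    [3903.0, 0.20, 0.08, 65.0, 300.0],
]
train_scaled, train_ranges = min_max_scale(train_rows)

# A test-well row is scaled with the training ranges; values outside the
# training range fall outside [0, 1], which makes such samples easy to spot.
test_scaled, _ = min_max_scale([[3906.0, 0.15, 0.10, 60.0, 50.0]], train_ranges)
```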
The SOM program was trained on the data from well
#3986 and was tested on wells #3854, #4633, #4835, and
#6247 to verify performance. Well #3986 was chosen as
the training well because it has the most complete set of
core and log data for the whole thickness of the
Smackover Formation. The output clusters (logfacies)
produced from the SOM program were associated with
lithofacies previously identified by a geologist (Markland,
1992).
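The paper does not spell out how each logfacies was tied to a lithofacies. One simple possibility, shown here purely as an assumption, is a majority vote over the core descriptions of the training samples that fall in each cluster. The function names and the toy labels below are hypothetical.

```python
from collections import Counter, defaultdict


def calibrate_logfacies(cluster_ids, core_labels):
    """Map each occupied cluster to its most frequent core-described lithofacies."""
    votes = defaultdict(Counter)
    for j, label in zip(cluster_ids, core_labels):
        votes[j][label] += 1
    return {j: counter.most_common(1)[0][0] for j, counter in votes.items()}


def predict_lithofacies(cluster_ids, mapping, unknown="unassigned"):
    """Translate cluster indices into lithofacies names; clusters never seen
    during calibration are reported as `unknown` instead of being forced
    into an existing class."""
    return [mapping.get(j, unknown) for j in cluster_ids]


# Toy calibration set (invented labels, not the study data).
train_clusters = [0, 0, 0, 8, 8, 8]
train_labels = ["grainstone", "grainstone", "bind/boundstone",
                "wackestone", "packstone", "wackestone"]
mapping = calibrate_logfacies(train_clusters, train_labels)
predicted = predict_lithofacies([0, 8, 12], mapping)
```

A single-label vote is a simplification: a logfacies may correspond to more than one lithofacies, in which case the full vote counts would need to be kept rather than only the winner.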
4. Results and discussion
Eleven major, two minor, and seven “null” logfacies
are numbered between 0 and 19 (because a maximum of
20 output clusters is possible). The relationship
between logfacies identified by the SOM and lithofacies
determined by a geologist is shown in Table 1. These
logfacies were mapped to the four major lithofacies:
bind/boundstone, grainstone, packstone, and wackestone.
After calibration with the core analysis, the relationship
between logfacies (SOM) and permeability (from
commercial core analyses) is shown in Table 2. It is
noted that the logfacies 0, 1, 5, and 6 possess
permeability values over 100 md. The maximum
permeability values found in these four logfacies are in
descending order of paleoenvironmental energy level.
The prediction and corresponding geologist’s core
descriptions (Markland, 1992) are shown in Fig. 1. The
lithofacies predicted by the SOM match those identified
by the geologist for nearly 80% of the test-well
data (Table 3). The order of clusters follows the
paleoenvironmental energy level in descending order:
bind/boundstone, grainstone, packstone, and wackestone.
That is, the larger the number of a given logfacies,
the lower the energy that logfacies represents. The 11
major logfacies are distributed consecutively in the
lower number logfacies, implying data in the study
area are continuously distributed in environmental
space. There are six empty clusters between logfacies 10
and 17. This gap resulted from the choice of 20 possible
logfacies in the training mode. However, some data are
located in logfacies 17 and 19, justifying the use of 20
potential facies. If the analysis were re-run using 11
(numbered from 0 to 10) distinct possible logfacies in the
training mode, then data previously assigned to logfacies
17 and 19 would be reassigned to the highest-numbered
logfacies, 10. The two minor logfacies contain only three data
points (one and two points for logfacies 17 and 19,
respectively). Of these three points, two are from the
training well (#3986) and only one comes from a test
well (#3854). These two logfacies consist of algal
laminite, which is a minor variety of bind/boundstone
that is not necessarily found in a high-energy setting.
According to the energy level trend implied by logfacies
number these should be low-energy logfacies, which is
consistent with the known environmental distribution of
algal laminite.
The SOM predicted lithofacies identity from well log
data with 78.8% (190 of 241 points) accuracy.
This degree of accuracy is comparable to that (79.3%,
191 of 241 points) achieved by an ART2 neural network
on the same data (Chang et al., 1998, 2000). However,
the clusters (logfacies) produced by the SOM are
ordered with respect to paleoenvironmental energy
levels, which provides valuable geologic information.
Further, the SOM is more accurate than BPNNs
(57.3%) (Chang et al., 2000). Because this SOM predicts
lithofacies identity from well log data with a high
degree of accuracy, it may permit improved prediction
of lithofacies in non-cored intervals and non-cored
wells.
Table 1
Logfacies identified by SOMs and lithofacies determined by geologist

Logfacies (SOM)   Lithofacies (Geologist)
0                 Grainstone 1 / Bind/boundstone 1
1                 Grainstone 2 / Bind/boundstone 2
2                 Grainstone 3 / Packstone 1
3                 Grainstone 4 / Packstone 2
4                 Grainstone 5
5                 Grainstone 6
6                 Grainstone 7 / Packstone 3
7                 Grainstone 8 / Packstone 4
8                 Wackestone 1 / Packstone 5
9                 Wackestone 2 / Packstone 6
10                Wackestone 3 / Packstone 7
11                (a)
12                (a)
13                (a)
14                (a)
15                (a)
16                (a)
17                Algal Laminite 1
18                (a)
19                Algal Laminite 2

(a) No-valued output nodes.
Table 3
Statistics of prediction of lithofacies for test wells

Well number     No. of available data points   Match (#)   Mismatch (#)   Match (%)
3854            60                             44          16             73.3
4633-B          74                             69          5              93.2
4835-B          49                             39          10             79.6
6247            58                             38          20 (a)         65.5
Total 4 wells   241                            190         51             78.8

(a) Twenty mismatched points, including 8 data points of mudstone, which was not seen in training well #3986.
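The percentages in Table 3 follow directly from the match and mismatch counts. The short Python check below recomputes them. The counts are copied from the table; the variable and function names are illustrative, and the final figure that sets aside the eight mudstone points is a hypothetical what-if calculation, not a result reported in the paper.

```python
# (match, mismatch) counts per test well, copied from Table 3.
table3 = {
    "3854": (44, 16),
    "4633-B": (69, 5),
    "4835-B": (39, 10),
    "6247": (38, 20),
}


def match_percent(match, mismatch):
    """Percentage of data points whose predicted lithofacies matched the core."""
    return round(100.0 * match / (match + mismatch), 1)


per_well = {well: match_percent(*counts) for well, counts in table3.items()}

total_match = sum(m for m, _ in table3.values())
total_points = sum(m + mm for m, mm in table3.values())
overall = round(100.0 * total_match / total_points, 1)

# What-if (not from the paper): drop the 8 mudstone points in well #6247,
# a lithofacies absent from the training well.
overall_without_mudstone = round(100.0 * total_match / (total_points - 8), 1)
```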
Table 2
Relationship between logfacies (SOM) and permeability

Logfacies (SOM)   Permeability (md)
0                 0.01–4000
1                 3–1545
2                 Not permeable (a)
3                 Not permeable
4                 Not permeable
5                 0.01–618
6                 0.01–50
7                 Not permeable
8                 Not permeable
9                 Not permeable
10                Not permeable
17                Not permeable
19                Not permeable

(a) Below measurement limit (0.01 md).
Fig. 1. Comparisons between geologist’s description and predictions from self-organizing map.
5. Conclusions
We have presented a SOM incorporating geologists’
knowledge of paleoenvironmental energy level. This
approach possesses the following advantages over
supervised-learning BPNNs and unsupervised-learning
ART neural networks in the determination of lithofacies.
First, the one-dimensional topological architecture
of the SOM is consistent with the geologists’ lithofacies
knowledge, in that the geologist’s expertise is embedded
in the structure. Second, the distribution of the SOM
nodes provides ascending/descending energy information
about the lithofacies, which is not provided by BPNN and
ART neural networks. Third, the distances between
nodes provide information about the relative energy
level of lithofacies represented by those nodes. Finally,
the SOM may be extended to two- or three-dimensional
topology, according to the geologists’ knowledge, to
map other geophysical properties from well logs and
provide convenient visualization using two- or three-dimensional
plots.
Acknowledgements
This work is supported, in part, by the US Department
of Energy through their Alabama DOE/EPSCoR
Program.
Appendix A. Kohonen self-organizing map: Architecture and Learning Algorithm
The basic architecture of the one-dimensional SOM
employed in this work is shown in Fig. 2. The network
consists of two layers: input ($X_i$, $i = 1, \ldots, n$) and output
($Y_j$, $j = 1, \ldots, m$), where $n$ denotes the number of input
nodes and $m$ stands for the maximum number of clusters
to be formed. In the case of this study, $n = 5$ and $m = 20$.
$W_{ij}$ is the weight connecting input $X_i$ to output $Y_j$.
The learning algorithm is summarized as follows:
(1) At the beginning of the trial, randomly assign
values to $W_{ij}$ ranging from 0 to 1.
(2) Set learning rate, topological neighborhood parameters
and maximum number of clusters to be formed.
(3) Input a pattern $X = (X_1, \ldots, X_n)$ and compute the squared
Euclidean distance $D(j)$ of each cluster $Y_j$:
$$D(j) = \sum_i (W_{ij} - X_i)^2.$$
(4) Find the minimum $D(J)$; $J$ is the index of $Y_j$ with
minimum distance.
(5) Update weights for all units $j$ within a specified
neighborhood of $J$ and for all $i$:
$$W_{ij}(\mathrm{new}) = W_{ij}(\mathrm{old}) + \text{learning rate} \times [X_i - W_{ij}(\mathrm{old})].$$
(6) Repeat steps (3) to (5) until all inputs have been
presented to the SOM once.
(7) Test stopping condition.
(8) Update learning rate and go to (3).
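Steps (1) to (8) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the program used in the study: the stopping condition is a fixed number of epochs, the learning rate decays by a constant factor per epoch, the neighborhood radius stays fixed at one node on either side of the winner, and the inputs are assumed to be scaled to [0, 1]. All function names, parameter names, and default values are hypothetical.

```python
import random


def best_matching_unit(weights, x):
    """Steps 3-4: index J of the node with the smallest squared distance D(j)."""
    distances = [
        sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights
    ]
    return distances.index(min(distances))


def train_som_1d(samples, n_clusters=20, n_epochs=50, learning_rate=0.5,
                 decay=0.95, radius=1, seed=0):
    """Train a one-dimensional SOM; weights[j][i] links input i to node j."""
    rng = random.Random(seed)
    n_inputs = len(samples[0])
    # Step 1: random initial weights between 0 and 1.
    weights = [[rng.random() for _ in range(n_inputs)]
               for _ in range(n_clusters)]
    # Step 2 corresponds to the keyword arguments above.
    for _ in range(n_epochs):          # Step 7 (assumed): fixed epoch count.
        for x in samples:              # Step 6: present every input once.
            winner = best_matching_unit(weights, x)
            # Step 5: update the winner and its neighbors on the chain.
            low = max(0, winner - radius)
            high = min(n_clusters - 1, winner + radius)
            for j in range(low, high + 1):
                for i in range(n_inputs):
                    weights[j][i] += learning_rate * (x[i] - weights[j][i])
        learning_rate *= decay         # Step 8: reduce the learning rate.
    return weights


# Toy usage with three made-up five-input samples already scaled to [0, 1].
toy_samples = [
    [0.1, 0.2, 0.1, 0.0, 0.3],
    [0.9, 0.8, 0.7, 1.0, 0.9],
    [0.5, 0.5, 0.4, 0.6, 0.5],
]
toy_weights = train_som_1d(toy_samples, n_clusters=20, n_epochs=10)
toy_clusters = [best_matching_unit(toy_weights, x) for x in toy_samples]
```

After training, `best_matching_unit` assigns any new sample to a logfacies index. Many SOM implementations also shrink the neighborhood radius over time; the fixed radius here is kept only for brevity.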
References
Angéniol, B., de La Croix Vaubois, G., Le Texier, J.-Y., 1988.
Self-organizing feature maps and the traveling salesman
problem. Neural Networks 1, 289–293.
Anselmetti, F.S., Eberli, G.P., 1999. The velocity-deviation log:
a tool to predict pore type and permeability trends in
carbonate drill holes from sonic and porosity or density
logs. American Association of Petroleum Geologists Bulletin
83, 450–466.
Baldwin, J.L., Bateman, A.R.M., Wheatley, C.L., 1990.
Application of neural network to the problem of mineral
identification from well logs. The Log Analyst 3, 279–293.
Chang, H.-C., Chen, H.-C., Kopaska-Merkel, D.C., 1998.
Identification of lithofacies using ART neural networks and
group decision making. Proceedings of Artificial Neural
Networks in Engineering Conference, St. Louis, Missouri,
USA, pp. 855–860.
Chang, H.-C., Kopaska-Merkel, D.C., Chen, H.-C., Durrans,
S. Rocky, 2000. Lithofacies identification using multiple
adaptive resonance theory neural networks and group
decision expert system. Computers & Geosciences 26(5),
591–601.
Doveton, J.H., 1994. Applications of artificial intelligence in log
analysis. In: Geologic Log Analysis Using Computer
Methods, American Association of Petroleum Geologists,
Tulsa, OK, pp. 151–165.
Dunham, R.J., 1962. Classification of carbonate rocks according
to depositional texture. In: Ham, W.E. (Ed.), Classification
of Carbonate Rocks. American Association
of Petroleum Geologists Memoir 1. The American Association
of Petroleum Geologists, Tulsa, OK, pp. 108–121.
Fig. 2. Architecture of Kohonen one-dimensional self-organiz-
ing map.
Fausett, L.V., 1994. Fundamentals of Neural Networks:
Architectures, Algorithms, and Applications. Prentice-Hall
Inc., Englewood Cliffs, NJ, 461 pp.
Freeman, J.A., Skapura, D.M., 1991. Neural Networks:
Algorithms, Applications, and Programming Techniques.
Addison-Wesley, Reading, MA, 401 pp.
Kohonen, T., 1982. Self-organized formation of topologically
correct feature maps. Biological Cybernetics 43, 59–69.
Kopaska-Merkel, D.C., Hall, D.R., 1993. Reservoir
Characterization of the Smackover Formation in
Southwest Alabama. Alabama Geological Survey Bulletin
153, 111 pp.
Kopaska-Merkel, D.C., Mann, S.D., 1992. Regional variation
in microscopic and megascopic reservoir heterogeneity in
the Smackover Formation, Southwest Alabama. Transactions
of Gulf Coast Association of Geological Societies 42,
189–212.
Looney, C.G., 1996. Pattern Recognition Using Neural Networks:
Theory and Algorithms for Engineers and Scientists.
Oxford University Press, New York, 458 pp.
Markland, L.A., 1992. Depositional history of the Smackover
Formation, Appleton Field, Escambia County, Alabama.
M.Sc. Thesis, University of Alabama, Tuscaloosa, AL,
145 pp.
Moline, G.R., Bahr, J.M., 1995. Estimating spatial distributions
of heterogeneous subsurface characteristics by regionalized
classification of electrofacies. Mathematical
Geology 27, 3–22.
Raiche, A., 1991. A pattern recognition approach to geophysical
inversion using neural nets. Geophysical Journal
International 105, 629–648.
Ripley, B.D., 1996. Pattern Recognition and Neural Networks.
Cambridge University Press, Cambridge, 403 pp.
Ritter, H., Martinetz, T., Schulten, K., 1992. Neural Computation
and Self-organizing Maps: An Introduction. Addison-Wesley,
Reading, MA, 306 pp.
Rogers, S.J., Fang, J.H., Karr, C.L., Stanley, D.A., 1992.
Determination of lithology from well logs using a neural
network. American Association of Petroleum Geologists
Bulletin 76, 731–739.