
Institutionen för informationsteknologi

Automatic Seismic Image Analysis Tool

Niklas Borwell, Bo Lee, Klas Eurenius

Project in Computational Science: Report

January 2012

PROJECT REPORT

Abstract

We have created a tool in C# that looks for geological structures in 2D seismic images, the main motivation being its large potential for time and cost savings in the process of finding oil. Today, image screening is still done manually, while the amount of acquired geological data grows faster than ever.

Using computer-assisted image analysis to screen seismic images is a largely unexplored field of research, so we apply well-known image processing and analysis theory. For image preprocessing this includes intensity-level thresholding, frequency filtering and mathematical morphology; for analysis and classification we use skeletonization, distance transformation and property measuring.

Considering that this work is an early-stage prototype, the results are very promising. The produced software tool succeeds in finding and classifying objects correctly, except for some problems that occur when different image objects are tightly connected. The processing time for a "standard sized" single image ranges from 0.2 s to 10 s depending on how much information it contains, noting that this is on an AMD Athlon X2 @ 2 GHz and that the code is not optimized.


Acknowledgements

We wish to thank supervisor Sergio Courtade from Schlumberger Ltd. for providing guidance and expertise in geology and, specifically, in seismic image interpretation. We also thank course coordinator Maya Neytcheva for making this project possible at all.


Contents

1 Introduction
  1.1 Project description
  1.2 Confusing terminology

2 Background
  2.1 Seismic data format
  2.2 Computer aided search
  2.3 Image interpretation

3 Methodology

4 Theory
  4.1 Objects
  4.2 Threshold
  4.3 Closing operation
  4.4 Bounding box
  4.5 Skeleton representation
  4.6 Constrained distance transform
  4.7 Elongatedness measure

5 Implementation
  5.1 Mark interesting areas
  5.2 Merge areas into objects
  5.3 Classifying objects
  5.4 Vector representation
    5.4.1 Skeletonization
    5.4.2 Path selection
    5.4.3 Path sampling
  5.5 Separation problem

6 Results
  6.1 Main images
  6.2 Other images
  6.3 Speed

7 Further development
  7.1 3D
  7.2 Use more data
  7.3 Complementing segmentation methods
  7.4 Employ an advanced classification tool
  7.5 Parallelization and optimization

8 Conclusion

9 References


1 Introduction

As it becomes increasingly difficult to find profitable deposits of oil to explore, the accuracy of the information regarding promising geological sites becomes more and more important. Huge amounts of seismic data are collected every week, but the manpower and time needed to analyze all the data manually impose a huge cost. A computerized, automated analysis tool is therefore in great demand.

1.1 Project description

The aim of this project is to create a prototype program that can automatically find and classify interesting objects in seismic images. The prototype program should work as a first step towards a fully functional automated analysis tool.

Our prototype program is developed to work on a special type of image, namely colour-coded amplitude spectrum images. It can easily be modified to handle different images by changing a few parameters. Automatic parameter adjustment is very complex, however, and lies outside our project scope.

We were given nine time slice images as our development dataset. They are all from the same place, only separated by some milliseconds in depth (see Section 2.1).

In the amplitude spectrum images the reflections from softer sediments are clearly differentiable from the reflections of solid rock, making interesting areas easily detectable for human eyes. However, present in all these images are virtual artefacts (see Section 2.3) that make the task of classification and differentiation much harder.

(a) Original image (b) Interesting geological features detected and marked

Figure 1: Example of seismic image before and after processing

1.2 Confusing terminology

In this report the words channel and object are used extensively. Both words carry two meanings.


Object refers to an image object, meaning a feature in the image that has been segmented (see Section 4.1). Geological objects are the actual physical objects represented in the unprocessed images, such as channels, deposits and similar features that are of interest to locate.

Channel will mainly refer to long, elongated geological objects, like old riverbeds or valleys. Sometimes channel will refer to an information channel from the measuring equipment that was used for a particular picture. For example, the data we have used is the acoustic amplitude reflection channel in the seismological measuring equipment; or it might refer to the red color channel in an RGB image. In these cases we clarify what kind of channel we refer to, and if we don't have a better name for it, it will be called "information channel".

2 Background

2.1 Seismic data format

The subsurface is mapped to digital 3D volumes by sound waves: if onshore, by special vibrator trucks with an array of receiving geophones; if offshore, by boats with an air gun generating vibrations. The transmitted waves are reflected back when hitting density boundaries caused by, e.g., changes in rock type or fluid content. With this method the signal's time of travel is closely related to subsurface depth (Satterfield/Whiteley).

When interpreting, the recorded volume is often sliced vertically, horizontally or along a "horizon". The latter represents a line or surface corresponding to a density border. For this project, though, we work solely with horizontal slices, which are more commonly known as "time slices". The data is colored red and blue where high amplitude reflections occur (Figure 2).

2.2 Computer aided search

Due to modern techniques and depleting oil supplies, the oil industry encounters a very rapid growth in the amount of seismic data to analyze. But the increase in the number of employees does not come close to matching this increase in collected data.

As for now, the images are screened manually. In order to streamline the process, it seems natural to consider computer-based image analysis.

2.3 Image interpretation

As mentioned in Section 2.1, red and blue parts of an image are strong reflections. In general, the more reflections an image shows, the more interesting it is. The reflections build up shapes that represent different geological objects, and we learned that our image set has three different kinds of objects (Figure 3):

• channels,

• deposits,

• structural reflections.


(a) 3D volume illustration with stacked time slices. The time resolution of the original data set is 2 ms.

(b) Time slice where a channel can be seen. We work with this kind of image.

(c) Vertical slice of the 3D volume. Horizons are the red and blue lines running through the image horizontally.

Figure 2: 3D volume and 2D slices.

Of these, the last one requires additional explanation. Structural reflections sometimes occur when slicing the dataset horizontally where a density region border lies at an angle to the horizontal plane of the 3D volume, for example a bedrock slope. This means that structural reflections are only artificial, not real geological objects, and are only present in the 2D slices. They are very difficult to tell apart from other objects by looking at their shape. The outlining done in Figure 3(b) is made by an expert. An ideal seismic image analysis tool should manage to ignore structural reflections.


(a) Image with three different object types from our dataset.

(b) Objects outlined. Red: deposit, green: structural reflection, blue: channel.

Figure 3: Object types.

3 Methodology

This project does not build on any previous work in this area, so we build our tool using well-known image analysis algorithms programmed in high-level languages. In general, the following is our method of development:

1. Use MATLAB and its Image Processing Toolbox to try different image analysis methods. The Image Processing Toolbox offers a comprehensive set of image analysis functions and allows for a very quick startup phase in the development process.

2. When we feel confident that we have found a working method, the MATLAB code is migrated to C# code. For image processing and analysis we use EmguCV, which is a .NET wrapper to the Intel OpenCV image processing library (for C/C++); a minimal sketch of this wrapper style follows this list. While searching for suitable image processing libraries we found that OpenCV is, arguably, a de facto standard in image processing and computer vision implementations. EmguCV/OpenCV does not actually do everything MATLAB does, which forces us to implement many algorithms ourselves.

The choice of C# (instead of, e.g., C) is solely because Schlumberger's software platform Petrel nicely allows for .NET plugins or apps. While our tool is not a Petrel application, this was considered at an early stage. Furthermore, making it C# will facilitate any attempts to integrate it within Petrel in the future.

3. Final evaluation of results and computation times. Even though speed is not a part of our goals, the problem formulation has an inherent speed concern: the tool needs to outperform human screening.
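As a minimal sketch of the EmguCV wrapper style referred to in step 2 (not the project's actual code; the file name and threshold value are placeholders, the threshold echoing Figure 5):

    using Emgu.CV;
    using Emgu.CV.Structure;

    class ThresholdDemo
    {
        static void Main()
        {
            // Load a grayscale slice and binarize it; an im2bw-style
            // MATLAB operation expressed through the EmguCV wrapper classes.
            var slice = new Image<Gray, byte>("timeslice.png");
            Image<Gray, byte> binary = slice.ThresholdBinary(new Gray(167), new Gray(255));
            binary.Save("timeslice_binary.png");
        }
    }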

4 Theory

This section aims to explain certain image analysis concepts and methods usedin the implementation.


4.1 Objects

We often describe a binary image (i.e. an image with only two colors: black and white) in terms of objects and background. An object in this context is defined as a connected white area, while all black pixels are considered part of the background. An example of this concept is shown in Figure 4.

Figure 4: A binary image with 3 objects.
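The report does not show how objects are extracted from a binary image; a standard approach is connected-component labeling. A minimal flood-fill sketch (our own illustration, representing the binary image as a bool[,]):

    using System.Collections.Generic;

    static class Labeling
    {
        // Label 8-connected white regions of a binary image; 0 = background.
        public static int[,] LabelObjects(bool[,] white)
        {
            int h = white.GetLength(0), w = white.GetLength(1), next = 0;
            var labels = new int[h, w];
            for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
            {
                if (!white[y, x] || labels[y, x] != 0) continue;
                // Flood-fill a new object starting from this unlabeled white pixel.
                var stack = new Stack<(int y, int x)>();
                labels[y, x] = ++next;
                stack.Push((y, x));
                while (stack.Count > 0)
                {
                    var (cy, cx) = stack.Pop();
                    for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                    {
                        int ny = cy + dy, nx = cx + dx;
                        if (ny < 0 || nx < 0 || ny >= h || nx >= w) continue;
                        if (white[ny, nx] && labels[ny, nx] == 0)
                        {
                            labels[ny, nx] = next;
                            stack.Push((ny, nx));
                        }
                    }
                }
            }
            return labels;
        }
    }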

4.2 Threshold

Threshold is the most common way to select parts of an image. In a normal 8-bit grayscale image, each pixel has a brightness value between 0 and 255. When we use a threshold, we simply select those pixels whose brightness value is larger than a certain threshold value. We then create a new binary image where the selected pixels are set to white and the rest to black. An example of this technique with different threshold values is shown in Figure 5.

By using color space transformations, this can be extended to a more general concept. If the image is converted to the HLS (Hue-Lightness-Saturation) color space, the saturation and hue are also each represented as a grayscale image. So by applying a threshold to, for instance, the saturation channel, we can select saturated areas.
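A sketch of this generalized threshold in EmguCV (the sensitivity value is an illustrative assumption; in OpenCV's HLS layout the channels come out in hue, lightness, saturation order):

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class SaturationSelect
    {
        // Select strongly saturated areas by thresholding the saturation
        // channel of an HLS-converted image.
        public static Image<Gray, byte> SaturatedAreas(Image<Bgr, byte> input, byte minSaturation)
        {
            Image<Hls, byte> hls = input.Convert<Hls, byte>();
            Image<Gray, byte>[] channels = hls.Split();   // [0]=hue, [1]=lightness, [2]=saturation
            return channels[2].ThresholdBinary(new Gray(minSaturation), new Gray(255));
        }
    }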

4.3 Closing operation

A closing operation is defined by two consecutive operations: a dilation followed by an erosion. The purpose of this operation is to connect neighbouring white areas in binary images while keeping as much of the original shape as possible.

The dilation step extends the borders of white areas to connect neighbouring areas. The erosion step peels off the outer border of the white areas in the same way they were dilated, restoring the original shape while keeping the new connections between areas. The process is demonstrated in Figure 6.
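In EmguCV the two steps can be chained directly; a sketch assuming the library's default 3x3 structuring element:

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class Morphology
    {
        // Morphological closing: dilate to bridge gaps up to roughly
        // 2*iterations pixels wide, then erode to restore the outline.
        public static Image<Gray, byte> Close(Image<Gray, byte> binary, int iterations)
        {
            return binary.Dilate(iterations).Erode(iterations);
        }
    }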


(a) Original image (b) Result of a threshold at 167

(c) Result of a threshold at 200

Figure 5: Example of threshold technique

4.4 Bounding box

The bounding box of an object is the smallest rectangular box (with a vertical-horizontal alignment) that can be drawn to encapsulate every pixel of the object. An example of this is shown in Figure 7.
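A minimal sketch of the computation (our own illustration; it assumes the object contains at least one white pixel):

    static class BoundingBox
    {
        // Smallest axis-aligned rectangle enclosing all white pixels,
        // returned as (minX, minY, maxX, maxY).
        public static (int minX, int minY, int maxX, int maxY) Compute(bool[,] white)
        {
            int h = white.GetLength(0), w = white.GetLength(1);
            int minX = w, minY = h, maxX = -1, maxY = -1;
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    if (white[y, x])
                    {
                        if (x < minX) minX = x;
                        if (x > maxX) maxX = x;
                        if (y < minY) minY = y;
                        if (y > maxY) maxY = y;
                    }
            return (minX, minY, maxX, maxY);
        }
    }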

4.5 Skeleton representation

A skeleton representation is a way of representing objects with thin lines, where the lines represent the inner structure of the object. The method used in the implementation works by iteratively finding and removing pixels that are found to be non-skeleton points, i.e. pixels that are not important for the inner structure. An example of a skeleton representation is shown in Figure 8.
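The report does not name its thinning algorithm. As a sketch of the "iteratively remove non-skeleton points" idea, here is the classic Zhang-Suen thinning scheme (our substitution, operating on a bool[,] binary image where white = true):

    using System.Collections.Generic;
    using System.Linq;

    static class Thinning
    {
        // Zhang-Suen thinning: repeatedly delete border pixels that are not
        // needed for connectivity until no pixel changes.
        public static void Skeletonize(bool[,] img)
        {
            int h = img.GetLength(0), w = img.GetLength(1);
            bool changed = true;
            while (changed)
                changed = Pass(img, h, w, firstPass: true) | Pass(img, h, w, firstPass: false);
        }

        static bool Pass(bool[,] img, int h, int w, bool firstPass)
        {
            var remove = new List<(int y, int x)>();
            for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
            {
                if (!img[y, x]) continue;
                // Neighbours p2..p9, clockwise, starting at the pixel above.
                bool[] p = { img[y-1,x], img[y-1,x+1], img[y,x+1], img[y+1,x+1],
                             img[y+1,x], img[y+1,x-1], img[y,x-1], img[y-1,x-1] };
                int b = p.Count(v => v);                  // white neighbours
                int a = 0;                                // black-to-white transitions
                for (int i = 0; i < 8; i++)
                    if (!p[i] && p[(i + 1) % 8]) a++;
                bool c1 = firstPass ? !(p[0] && p[2] && p[4]) : !(p[0] && p[2] && p[6]);
                bool c2 = firstPass ? !(p[2] && p[4] && p[6]) : !(p[0] && p[4] && p[6]);
                if (b >= 2 && b <= 6 && a == 1 && c1 && c2) remove.Add((y, x));
            }
            foreach (var (y, x) in remove) img[y, x] = false;
            return remove.Count > 0;
        }
    }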

4.6 Constrained distance transform

The distance transform is an operation which outputs an image where each object pixel value equals the distance between that pixel and the closest background pixel in the original image. In a constrained distance transform, the distance may only be calculated over certain areas.

A special case of this is used in the implementation, where distances are only calculated along a line, starting from one of its endpoints. The result is a numbered line where each value represents the number of pixels between that pixel and the starting point. An example of this is shown in Figure 9.
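A sketch of this special case: a breadth-first search that counts 8-connected pixel steps from a chosen endpoint, visiting white (line) pixels only. This is our own minimal reading of the operation, not the project's code:

    using System.Collections.Generic;

    static class ConstrainedDistance
    {
        // Geodesic distance transform: BFS from a start pixel, where steps
        // are only allowed across white pixels; -1 marks unreachable pixels.
        public static int[,] FromPoint(bool[,] line, int startY, int startX)
        {
            int h = line.GetLength(0), w = line.GetLength(1);
            var dist = new int[h, w];
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    dist[y, x] = -1;
            var queue = new Queue<(int y, int x)>();
            dist[startY, startX] = 0;
            queue.Enqueue((startY, startX));
            while (queue.Count > 0)
            {
                var (cy, cx) = queue.Dequeue();
                for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                {
                    int ny = cy + dy, nx = cx + dx;
                    if (ny < 0 || nx < 0 || ny >= h || nx >= w) continue;
                    if (line[ny, nx] && dist[ny, nx] == -1)
                    {
                        dist[ny, nx] = dist[cy, cx] + 1;
                        queue.Enqueue((ny, nx));
                    }
                }
            }
            return dist;
        }
    }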


(a) Input image with three disconnected white areas

(b) Image after dilation. The white areas have been connected

(c) Image after erosion. Notice that the white areas are connected while the major shape of the original areas is preserved

Figure 6: Steps for the closing operation

4.7 Elongatedness measure

The elongatedness measure is a way of differentiating between elongated shapes and more compact, circular shapes. It is defined by Sonka, Hlavac and Boyle (2008, 355) as

\[
  \mathrm{elongatedness} = \frac{\mathrm{area}}{(2d)^2},
\]


(a) Original image (b) Image with bounding box in red

Figure 7: Example of a bounding box

(a) Original image (b) Skeleton representation

Figure 8: Example of a skeleton representation

where d is the number of erosions that can be applied to an object before it disappears (see Section 4.3). Since d measures the maximum thickness of a shape, the elongatedness measure gives the ratio of the area to the squared maximum thickness of a shape. It is very useful since it works for any general shape.
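A direct EmguCV sketch of this measurement (our own illustration; it assumes the default 3x3 structuring element, a non-empty object mask, and that CountNonzero returns one count per channel):

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class Elongatedness
    {
        // elongatedness = area / (2d)^2, where d is the number of erosions
        // needed to make the object vanish (its maximum half-thickness).
        public static double Measure(Image<Gray, byte> objectMask)
        {
            double area = objectMask.CountNonzero()[0];
            int d = 0;
            Image<Gray, byte> current = objectMask;
            while (current.CountNonzero()[0] > 0)
            {
                current = current.Erode(1);
                d++;
            }
            return area / (4.0 * d * d);   // (2d)^2 = 4d^2
        }
    }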

5 Implementation

The input image used to demonstrate the steps of the program is shown in Figure 10. The main algorithm consists of the following steps.

1. Mark interesting areas.

2. Merge areas into objects.

3. Remove uninteresting objects.

4. Classify objects.


(a) Input image with a branching line and a blue pixel as a starting point

(b) Result of constrained distance transform

Figure 9: Example of constrained distance transform

5. Vector representation.

The algorithm uses four user-defined search parameters, which make it possible to adapt the algorithm to different kinds of images:

• A search criterion which defines what to look for.

• A sensitivity value for the search criterion.

• A merging parameter which defines how dense the interesting areas are.

• A size limit that defines how small objects we are interested in.

The main steps of the algorithm are described below.

5.1 Mark interesting areas

In the first step, interesting areas are marked based on a search parameter. This parameter consists of two parts: one or more search criteria, combined with a sensitivity value for each. The search criterion in the given example is areas of blue, red and yellow color. This is achieved by converting the image to an appropriate color space and selecting areas in different information channels. This method gives the algorithm a lot of possibilities due to the wide range of color space transformations available. Examples of possible search options could be areas with a certain color, saturated or unsaturated areas, dark or bright areas, or any combination of the above. The sensitivity parameter for each channel is implemented through a simple binary threshold value (see Section 4.2).

The output of this step is a black and white image where white areas are considered interesting (see Figure 11).
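A sketch of how several such criteria can be combined in EmguCV; the choice of channels and the two sensitivity thresholds are illustrative assumptions, not the project's values:

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class MarkAreas
    {
        // Step 1 sketch: mark pixels that are strongly saturated OR bright,
        // then take the union of the two criteria.
        public static Image<Gray, byte> Mark(Image<Bgr, byte> input,
                                             byte satSensitivity, byte lightSensitivity)
        {
            Image<Gray, byte>[] hls = input.Convert<Hls, byte>().Split();
            Image<Gray, byte> saturated = hls[2].ThresholdBinary(new Gray(satSensitivity), new Gray(255));
            Image<Gray, byte> bright = hls[1].ThresholdBinary(new Gray(lightSensitivity), new Gray(255));
            return saturated.Or(bright);   // union of the two criteria
        }
    }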

5.2 Merge areas into objects

In this step interesting areas are combined into objects, where an object in this case means a connected white area (see Section 4.1). The goal is for these objects to represent real-life objects, so that no separate objects are connected to each other and no single object is divided into two separate objects. This is, however, in some cases impossible, since different objects often overlap and share the same properties (see Section 4.3). The result is therefore often a compromise between the two. Possible solutions to this problem are discussed in Section 7.

Figure 10: Original image. We want to find and mark the channel and the deposit.

The technique used to merge areas is a closing operation (see Section 4.3). The size of the closing operation is controlled by a search parameter which defines the minimum distance between two areas to be considered part of the same object. The output of this step is a binary image consisting of objects and background (see Section 4.1); see Figure 12.
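As a sketch, the closing from Section 4.3 can be sized from this parameter; the halving below is our own reading, since a dilation of n iterations grows each area by roughly n pixels on each side:

    using System;
    using Emgu.CV;
    using Emgu.CV.Structure;

    static class MergeAreas
    {
        // Step 2 sketch: close the marked image so that areas closer than
        // minDistance pixels become one object (hypothetical parameter name).
        public static Image<Gray, byte> Merge(Image<Gray, byte> marked, int minDistance)
        {
            int iterations = (int)Math.Ceiling(minDistance / 2.0);
            return marked.Dilate(iterations).Erode(iterations);
        }
    }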

5.3 Classifying objects

In this step we perform some measurements on each object, which are used to remove uninteresting objects and classify the remaining ones. Currently we have two measurements: area and elongatedness (see Section 4.7).

The area measurement is used for removing objects considered too small and is controlled by a search parameter which defines a minimum area for interesting objects. One could also consider using more measurements to define interesting objects, such as length or position, or base it on the classification.

The classification is based on the elongatedness measurement, which differentiates channels from more round objects such as deposits. The classification uses a threshold value: objects with an elongatedness over 10 are considered channels, and objects below 10 non-channels. The value 10 was chosen by comparing the value for several different shapes and choosing the one that best agreed with our own observations. A lot of improvement is possible in this area and is discussed further in Section 7. The output of this step is a list of interesting objects with measurements and classification results (see Figure 13).

Figure 11: Output of step 1. Interesting areas are marked in white.

5.4 Vector representation

In this step each object is processed individually based on its classification. Currently only objects classified as channels go through individual processing. For these channels a vector representation is created, which has many potential benefits. Currently we use this to calculate the length of a channel, but since we have a mathematical representation of the channel's path, many more advanced measurements are possible. We could for instance calculate whether it is straight or curved and how it bends.

Another important possibility is related to the segmentation problem of dividing objects described in Section 5.5. With the vector representation we can see the direction of a channel's path, and by using that we could automatically connect accidentally disconnected channel segments.

The vector representation should have the following properties:

• Runs along the entire length of the channel.

• Is centered in the channel in each point.

• Has a variable detail level, controlled by a parameter.


Figure 12: Output of step 2. Interesting areas are merged into objects.

This is accomplished through the following steps.

5.4.1 Skeletonization

The first step is to create a skeleton representation of the object (see Section 4.5). The resulting image has a centerline running through the object, along with a number of branches (see Figure 14).

5.4.2 Path selection

The goal in this step is to select only the centerline of the skeleton, which is accomplished by making two assumptions:

1. One of the endpoints of the centerline touches the bounding box of the object (see Section 4.4).

2. The centerline is the longest possible line in the skeleton that satisfies assumption 1.

Although an implementation without the need for touching the bounding box could be accomplished, it would make the algorithm considerably slower and was therefore not implemented. These assumptions are clearly not accurate for all types of possible elongated shapes, but are accurate enough when considering real-life channels. Using these assumptions, we find the centerline from the skeleton in the following way:


Figure 13: Output of step 3. Uninteresting objects are removed and remaining objects are classified.


Figure 14: Skeleton representation of a channel.

1. Find all points connected to the bounding box of the object.

2. Calculate a constrained distance transform image for each point found in step 1 (see Section 4.6).

3. Select the point with the highest value found in the distance transform images.

4. Trace the path from the highest value back to zero.

The traced path will be the assumed centerline of the object (see Figure 15).
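Putting the four steps together, reusing the FromPoint sketch from Section 4.6 (our own illustration; it assumes at least one skeleton pixel touches the bounding box):

    using System.Collections.Generic;

    static class PathSelection
    {
        public static List<(int y, int x)> Centerline(bool[,] skeleton,
                                                      List<(int y, int x)> borderPoints)
        {
            int h = skeleton.GetLength(0), w = skeleton.GetLength(1);
            int[,] best = null;               // distance image of the winning start point
            int bestValue = -1;
            (int y, int x) end = (0, 0);      // farthest skeleton pixel found so far
            foreach (var p in borderPoints)
            {
                int[,] dist = ConstrainedDistance.FromPoint(skeleton, p.y, p.x);
                for (int y = 0; y < h; y++)
                    for (int x = 0; x < w; x++)
                        if (dist[y, x] > bestValue) { bestValue = dist[y, x]; best = dist; end = (y, x); }
            }
            // Trace back from the farthest point: repeatedly step to a
            // neighbour with a strictly smaller distance until zero is reached.
            var path = new List<(int y, int x)> { end };
            var (cy, cx) = end;
            while (best[cy, cx] > 0)
            {
                int by = cy, bx = cx;
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                    {
                        int ny = cy + dy, nx = cx + dx;
                        if (ny < 0 || nx < 0 || ny >= h || nx >= w) continue;
                        if (best[ny, nx] >= 0 && best[ny, nx] < best[by, bx]) { by = ny; bx = nx; }
                    }
                cy = by; cx = bx;
                path.Add((cy, cx));
            }
            return path;
        }
    }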

5.4.3 Path sampling

The final step is to sample the centerline of the object to create a simplified vector representation. This is done by selecting points from the centerline and measuring the error between the curves. The error is defined as the relative difference between the length of the original line and the length of the vector representation. More points are then added from the centerline until the error is below a threshold.

To improve the result, two accuracy limits were used: one for normal and one for short curve segments. This gives a total of three parameters to control the detail level of the vector representation: accuracy for normal segments, accuracy for short segments, and a length threshold for what is considered a short segment. Although it would be possible to control all of these manually, we decided to simplify the usage by introducing a single parameter, ranging between 0.8 and 1, which controls the other three.

Figure 15: Assumed centerline.

An example of a final vector representation is shown in Figure 16.
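The report does not state how the extra points are chosen; a minimal sketch where the largest remaining index gap is split at its midpoint, and where the single control parameter is read as the required ratio (0.8 to 1) between the polyline length and the full centerline length:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class PathSampling
    {
        static double Length(IList<(int y, int x)> pts)
        {
            double len = 0;
            for (int i = 1; i < pts.Count; i++)
            {
                double dy = pts[i].y - pts[i - 1].y, dx = pts[i].x - pts[i - 1].x;
                len += Math.Sqrt(dy * dy + dx * dx);
            }
            return len;
        }

        // Keep adding centerline points until the polyline reaches the
        // required fraction ('accuracy', e.g. 0.95) of the full length.
        public static List<(int y, int x)> Sample(List<(int y, int x)> centerline, double accuracy)
        {
            double target = Length(centerline);
            var indices = new List<int> { 0, centerline.Count - 1 };
            while (true)
            {
                var poly = indices.Select(i => centerline[i]).ToList();
                if (target == 0 || Length(poly) >= accuracy * target) return poly;
                // Refine where the sampling is coarsest: split the widest
                // index gap at its midpoint (our own insertion rule).
                int gapAt = 1, gap = 0;
                for (int i = 1; i < indices.Count; i++)
                    if (indices[i] - indices[i - 1] > gap) { gap = indices[i] - indices[i - 1]; gapAt = i; }
                if (gap < 2) return poly;   // every centerline point already used
                indices.Insert(gapAt, indices[gapAt - 1] + gap / 2);
            }
        }
    }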

5.5 Separation problem

The most difficult problem of the algorithm appears in Section 5.2, the merging of areas into objects. The different objects we wish to distinguish between usually share the same appearance and properties, which makes a direct classification impossible. This is further complicated by the fact that they often overlap, see Figure 17. Different objects are sometimes more connected to each other than the parts of a single object are, which makes a perfect separation based on connectedness impossible as well.

The issue comes from the fact that we are looking at 2D slices from a complete 3D volume, which sometimes makes it hard even for a person to distinguish between different objects. Possible solutions for this problem are discussed in Section 7.


Figure 16: Channel object with corresponding vector representation.

6 Results

In this section we consider the output of two different types of images. The first we call "main images" because the program development is almost exclusively based on this type. The other image type is discussed in "Other images" (Section 6.2), and we use it to demonstrate some flexibility of the tool. Last, a discussion about speed is included.

6.1 Main images

To evaluate the results of the program, we start with one of the nine images from Schlumberger. Because of the separation problem (Section 5.5) we use two versions of this image. The first version is the unmodified image (Figure 18(b)). In the other version we have disconnected the connected objects by hand (Figure 18(a)).

We see that in the modified image, objects are classified correctly. There are a channel object and a deposit object.

The channel and the deposit are connected to each other in the original image, where the connection region is as wide as the channel itself. This is a segmentation problem that requires a much more complex solution than we have implemented. We discuss possible improvements in Section 7.

Even though the deposit and the channel are segmented as one object, the vectorization step finds the right channel path. This can be considered a coincidence, because the correct channel path just happens to be slightly longer than the second longest endpoint-to-endpoint path. Unfortunately this cannot be guaranteed to hold in the general case.

Figure 17: Illustration of the separation problem. The channel (orange arrow) is strongly connected to the deposit (blue arrow), which is further connected to the structural reflections (green arrows).

6.2 Other images

Here we try another image (Figure 19) with a visible channel. The common feature between this image and the main image type is that objects of interest have distinctly different color intensity compared to background objects. Notable differences, on the other hand, are color space, contrast and size. By changing the parameters of the algorithm, this image can also be nicely segmented and classified for objects.

6.3 Speed

In the long run, beyond our prototype program, speed is of utmost importance. Due to time constraints we have not optimized our code with respect to speed, but we present some figures here to show that automatic segmentation and classification can become a huge benefit in the future.

The computations are done on an Athlon X2 2.0 GHz (2005) CPU, and the code is not in any way optimized for this CPU type. A single original-size (557x475 pixels) image takes up to 10 s to process if there are objects present.


(a) Modified image where the different object types have been separated.

(b) Unmodified image where all objects are connected.

(c) Object 1 Class: channel, Elongatedness: 18.4, Size: 43 · 10³, Length: 813. Object 2 Class: deposit, Elongatedness: 5.2, Size: 14 · 10³, Length: 813.

(d) Channel and deposit are segmented as one image object. Note that the vector representation is the same as in (a) because of the longest path between endpoints.

Figure 18: Result images showing segmentation, vector representation, classification and measurements.

On the other hand, an "empty" image takes only a fraction of a second to process (see Figure 20).

We note that the images we are working with are high resolution compared to the amount of information (useful for us) they contain. As the computation time is linear in the image's pixel count (see Table 1), we can make use of downscaling to gain speed. As an example, our main image (Figure 18) segments the same even when downscaled to 25% of its original size. A downscaling step would, however, have to be rather dynamic, because the widths of channels vary greatly.

Finally, we state that these speed results show an obvious advantage of using machines instead of humans for image screening. There are also many options to optimize the performance of our implementation even further, e.g. optimizing for the CPU, optimizing the image processing algorithms, and even changing to a more modern computer.


(a) Input image. Alternative image type from Schlumberger.

(b) Output image. Class: channel, Elongatedness: 28.4, Size: 35 · 10³, Length: 1040

Figure 19: Results for the alternative image type.

(a) 4.5 s to process this image with objects present.

(b) 0.3 s to process an image with nothing to classify.

Figure 20: Computation times for images with and without objects.

7 Further development

This chapter discusses some ideas we have for further development, some of the issues we have not yet been able to solve, and our ideas for solving them.


Resolution (pixels)   Relative size (%)   Time (s)
278x237               25                  ≈ 1
417x356               50                  ≈ 2
557x475               100                 ≈ 4-5
1114x950              200                 ≈ 15-18

Table 1: Computation time is linear in pixel count.

7.1 3D

Our program is developed to work on 2D data slices from a large 3D data set, handling one image at a time, segmenting and classifying the objects we think are interesting, and marking them before moving on to the next one. This process is repeated for every slice in the dataset until the whole volume is analysed.

The same methods that are used to analyse the 2D slices could be generalised to work on the 3D volume. The reason for doing the analysis in 2D in the first place is inherited from manual analysis: it is difficult for humans to see through a 3D volume, so the volume is sliced to allow it to be properly examined. This limitation does not apply to computers.

We think that analysing the data set in 3D is preferable to 2D for several reasons. The main reason is that having the whole 3D object gives us much more information and would render otherwise quite difficult operations unnecessary.

When handling 2D slices we have to take into account that the seismological objects we are looking for are not flat but stretched over several time slices, making it necessary to track the objects through the dataset in order to determine how many different objects are actually present in the volume.

Using 3D, this would become unnecessary, as each object would be segmented as a whole from the start. Having access to full 3D objects could also be helpful when trying to differentiate between different geological objects that are connected to each other as image objects for some reason.

We discussed in Section 6.1 the problem of having two different kinds of objects closely connected, with no image information that clearly separates them (Figure 17). This connection, however, is not present in all of the images in our dataset, meaning that a human analyst can say with relative certainty that those are two separate objects. If we examine the 3D mock-up (Figure 21), created by interpolating our nine-time-slice dataset to give it a bit more volume, we can quite easily see that the deposit, even though it is connected to the channel, is probably not a part of the channel. More importantly, it will be easier for the computer to make this discovery. This kind of interpolation would of course be unnecessary if the whole 3D volume were used.

Structural reflections are a major problem when processing and analysing 2D slices (see Section 2.3); handling the data volume in 3D would ease this problem.

7.2 Use more data

One major difficulty we have faced during the development of this program is lack of data. Due to restrictive security classification of the data volumes, we were only allowed to work with nine time slices of somewhat preprocessed images (the acoustic amplitude reflection information channel was chosen from the dataset).

Figure 21: 3D mock-up of the channel and the deposit. In this image it is possible to see that the deposit (blue area on the right) is probably not a part of the channel.

If the program could use all attributes and information channels from the dataset, we think that both segmentation and classification could be made much more accurate. Comparisons between, as well as combinations of, different information channels would be possible. This information could then be used to discover and remove noise, and to help separate the kinds of objects that we are not able to separate in this version of the program.

7.3 Complementing segmentation methods

In this version of the program, we employ thresholding in the acoustic amplitude information channel as the main segmentation method, but thresholding is not the only segmentation method we have considered. It is by far the most effective method for the images in our dataset, though, which is why it was chosen. There might, however, be characteristics in other information channels and images where it is desirable to employ other segmentation methods to complement thresholding. Two additional methods that are often used are edge detection and texture analysis (Gonzales, Woods 2008, 447-450, 675-679). Both of these methods should be considered again when implementing further developments.

7.4 Employ an advanced classification tool

Our prototype program uses a very simple classifier. First, all objects that are too small to be of interest are removed. The remaining objects are then classified based on how elongated they are: if an object is elongated enough, it is classified as a channel. This simple classifier is a consequence of our very limited dataset; implementing an advanced classifier based solely on nine images was never considered. We have also held back on the number of geometric measurements we take in the segmented image, as we do not yet know which measurements are actually useful, and it will be easy to add them later.


Principal component analysis (PCA) can be used in conjunction with a suitable classifier to create a powerful classification tool. PCA transforms the parameter space into an eigenvector space based on maximum variance along each eigenvector. This transformation reduces the number of dimensions to consider when classifying a dataset, making classification easier and faster (Jolliffe 2002, 1-6).

We mentioned a "suitable" classifier above instead of naming one. This is because different methods are suitable for different types of data. Before a classifier is chosen, we need to know very well how the measured data is distributed. A supervised classifier is probably the best choice, though, as it can be tailored to the data. This means that the classifier needs to be taught how to behave, and to that end a large quantity of data is needed.

7.5 Parallelization and optimization

At the moment it takes about 0.2-10 seconds to process an image on our test machine (see Section 6.3). This is a long time if you intend to process thousands of pictures, so, as in many applications, parallelization is necessary. If the program is to continue processing 2D time slices, parallelization could be as simple as dividing the images between threads, since each image can be processed individually.
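A minimal sketch of such embarrassingly parallel batch processing in .NET (the directory layout, file pattern and ProcessImage are placeholders, not part of the project):

    using System.IO;
    using System.Threading.Tasks;

    static class BatchProcessing
    {
        // Every time slice is independent, so the images can simply be
        // divided between threads by the runtime.
        public static void ProcessAll(string inputDirectory)
        {
            Parallel.ForEach(Directory.EnumerateFiles(inputDirectory, "*.png"),
                             path => ProcessImage(path));
        }

        static void ProcessImage(string path)
        {
            /* run steps 1-5 of Section 5 on one image */
        }
    }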

The code we have written is not optimized for speed, meaning that there is much room for improvement that could reduce the processing time considerably.

8 Conclusion

We have met the goals we set at the beginning of this project. We have created a prototype program that can segment, detect and, to some degree, classify interesting objects in seismic images. The program is tuned to work on color-coded acoustic amplitude reflection images but can be adapted to different images by changing a few parameters.

Two major concerns remain that we have not been able to solve within the scope of this project: the separation of geological objects that should not be connected, and an advanced classification tool. The program is by no means a complete analysis tool and there is a lot of room for further development. We have given a few suggestions to this end in this report.

9 References

Satterfield, Dorothy and Whiteley, Martin. Seismic Interpretation and Subsurface Mapping. http://www.derby.ac.uk/files/seismic interpretation.ppt (January 12, 2012).

Gonzales, Rafael C. and Woods, Richard E. 2008. Digital Image Processing. 3rd Edition. New Jersey: Pearson Education Inc.

Jolliffe, I.T. 2002. Principal Component Analysis. 2nd Edition. New York: Springer-Verlag New York Inc.

Sonka, Milan, Hlavac, Vaclav and Boyle, Roger. 2008. Image Processing, Analysis, and Machine Vision. 3rd Edition. Stamford: Cengage Learning.
