automatedsegmentationofcolorectaltumorin3dmriusing...

Research ArticleAutomated Segmentation of Colorectal Tumor in 3D MRI Using3D Multiscale Densely Connected Convolutional Neural Network

MumtazHussainSoomro 1MatteoCoppotelli1 SilviaConforto 1MaurizioSchmid 1

Gaetano Giunta 1 Lorenzo Del Secco2 Emanuele Neri 2 Damiano Caruso 3

Marco Rengo 3 and Andrea Laghi 3

1Department of Engineering University of Roma Tre Via Vito Volterra 62 00146 Rome Italy2Department of Radiological Sciences University of Pisa Via Savi 10 56126 Pisa Italy3Department of Radiological Sciences Oncology and Pathology University La Sapienza AOU SantrsquoAndreaVia di Grottarossa 1035 00189 Rome Italy

Correspondence should be addressed to Mumtaz Hussain Soomro mumtazhussainsoomrouniroma3it and Gaetano Giuntagaetanogiuntauniroma3it

Received 23 November 2018 Revised 5 January 2019 Accepted 13 January 2019 Published 31 January 2019

Guest Editor Yuri Levin-Schwartz

Copyright copy 2019 Mumtaz Hussain Soomro et al is is an open access article distributed under the Creative CommonsAttribution License which permits unrestricted use distribution and reproduction in anymedium provided the original work isproperly cited

emain goal of this work is to automatically segment colorectal tumors in 3D T2-weighted (T2w)MRI with reasonable accuracyFor such a purpose a novel deep learning-based algorithm suited for volumetric colorectal tumor segmentation is proposed eproposed CNN architecture based on densely connected neural network contains multiscale dense interconnectivity betweenlayers of fine and coarse scales thus leveraging multiscale contextual information in the network to get better flow of informationthroughout the network Additionally the 3D level-set algorithm was incorporated as a postprocessing task to refine contours ofthe network predicted segmentation e method was assessed on T2-weighted 3D MRI of 43 patients diagnosed with locallyadvanced colorectal tumor (cT3T4) Cross validation was performed in 100 rounds by partitioning the dataset into 30 volumesfor training and 13 for testingree performance metrics were computed to assess the similarity between predicted segmentationand the ground truth (ie manual segmentation by an expert radiologistoncologist) including Dice similarity coefficient (DSC)recall rate (RR) and average surface distance (ASD) e above performance metrics were computed in terms of mean andstandard deviation (meanplusmn standard deviation) e DSC RR and ASD were 08406plusmn 00191 08513plusmn 00201 and26407plusmn 27975 before postprocessing and these performancemetrics became 08585plusmn 00184 08719plusmn 00195 and 25401plusmn 2402after postprocessing respectively We compared our proposed method to other existing volumetric medical image segmentationbaseline methods (particularly 3D U-net and DenseVoxNet) in our segmentation tasks e experimental results reveal that theproposed method has achieved better performance in colorectal tumor segmentation in volumetric MRI than the otherbaseline techniques

1 Introduction

Colon and rectum are fundamental parts of the gastrointestinal(GI) or digestive system e colon which is also called thelarge intestine starts from the small intestine and connects tothe rectum Its main function is to absorb minerals nutrientsandwater and remove waste from the body [1 2] According torecent cancer statistics colorectal cancer is diagnosed as thesecond leading cause of cancer death in the United States [3]

Nowadays magnetic resonance imaging (MRI) is themost preferable medical imaging modality in primary co-lorectal cancer diagnosis for radiotherapy treatment plan-ning [2 4 5] Usually the oncologist or radiologistdelineates colorectal tumor regions from volumetric MRIdata manually is manual delineation or segmentation istime-consuming and laborious and presents inter- andintraobserver variability erefore there exists a need forefficient automatic colorectal tumor segmentation methods

HindawiJournal of Healthcare EngineeringVolume 2019 Article ID 1075434 11 pageshttpsdoiorg10115520191075434

in clinical radiotherapy practices to segment the colorectaltumor from large volumetric data as this may save time andreduce human interventions In contrast to natural imagesmedical imaging is generally more chaotic as the shape ofthe cancerous regions may vary from slice to slice as shownin Figure 1 Hence automatic segmentation of the colorectaltumor is a very challenging task not only because its sizemay be very small but also because of its rather inconsistentbehavior in terms of shape and intensity distribution

Lately automatic segmentation of the colorectal tumorfrom volumetric MRI data based on atlas [6] and supervoxelclustering [7] has been presented with some good perfor-mance Newly deep learning-based approaches have beenwidely employed with impressive results in medical imagesegmentation [8ndash16] Trebeschi et al [8] have presented adeep learning-based automatic segmentation method tolocalize and segment the rectal tumor in multiparametricMRI by incorporating a fusion between T2-weighted (T2w)MRI and diffusion-weighted imaging (DWI) MRI Despitetheir method displaying good performance it is unclearwhether only T2w modality which provides more anatomyinformation than DWI modality could be useful for co-lorectal tumor segmentation Secondly they applied theirimplementation on 2D data as it is very common in realdata but medical data such as CT (Computed Tomography)and MRI are in 3D volumetric form e 2D convolutionalneural network (CNN) algorithms segment the volumetricMRI or CTdata in a slice-by-slice sequence [9ndash11] where 2Dkernels are used by aggregating axial coronal and sagittalplanes in a one-to-one association individually Althoughthese 2D CNN-based methods demonstrated great im-provement in segmentation accuracy [17] the inherent 2Dnature of the kernels limits their application when usingvolumetric spatial information Based on this consideration3D CNN-based algorithms [12ndash16] have been recentlypresented where 3D kernels are used instead of 2D to extractspatial information across all three volumetric dimensionsFor example Ccediliccedilek et al [12] proposed a 3D U-net volume-to-volume segmentation network that is an extension of the2D U-net [18] 3D U-net used dual paths an analysis pathwhere features are abstracted and a synthesis path orupsampling path where full resolution segmentation isproduced Additionally 3D U-net established shortcutconnections between early and later layers of the sameresolution in both the analysis and synthesis paths Chenet al [13] presented a voxel-wise residual network (Vox-ResNet) that is an extension of 2D deep residual learning[19] to 3D deep network VoxResNet provides a skip con-nection to pass features from one layer to the next layerEven if these 3D U-net and VoxResNet provide several skipconnections to make training easy the presence of these skipconnections creates a short path from the early layers to thelast one and this may end up transforming the net into a verysimple configuration with the unwanted additional burdenof producing a very high number of parameters to be ad-justed during training Huang et al [20] introduced aDenseNet that extends the concept of skip connections in[18 19] by constructing direct connections from every layerto the corresponding previous layers to ensure maximum

gradient flow between layers In [20] DenseNet was provenas an accurate and efficient method for the natural imageclassification Yu et al [16] proposed the densely connectedvolumetric convolutional neural network (DenseVoxNet)for volumetric cardiac segmentation which is an extended3D version of DenseNet [20] DenseVoxNet utilizes twodense blocks followed by pooling layers e first blocklearns high-level feature maps and the second block learnslow-level feature maps the latter is followed by a poolinglayer that further reduces the resolution of the learned high-level feature maps in the first block Finally the high-resolution feature maps are restored by incorporatingsome deconvolution layers In DenseVoxNet early layers ofthe first block learn fine-scale features (ie high-level fea-tures) based on small receptive field while coarse-scalefeatures (ie low-level features) are learned by later layersof the second block with a larger receptive field In shortfine-scale and coarse-scale features are learned in early andlater layers respectively and this may reduce the networkability to learn multiscale contextual informationthroughout network thus leading to suboptimal perfor-mance [21]

In this study a novel method to overcome the above-mentioned problems in 3D volumetric segmentation ispresented We propose a 3D multiscale densely connectedconvolutional neural network (3D MSDenseNet) a volu-metric network that is an extension of the recently proposed2D multiscale dense networks (MSDNet) for the naturalimage classification [22] In summary we have employed 3DMSDenseNet for the segmentation of the colorectal tumorwith the following contributions

(1) A multiscale training scheme with parallel 3Ddensely interconnected convolutional layers for two-dimensional depth and coarser scales is used wherelow- and high-level features are generated from eachscale individually A diagonal propagation layout isincorporated to couple the depth features with thecoarser features from the first layer thus maintaininglocal and global (multiscale) contextual informationthroughout the network to improve segmentationresults efficiently

(2) e proposed network is based on volume-to-volume learning and interference which eradicatescomputation redundancy

(3) e method is validated on colorectal tumor seg-mentation in 3D MR images and it has attainedoutperformed segmentation results in comparisonwith previous baseline methods From the encour-aging results obtained withMR images the proposedmethod could be applied for further applications ofmedical imaging

2 Methods

Figure 2 represents an overview of our proposed meth-odology We have extended the characterization of themultiscale densely connected network to colorectal tumorsegmentation with 3D volume-to-volume learning

2 Journal of Healthcare Engineering

fashion e network is divided into two paths depth pathand scaled path e depth path is similar to the densenetwork which extracts the ne-scale features with highresolution e scaled path is downsampled with a poolinglayer of power 2 In this path low-resolution features arelearned Furthermore ne-scale features from depth aredownsampled into coarse features via the diagonal pathshown in Figure 2 and concatenated to the output of theconvolution layer in the scaled path By doing this bothlocal and global contextual information is incorporated ina dense network

21 DenseNet Densely Connected Convolutional NetworkGenerally in feedforward CNN or ConvNet the output ofthe lth layer is represented as Xl which is obtained by

mapping a nonlinear transformation Hl from the output ofthe preceding layer Xlminus1 such that

Xl Hl Xlminus1( ) (1)

whereHl is composed of a convolution or pooling operationfollowed by a nonlinear activation function such as therectied linear unit (ReLU) or batch normalization-ReLU(BN-ReLU) Recent works in computer vision have shownthat a deeper network (ie with more layers) increasesaccuracy with better learning [20] However the perfor-mance of deeply modeled networks tends to decrease and itstraining accuracy is saturated with the network depth in-creasing due to the vanishingexploding gradient [20] LaterRonneberger et al [18] solved this vanishing gradientproblem in the deep network by incorporating skip

(a) (b) (c)

Figure 1 An illustration of colorectal tumor location intensity and size variation in a dierent slice of the same volume where thecancerous region is contoured by the red marker

ConcatenationC

3D Conv (1 times 1 times 1)

Upsampling

Softmax

Feature volume maps

Final output after postprocessing

by 3D level-set

Input volume

Softmaxclassification

layer

Upsampling

Layer 3

C

C C

Layer l

C

C

Layer 2

C

C C

Layer 1

C CC

C

CScaled

(S2)

Depth (S1)

Final prediction

H1l (middot)

H2l (middot)

~

H2l (middot)

~H 2

l ([X11 X1

2 X1l ])

H 2l ([X1

1 X12 X1

l ])

X2l =

3D Conv (3 times 3 times 3) + BN + ReLU

3D max pooling strided by power of 2

Identity

Figure 2 Block diagram of the proposed method

Journal of Healthcare Engineering 3

connection which propagates output features from layers ofthe same resolution in the contraction path to the outputfeatures from the layers in the expansion path Neverthelessthis skip connection allows the gradient to flow directly fromthe low-resolution path to high-resolution path whichmakes training easy but this generally produces an enor-mous feature channel in every layer and lead network toadjust a large number of parameters during training Toovercome this problem Huang et al [20] introduced adensely connected network (DenseNet) e DenseNet ex-tends the concept of skip connections by constructing adirect connection from every layer to its correspondingprevious layers to ensure maximum gradient flow betweenlayers In DenseNet featuremaps produced by the precedinglayer were concatenated as an input to the advanced layerthus providing a direct connection from any layer to thesubsequent layer such that

Xl Hl Xlminus1 Xlminus1 Xlminus1 X01113858 1113859( 1113857 (2)

where [middot middot middot] represents the concatenation operation In [20]DenseNet has emerged as an accurate and efficient methodfor the natural image classification Yu et al [16] proposeddensely connected volumetric convolutional neural network(DenseVoxNet) for volumetric cardiac segmentation whichis an extended 3D version of DenseNet [20]

22 Proposed Method (3D MSDenseNet) In 3D MSDense-Net we have two interconnected levels depth level andscaled level for simultaneous computation of high- andlow-level features respectively Let X1

0 be an original inputvolume and feature volume produced by layer l at scale sbe represented as Xs

l Considering two scales in the net-work (ie s1 and s2) we represent the depth level (hori-zontal path) and scaled level as s1 and s2 individually asshown in Figure 2 e first layer is an inimitable layerwhere the feature map of the very first convolution layer isdivided into respective scale s2 via pooling of stride ofpower 2 e high-resolution feature maps (X1

l ) in thehorizontal path (s1) produced at subsequent layers (lgt 1)are densely connected [20] However output feature mapsof subsequent layers in the vertical path (ie coarser scales2) are results of concatenation of transformed featuresmaps from previous layers in s2 and downsampled featuresmaps from previous layers of s1 propagated as the diagonalway as shown in Figure 2 In this way output features ofcoarser scale s2 at layer l in our proposed network can beexpressed as

X2l

1113957H2l X1

1 X12 X1

l1113858 1113859( 1113857

H2l X1

1 X12 X1

l1113858 1113859( 1113857

⎡⎢⎣ ⎤⎥⎦ (3)

where [middot middot middot] denotes the concatenation operator 1113957H2l (middot)

represents those feature maps from finer scale s1 which aretransformed by the pooling layer of stride of power 2 di-agonally (as shown in Figure 2) and H2

l (middot) indicates thosefeature maps from coarser scale s2 transformed by regularconvolution Here 1113957H

2l (middot) and H2

l (middot) have the same size offeature maps In our network the classifier only utilizes the

feature maps from the coarser scale at layer l for the finalprediction

23 Contour Refinement with 3D Level-Set Algorithm 3Dlevel-set based on the geodesic active contour method [23] isemployed as a postprocessor to refine the final prediction ofeach network discussed above 3D level-set adjusts thepredicted tumor boundaries by incorporating prior in-formation and a smoothing function is 3D level-setmethod identifies a relationship between computation ofgeodesic distance curves and active contours is re-lationship provides a precise detection of boundaries even inexistence of huge gradient disparities and gaps e level-setmethod based on the geodesic active contour is more elu-cidated with the mathematical derivations in [23 24] Inorder to simplify this algorithm let φ(Pl t 0) be a level-setfunction which is initialized with the provided initial surfaceat t 0 Here Pl is the probability map yielded by eachmethodis probability map Pl is employed as the startingsurface to initialize the 3D level-setereafter the evolutionof the level-set function regulates the boundaries of thepredicted tumor In the geodesic active contour the partialdifferential equation is incorporated to evolve the level-setfunction [23] such that

zφzt

αX Pl( 1113857 middot nablaφminus βY Pl( 1113857|nablaφ| + cZ Pl( 1113857κ|nablaφ| (4)

where X(middot) Y(middot) and Z(middot) denote the convection functionexpansioncontraction and spatial modifier(ie smoothing) functions respectively In addition α βand c are the constant scalar quantities e values of α βand c bring the change in the above functions behavior Forexample negative values of β lead the initial surface topropagate in the outward direction with a given speed whileits positive value conveys the initial surface towards theinward direction Evaluation of the level-set function is aniterative process therefore we have set the maximumnumber of iterations as 50 to stop the evolution process

3 Experimental Setup

31 Experimental Datasets e proposed method has beenvalidated and compared on T2-weighted 3D colorectal MRimages Data were collected from two different hospitalsnamely Department of Radiological Sciences Oncology andPathology University La Sapienza AOU SantrsquoAndrea Via diGrottarossa 1035 00189 Rome Italy and Department ofRadiological Sciences University of Pisa Via Savi 10 56126Pisa Italy MR data were acquired in a sagittal view on a30 Tesla scanner without a contrast agent e overalldataset consists of 43 volumes T2-weighted MRI and eachMRI volume consists of several slices which are varied innumber across subjects in the range 69sim122 and have di-mension as 512times 512times (69sim122) e voxel spacing wasvarying from 046times 046times 05 to 06times 06times12mmvoxelacross each subject As the data have a slight slice gap wedid not incorporate any spatial resampling e wholedataset was divided into training and testing sets for 100


repeated rounds of cross validation ie 30 volumes wereused for training and 13 for test until the combined resultshave given a numerically stable segmentation performancee colorectal MR volumes were acquired in a sagittal viewon a 30 Tesla scanner without a contrast agent All MRIvolumes went for preprocessing where they were normalizedso that they have zero mean and unit variance We croppedall the volumes with size of 195times114times 61mm Furthermoreduring training the data were augmented with randomrotations of 90deg 180deg and 270deg in the sagittal plane to enlargethe training data In addition two medical experts usingITK-snap software [25 26] manually segmented the co-lorectal tumor in all volumes ese manual delineations oftumors from each volume were then used as ground truthlabels to train the network and validate it in the test phase

32 Proposed Network Implementation Our network ar-chitecture is composed of dual parallel paths ie depth andscaled path as illustrated in Figure 2 which achieves 3Dend-to-end training by adopting the nature of the fullyconvolutional network e depth path consists of eighttransformation layers and the scaled path consists of ninetransformation layers In each path every transformationlayer is composed of a BN a ReLU followed by 3times 3times 3convolution (Conv) by following the similar fashion ofDenseVoxNet Furthermore a 3D upsampling block hasbeen utilized like DenseVoxNet Like DenseVoxNet theproposed network uses the dropout layer with a dropout rateof 02 after each Conv layer to increase the robustness of thenetwork against overfitting Our proposed method hasapproximately 07million as total parameters which is muchfewer than DenseVoxNet [16] with 18 million and 3DU-net[12] with 190 million parameters We have implementedour proposed method in the Caffe library [27] Ourimplementation code is available online at the Internet linkhttphostuniroma3itlaboratorisp4teteachingsp4bmedocumentscodemsdnzip

33 Networks Training Procedures All the networksmdash3DFCNNs [15] 3D U-net [12] and DenseVoxNet [16]mdashwereoriginally implemented in Caffe library [27] For the sake ofcomparison we have applied a training procedure which isvery similar to that utilized by 3D U-net and DenseVoxNet

Firstly we randomly initialized the weights with aGaussian distribution with μ 0 and σ 001 e stochasticgradient descent (SGD) algorithm [28] has been used torealize the network optimization We set the metaparametersfor the SGD algorithm to update the weights as batch size 4weight decay 00005 and momentum 005 We set theinitial learning rate at 005 and divided by 10 every 50 epochsSimilar learning rate policy in DenseVoxNet ie ldquopolyrdquo wasadopted for all the methods e ldquopolyrdquo-learning rate policychanges the learning rate over each iteration by following apolynomial decay where the learning rate is multiplied by theterm (1minus (iterationmaximum_iterations))power [29] wherethe term power was set as 09 and 40000 maximum iterationsMoreover to ease GPU memory the training volumes werecropped randomly with subvolumes of 32times 32times 32 voxels as

inputs to the network and the major voting strategy [30] wasincorporated to obtain final segmentation results from thepredictions of the overlapped subvolumes Finally the soft-max with cross-entropy loss was used to measure the lossbetween the predicted network output and the ground truthlabels

34 Performance Metrics In this study three evaluationmetrics were used to validate and compare the proposedalgorithm namely Dice similarity coefficient (DSC) [31]recall rate (RR) and average symmetric surface distance(ASD) [32] ese metrics are briefly explained as follows

341 Dice Similarity Coefficient (DSC) e DSC is a widelyexplored performance metric in medical image segmenta-tion It is also known as overlap index It computes a generaloverlap similarity rate between the given ground truth labeland the predicted segmentation output by a segmentationmethod DSC is expressed as

DSC Sp Sg1113872 1113873 2TP

FP + 2TP + FN2 Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (5)

where Sp and Sg are the predicted segmentation output andthe ground truth label respectively FP TP and FN indicatefalse positives true positives and false negatives in-dividually DSC gives a score between 0 and 1 where 1 givesthe best prediction and indicates that the predicted seg-mentation output is identical to the ground truth

342 Recall Rate (RR) RR is also referred as the true-positive rate (TPR) or sensitivity We have utilized thisterm as the voxel-wise recall rate to assess the recall per-formance of different algorithms is performance metricsmeasure misclassified and correctly classified tumor-relatedvoxels It is mathematically expressed as

recall TP

TP + FN

Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (6)

It also gives a value between 0 and 1 Higher valuesindicate better predictions

343 Average Symmetric Surface Distance (ASD) ASDmeasures an average distance between the sets of boundaryvoxels of the predicted segmentation and the ground truthand is mathematically given as

ASD Sp Sg1113872 1113873 1

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868times 1113944

pkisinSp

d pk Sg1113872 1113873 + 1113944pgisinSg

d pg Sp1113872 1113873⎛⎜⎝ ⎞⎟⎠

(7)

where pk and pg represent the kth voxel from Sp and Sg setsrespectively e function d denotes the point-to-set distanceand is defined as d(pk Sg) 1113936pgisinSg pk minuspg where middot

is the Euclidean distance Lower values of ASD indicate


higher closeness between the two sets hence a better seg-mentation and vice versa

4 Experimental Results

In this section we have experimentally evaluated the ef-ficacy of multiscale end-to-end training scheme of ourproposed method where parallel 3D densely inter-connected convolutional layers for two-dimensional depthand coarser scales paths are incorporated Since this studyis focused on the segmentation of tumors by 3D networksthe use of 2D networks is out of the scope of this paperNevertheless we tried 2D networks in preliminary trialswith a short set of image data e 2D network was able tocorrectly recognize the tumor but could not segment thewhole tumor accurately especially in the presence of smallsize tumors

In this work the proposed network has been assessed on3D colorectal MRI data For more comprehensive analysisand comparison of segmentation results each dataset wasdivided into ground truth masks (ie manual segmentationdone by medical experts) and training and validation sub-sets Quantitative and qualitative evaluations and compar-isons with baseline networks are stated for the segmentationof the colorectal tumor First we have analyzed and com-pared the learning process of each method like described inSection 41 Secondly we have assessed the efficiency of eachalgorithm qualitatively Section 42 presents a comparison ofqualitative results Finally in Section 43 we have quanti-tatively evaluated the segmentation results yielded by eachalgorithm using evaluation metrics as described below inSection 34

41 LearningCurves e learning process of each method isillustrated in Figure 3 where loss versus training and lossversus validation are compared individually to somebaseline methods Figure 3 demonstrates that each methoddoes not exhibit a serious overfitting as their validation lossconsistently decreases along with decrement in training lossEach method has adopted 3D fully convolutional archi-tecture where error back propagation is carried on pervoxel-wise strategy instead of the patch-based training scheme[33] In other words each single voxel is independentlyutilized as a training sample which dramatically enlarges thetraining datasets and thus reduces the overfitting risk Incontrast to this the traditional patch-based training scheme[33] needs a dense prediction (ie many patches are re-quired) for each voxel in the 3D volumetric data and thusthe computation of these redundant patches for every voxelmakes the network computationally too complex and im-practical for volumetric segmentation

After comparing the learning curves of 3D FCNNs(Figure 3(a)) 3D U-net (Figure 3(b)) and DenseVoxNet(Figure 3(c)) the 3D U-net and DenseVoxNet convergemuch faster with the minimum error rate than the 3DFCNNs is demonstrates that both the 3D U-net andDenseVoxNet successfully overcome gradients vanishingexploding problems through the reuse of the features of early

layers till the later layers On the contrary it is also shownthat there is no significant difference between learningcurves of the 3D U-net and DenseVoxNet although theDenseVoxNet attains a steady drop of validation loss in thebeginning It further proves that the reuse of the featuresfrom successive layers to every subsequent layer by Den-seVoxNet which propagates the maximum gradients in-stead of the skipped connections employed by 3D U-net isable to propagate output features from layers with the sameresolution in the contraction path to the output featuresfrom the layers in the expansion path FurthermoreFigure 3(d) shows that the proposed method that in-corporates the multiscale dense training scheme has the bestloss rate among all the examined methods It reveals that themultiscale training scheme in our method optimizes andspeeds up the network training procedure us the pro-posed method has the fastest convergence with the lowestloss rate than all

42 Qualitative Results In this section we report thequalitative results to assess the effectiveness of each seg-mentationmethod of the colorectal tumors Figure 4(a) givesa visual comparison of colorectal tumor segmentation re-sults achieved from the examined methods In Figure 4(a)from the left to right the first two columns are the raw MRIinput volume and its cropped volume and the three fol-lowing columns are related to the segmentation resultsproduced by each method where each column representsthe predicted foreground probability the initial colorectalsegmentation results and the refined segmentation resultsby the 3D level set Moreover the segmentation resultsproduced by each method are outlined in red and over-lapped with the true ground truth which is outlined in greenIn Figure 4(b) we have overlapped the segmented 3D maskwith the true ground truth 3D mask to visually evidence thefalse-negative rate in the segmentation results It can beobserved that the proposed method (3D MSDenseNet)outperforms the other methods with the lowest false-negative rate in respect to DenseVoxNet 3D U-net and3D FCNNs It is also noteworthy that the segmentationresults obtained by each method significantly improves if a3D level set is incorporated

43 Quantitative Results Table 1 presents the quantitativeresults of colorectal tumor segmentation produced by eachmethod e quantitative results are obtained by computingmean and standard deviation of each performance metric forall the 13 test volumes We have initially compared the resultsobtained by each method without postprocessing by the 3Dlevel set considered here as baseline methods en wepresent a comparison by incorporating the 3D level set as apostprocessor to refine the boundaries of the segmentedresults obtained by these baseline algorithms In this way wehave got a total of eight settings named as in the following 3DFCNNs 3D U-net DenseVoxNet 3D MSDenseNet 3DFCNNs+ 3D level set 3D U-net + 3D level setDenseVoxNet + 3D level set and 3D MSDenseNet + 3D levelset respectively Table 1 reveals that the 3D FCNNs have the


lowest performance among all the metrics followed by 3DU-net and DenseVoxNet whereas the proposed method hasmaintained its performance by achieving the highest value ofDSC and RR and the lowest value of ASD When comparingthe methods after postprocessing every method has effec-tively improved their performance in the presence of the 3Dlevel set 3D FCNNs+3D level set has improved DSC and RRas 1644 and 1523 individually and it reduced ASD to30029 from 42613mm Similarly 3D U-net + 3D level setand DenseVoxNet + 3D level set have attained improvementsin DSC and RR as 5 and 597 and 499 and 429correspondingly Also they both have got a significant re-duction in ASD as 3D U-net + 3D level set andDenseVoxNet + 3D level set reduce ASD to 28815 from30173 and to 25249 from 27253 respectively However 3DMSDenseNet + 3D level set denotes a progress in DSC and RRas 213 and 242 respectively and it reduces ASD to 25401from 26407 Nevertheless the 3DMSDenseNet + 3D level-set

method could not attain a significant improvement by uti-lizing the postprocessing step but still outperforms among allConsidering both qualitative and quantitative results it can beobserved that the addition of the 3D level set as a post-processor improves the segmentation results of each method

5 Discussion

In this work we have tested the method 3D FCNNs+ 3D levelset [15] devised from mostly the same authors as this papertogether with two further prominent and widely exploredvolumetric segmentation algorithms namely 3D U-net [12]and 3D DenseVoxNet [16] for volumetric segmentation ofthe colorectal tumor from T2-weighted abdominal MRIFurthermore we have extended their ability for the colorectaltumor segmentation task by the incorporating 3D level set intheir original implementations In order to improve theperformance we have proposed a novel algorithm based on

3D FCNNs training3D FCNNs validation

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

3D U-Net training3D U-Net validation

(b)

DenseVoxNet trainingDenseVoxNet validation

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

3D MSDenseNet training3D MSDenseNet validation

(d)

Figure 3 Comparison of learning curves of the examined methods (andashd) Learning curves which correspond to 3D FCNNs 3D U-netDenseVoxNet and the proposed 3D MSDenseNet methods respectively


3D multiscale densely connected neural network (3DMSDenseNet) Many studies were carried out in the literatureto develop techniques for medical image segmentation theyare mostly based on geometrical methods to address thehurdles and challenges for the segmentation of medical im-aging including statistical shape models graph cuts level setand so on [34] Recently level set-based segmentation

algorithms were commonly explored approaches for medicalimage segmentations Generally they utilize energy mini-mization approaches by incorporating different regularizationterms (smoothing terms) and prior information (ie initialcontour etc) depending on the segmentation tasks Level set-based segmentation algorithms take advantage of their abilityto vary topological properties of segmentation function [35]

User box

Ground truthPredicted segmentation

Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)

Ground truth 3D FCNN 3D FCNN + 3D level-set

3D U-Net 3D U-Net + 3Dlevel-set

DenseVoxNet DenseVoxNet + 3D level-set

3D MSDenseNet 3D MSDenseNet + 3D level-set

(b)

Figure 4 Qualitative comparison of colorectal tumor segmentation results produced by each method In (a) from left to right columns arethe raw MRI input volume and cropped volume first three columns correspond to predicted probability by 3D FCNNs and segmentationresults by 3D FCNNs (red) and 3D FCNNs+ 3D level set (red) overlapped with true ground truth (green) correspondingly Similarlysecond third and fourth three columns are related to predicted probability and segmentation results by rest of methods 3DU-net (red) 3DU-net + 3D level set (red) DenseVoxNet (red) DenseVoxNet + 3D level set (red) 3DMSDensenet (red) and 3DMSDensenet + 3D level set(red) respectively In (b) we have overlapped the 3D masks segmented by each method with the ground truth 3D mask In (b) from left toright are ground truth 3Dmask overlapping of segmented 3Dmask by 3D FCNNs (red) 3D FCNNs + 3D level set (red) 3DU-net (red) 3DU-net + 3D level set (red) DenseVoxNet (red) DenseVoxNet + 3D level set (red) 3DMSDensenet (red) and 3DMSDensenet + 3D level set(red) with the ground truth 3D mask (green points) e green points which are not covered by the segmentation results (red) of eachmethod are referred as false negatives


so it becomes attractive However they always require aninitial appropriate contour initialization to segment a desiredobject is initial contour initialization requires an expertuser intervention in the medical image segmentation Inaddition since medical images have disordered intensitydistribution and show high variability (among imagingmodalities slices etc) a segmentation based on statisticalmodels of intensity distribution is not successful Moreprecisely level set-based approaches given their simple ap-pearance model [36] and lack of generalization ability andtransferability are in some cases unable to learn alone thechaotic intensity distribution in medical images CurrentlyCNNs deep learning-based approaches (ie CNNs) have beensuccessfully explored in the medical imaging domain spe-cifically for classification detection and segmentation tasksUsually deep learning-based approaches learn a model byextracting features deeply from intricate structures and pat-terns from well-defined big training datasets where thetrained model are used for prediction In contrast to level set-based approaches deep learning-based approaches can learnappearance models automatically from the big training datawhich improves its transferability and generalization abilityHowever deep learning-based approaches are not capable toprovide an explicit way to incorporate a function to have thetendency of delivering regularization or smoothing terms likethe level-set function has erefore in order to take theadvantages of both level-set and deep learning into accountwe have incorporated 3D level set in each method that weused in our task

Moreover traditional CNNs are 2D in nature and weredesigned especially for 2D natural images whereas medicalimages like MRI or CT are inherently in the 3D formGenerally these 2D CNNs with 2D kernels have been used formedical image segmentation where volumetric segmentationwas performed in a slice-by-slice sequential order Such 2Dkernels are not able to completely make use of volumetricspatial information by sharing spatial information among thethree planes A 3D CNN architecture that utilizes 3D kernelwhich simultaneously share spatial information among threeplanes can offer a more effective solution

Another challenge of 3D CNN involves controlling thehurdles in network optimization when the network goesdeeper Deeper networks are more prone to get risk ofoverfitting due to vanishing of gradients in advance layersis has been confirmed in this work From the segmentation

results produced by 3D FCNNs we can see from Figure 4 thathow the patternsgradients have been lessened In order topreserve the gradients in next layers when the networkgoes deeper 3D U-net and DenseVoxNet reuse the fea-tures from early to next layers In this way 3D U-netovercomes the vanishing gradient problem in deep net-work by incorporating skip connection which propagatesoutput features from layers of the same resolution in thecontraction path to the output features from the layers in theexpansion path Nevertheless such a skip connection allowsthe gradient to flow directly from the low-resolution path tothe high-resolution one which makes the training easy butthis generally produces a very high number of featurechannels in every layer and leads to adjust a big number ofparameters during training To overcome this problem theDenseVoxNet extends the concept of skip connections byconstructing a direct connection from every layer to itscorresponding previous layers to ensure the maximumgradient flow between layers In simple words feature mapsproduced by the preceding layer are concatenated as an inputto the advanced layer thus providing a direct connectionfrom any layer to the subsequent layer Our results haveproven that the direct connection strategy of DenseVoxNetprovides better segmentation than the skip connectionstrategy of 3D U-net However DenseVoxNet has a deficit asthe network individually learns high-level and low-levelfeatures in early and later layers this limits the network tolearn multiscale contextual information throughout thenetwork and may lead the network to a poor performancee network we have proposed provides a multiscale densetraining scheme where high-resolution and low-resolutionfeatures are learned simultaneously thus maintaining max-imum gradients throughout the network Our experimentalanalysis reveals that reusing features throughmultiscale denseconnectivity produces an effective colorectal tumor seg-mentation Nevertheless although the proposed method hasobtained better performance in colorectal tumor segmenta-tion the algorithm presented herein has higher variance inDSC and RR values compared with the other methods asshown in Table 1 It evidences that the proposed algorithmmay not be able to compare contrast variations in a cancerousregion and variations of slice gap along the z-axis among thedatasets A better normalization and superresolution methodwith more training samples might then be required to cir-cumvent this problem

Table 1 Quantitative comparison of colorectal tumor segmentation results

MethodsPerformance metrics

DSC RR ASD (mm)3D FCNNs [15] 06519plusmn 00181 06858plusmn 01017 42613plusmn 316033D U-net [12] 07227plusmn 00128 07463plusmn 00302 30173plusmn 30133DenseVoxNet [16] 07826plusmn 00146 08061plusmn 00187 27253plusmn 290243D MSDenseNet (proposed method) 08406 plusmn 00191 08513 plusmn 00201 26407plusmn 279753D FCNNs+ 3D level set [15] 07591plusmn 00169 07903plusmn 00183 30029plusmn 298193D U-net + 3D level set 08217plusmn 00173 08394plusmn 00193 28815plusmn 26901DenseVoxNet + 3D level set 08261plusmn 00139 08407plusmn 00177 25249 plusmn 280043D MSDenseNet + 3D level set (proposed method) 08585 plusmn 00184 08719 plusmn 00195 25401plusmn 2402


6 Conclusion

In this research work a novel 3D fully convolutionalnetwork architecture (3D MSDenseNet) is presented foraccurate colorectal tumor segmentation in T2-weightedMRI volumes e proposed network provides a denseinterconnectivity among the horizontal (depth) and ver-tical (scaled) layers In this way finer (ie high-resolutionfeatures) and coarser (low-resolution features) features arecoupled in a two-dimensional array of horizontal andvertical layers and thus features of all resolutions areproduced from the first layer on and maintainedthroughout the network However in other network (viztraditional CNN 3D U-net or DenseVoxNet) coarse levelfeatures are generated with an increasing network depthe experimental results show that the multiscale schemeof our approach has attained the best performance amongall Moreover we have incorporated the 3D levelset algorithm within each method as a postprocessor thatrefines the segmented prediction It has been also shownthat adding a 3D level set increases the performance of alldeep learning-based approaches In addition the proposedmethod due to its simple network architecture has a totalnumber of parameters consisting of approximately 07million which is much fewer than DenseVoxNet with 18million and 3D U-net with 190 million parameters As apossible future direction the proposed method could befurther validate on other medical volumetric segmentationtasks

Data Availability

e T2-weighted MRI data used to support the findings ofthis study are restricted by the ethical board of Departmentof Radiological Sciences University of Pisa Via Savi 1056126 Pisa Italy and Department of Radiological SciencesOncology and Pathology University La Sapienza AOUSantrsquoAndrea Via di Grottarossa 1035 00189 Rome Italy inorder to protect patient privacy Data are available fromProf Andrea Laghi (Department of Radiological SciencesOncology and Pathology University La Sapienza AOUSantrsquoAndrea Via di Grottarossa 1035 00189 Rome Italy)and Prof Emanuele Neri (Department of RadiologicalSciences University of Pisa Via Savi 10 56126 Pisa Italy) forresearchers who meet the criteria for access to confidentialdata

Conflicts of Interest

e authors declare that there are no conflicts of interestregarding the publication of this paper

References

[1] Ashiya ldquoNotes on the structure and functions of large in-testine of human bodyrdquo February 2013 httpwwwpreservearticlescom201105216897notes-on-the-structure-and-functions-of-large-intestine-of-human-bodyhtml

[2] M H Soomro G Giunta A Laghi et al ldquoHaralickrsquos textureanalysis applied to colorectal T2-weighted MRI a preliminary

study of significance for cancer evolutionrdquo in Proceedings of13th IASTED International Conference on Biomedical Engi-neering (BioMed 2017) pp 16ndash19 Innsbruck Austria Feb-ruary 2017

[3] R L Siegel K D Miller and A Jemal ldquoCancer statistics2017rdquo CA A Cancer Journal for Clinicians vol 67 no 1pp 7ndash30 2017

[4] H Kaur H Choi Y N You et al ldquoMR imaging for pre-operative evaluation of primary rectal cancer practical con-siderationsrdquo RadioGraphics vol 32 no 2 pp 389ndash409 2012

[5] U Tapan M Ozbayrak and S Tatli ldquoMRI in local staging ofrectal cancer an updaterdquo Diagnostic and Interventional Ra-diology vol 20 no 5 pp 390ndash398 2014

[6] M A Gambacorta C Valentini N Dinapoli et al ldquoClinicalvalidation of atlas-based auto-segmentation of pelvic volumesand normal tissue in rectal tumors using auto-segmentationcomputed systemrdquo Acta Oncologica vol 52 no 8pp 1676ndash1681 2013

[7] B Irving A Cifor B W Papiez et al ldquoAutomated colorectaltumor segmentation in DCE-MRI using supervoxel neigh-bourhood contrast characteristicsrdquo in Proceedings of 17thInternational Conference on Medical Image Computing andComputer-Assisted Intervention (MICCAI 2014) pp 609ndash616Springer Boston MA USA September 2014

[8] S Trebeschi J J M van Griethuysen D M J Lambregtset al ldquoDeep learning for fully-automated localization andsegmentation of rectal cancer on multiparametric MRrdquo Sci-entific Reports vol 7 p 5301 2017

[9] A Prasoon K Petersen C Igel F Lauze E Dam andM Nielsen ldquoDeep feature learning for knee cartilage seg-mentation using a triplanar convolutional neural networkrdquo inProceedings of 16th International Conference on MedicalImage Computing and Computer-Assisted Intervention(MICCAI 2013) pp 246ndash253 Springer Nagoya JapanSeptember 2013

[10] M Havaei A Davy D Warde-Farley et al ldquoBrain tumorsegmentation with deep neural networksrdquo Medical ImageAnalysis vol 35 pp 18ndash31 2017

[11] H R Roth L Lu A Farag A Sohn and R M SummersldquoSpatial aggregation of holistically-nested networks for au-tomated pancreas segmentationrdquo in Proceedings of 19th In-ternational Conference on Medical Image Computing andComputer-Assisted Intervention (MICCAI 2016) pp 451ndash459 Springer Athens Greece October 2016

[12] O Ccediliccedilek A Abdulkadir S S Lienkamp T Brox andO Ronneberger ldquo3D U-net learning dense volumetric seg-mentation from sparse annotationrdquo in Proceedings of 19thInternational Conference on Medical Image Computing andComputer-Assisted Intervention (MICCAI 2016) pp 424ndash432Springer Athens Greece October 2016

[13] H Chen Q Dou L Yu J Qin and P A Heng ldquoVoxResNetdeep voxelwise residual networks for brain segmentation from3D MR imagesrdquo NeuroImage vol 170 pp 446ndash455 2017

[14] Q Dou L Yu H Chen et al ldquo3D deeply supervised networkfor automated segmentation of volumetric medical imagesrdquoMedical Image Analysis vol 41 pp 40ndash54 2017

[15] M H Soomro G De Cola S Conforto et al ldquoAutomaticsegmentation of colorectal cancer in 3D MRI by combiningdeep learning and 3D level-set algorithm-a preliminarystudyrdquo in Proceedings of IEEE 4th Middle East Conference onBiomedical Engineering (MECBME) pp 198ndash203 TunisTunisia March 2018

[16] L Yu J Z Cheng Q Dou et al ldquoAutomatic 3D cardio-vascular MR segmentation with densely-connected


volumetric convnetsrdquo in Proceedings of 20th InternationalConference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2017) pp 287ndash295 QuebecCity QC Canada September 2017

[17] B H Menze A Jakab S Bauer et al ldquoe multimodal braintumor image segmentation benchmark (BRATS)rdquo IEEETransactions on Medical Imaging vol 34 no 10 pp 1993ndash2024 2015

[18] O Ronneberger P Fischer and T Brox ldquoU-net convolu-tional networks for biomedical image segmentationrdquo inProceedings of 18th International Conference on MedicalImage Computing and Computer-Assisted Intervention(MICCAI 2015) pp 234ndash241 Springer Munich GermanyOctober 2015

[19] K He X Zhang S Ren and J Sun ldquoDeep residual learningfor image recognitionrdquo in Proceedings of IEEE Conference onComputer Vision and Pattern Recognition (CVPR) pp 770ndash778 Las Vegas NV USA June 2016

[20] G Huang Z Liu L van der Maaten and K Q WeinbergerldquoDensely connected convolutional networksrdquo in Proceedingsof the IEEE Conference on Computer Vision and PatternRecognition (CVPR) Honolulu HI USA July 2017

[21] T D Bui J Shin and T Moon ldquo3D densely convolutionnetworks for volumetric segmentationrdquo 2017 httparxivorgabs170903199

[22] G Huang D Chen T Li F Wu L van der Maaten andK Weinberger ldquoMulti-scale dense networks for resourceefficient image classificationrdquo in Proceedings of InternationalConference on Learning Representations Vancouver BCCanada April-May 2018

[23] V Caselles R Kimmel and G Sapiro ldquoGeodesic activecontoursrdquo International Journal of Computer Vision vol 22no 1 pp 61ndash79 1997

[24] M H Soomro G Giunta A Laghi et al ldquoSegmenting MRimages by level-set algorithms for perspective colorectalcancer diagnosisrdquo in Proceedings of Proceedings of the VIECCOMAS Dematic Conference on Computational Visionand Medical Image Processing (VipIMAGE 2017) vol 27Springer Porto Portugal October 2017

[25] T S Yoo M J AckermanW E Lorensen et al ldquoEngineeringand algorithm design for an image processing API a technicalreport on ITKmdashthe insight toolkitrdquo Studies in Health Tech-nology and Informatics vol 85 pp 586ndash592 2002

[26] P A Yushkevich J Piven H C Hazlett et al ldquoUser-guided3D active contour segmentation of anatomical structuressignificantly improved efficiency and reliabilityrdquoNeuroImagevol 31 no 3 pp 1116ndash1128 2006

[27] Y Jia E Shelhamer J Donahue et al ldquoCaffe convolutionalarchitecture for fast feature embeddingrdquo 2014 httparxivorgabs14085093

[28] L Bottou ldquoLarge-scale machine learning with stochasticgradient descentrdquo in Proceedings of 19th International Con-ference on Computational Statistics (COMPSTATrsquo2010)pp 177ndash186 Springer Paris France August 2010

[29] W Liu A Rabinovich and A C Berg ldquoParseNet lookingwider to see betterrdquo 2015 httparxivorgabs150604579v2

[30] P Kontschieder S R Bulo H Bischof and M PelilloldquoStructured class-labels in random forests for semantic imagelabellingrdquo in Proceedings of 2011 International Conference onComputer Vision (ICCV) pp 2190ndash2197 Barcelona SpainNovember 2011

[31] L R Dice ldquoMeasures of the amount of ecologic associationbetween speciesrdquo Ecology vol 26 no 3 pp 297ndash302 1945

[32] A A Taha and A Hanbury ldquoMetrics for evaluating 3Dmedical image segmentation analysis selection and toolrdquoBMC Medical Imaging vol 15 p 29 2015

[33] D C Ciresan L M Gambardella A Giusti andJ Schmidhuber ldquoDeep neural networks segment neuronalmembranes in electron microscopy imagesrdquo in Proceedings of25th International Conference on Neural Information Pro-cessing Systems (NIPS 2012) pp 2852ndash2860 Lake Tahoe NVUSA December 2012

[34] A Kronman and L Joskowicz ldquoA geometric method for thedetection and correction of segmentation leaks of anatomicalstructures in volumetric medical imagesrdquo InternationalJournal of Computer Assisted Radiology and Surgery vol 11no 3 pp 369ndash380 2015

[35] Y-T Chen ldquoA novel approach to segmentation and mea-surement of medical image using level set methodsrdquoMagneticResonance Imaging vol 39 pp 175ndash193 2017

[36] D Cremers M Rousson and R Deriche ldquoA review of sta-tistical approaches to level set segmentation integrating colortexture motion and shaperdquo International Journal of Com-puter Vision vol 72 no 2 pp 195ndash215 2007


International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018


Active and Passive Electronic Components

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of



Journal ofEngineeringVolume 2018

SensorsJournal of



RotatingMachinery


Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation


Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

in clinical radiotherapy practices to segment the colorectaltumor from large volumetric data as this may save time andreduce human interventions In contrast to natural imagesmedical imaging is generally more chaotic as the shape ofthe cancerous regions may vary from slice to slice as shownin Figure 1 Hence automatic segmentation of the colorectaltumor is a very challenging task not only because its sizemay be very small but also because of its rather inconsistentbehavior in terms of shape and intensity distribution

Lately automatic segmentation of the colorectal tumorfrom volumetric MRI data based on atlas [6] and supervoxelclustering [7] has been presented with some good perfor-mance Newly deep learning-based approaches have beenwidely employed with impressive results in medical imagesegmentation [8ndash16] Trebeschi et al [8] have presented adeep learning-based automatic segmentation method tolocalize and segment the rectal tumor in multiparametricMRI by incorporating a fusion between T2-weighted (T2w)MRI and diffusion-weighted imaging (DWI) MRI Despitetheir method displaying good performance it is unclearwhether only T2w modality which provides more anatomyinformation than DWI modality could be useful for co-lorectal tumor segmentation Secondly they applied theirimplementation on 2D data as it is very common in realdata but medical data such as CT (Computed Tomography)and MRI are in 3D volumetric form e 2D convolutionalneural network (CNN) algorithms segment the volumetricMRI or CTdata in a slice-by-slice sequence [9ndash11] where 2Dkernels are used by aggregating axial coronal and sagittalplanes in a one-to-one association individually Althoughthese 2D CNN-based methods demonstrated great im-provement in segmentation accuracy [17] the inherent 2Dnature of the kernels limits their application when usingvolumetric spatial information Based on this consideration3D CNN-based algorithms [12ndash16] have been recentlypresented where 3D kernels are used instead of 2D to extractspatial information across all three volumetric dimensionsFor example Ccediliccedilek et al [12] proposed a 3D U-net volume-to-volume segmentation network that is an extension of the2D U-net [18] 3D U-net used dual paths an analysis pathwhere features are abstracted and a synthesis path orupsampling path where full resolution segmentation isproduced Additionally 3D U-net established shortcutconnections between early and later layers of the sameresolution in both the analysis and synthesis paths Chenet al [13] presented a voxel-wise residual network (Vox-ResNet) that is an extension of 2D deep residual learning[19] to 3D deep network VoxResNet provides a skip con-nection to pass features from one layer to the next layerEven if these 3D U-net and VoxResNet provide several skipconnections to make training easy the presence of these skipconnections creates a short path from the early layers to thelast one and this may end up transforming the net into a verysimple configuration with the unwanted additional burdenof producing a very high number of parameters to be ad-justed during training Huang et al [20] introduced aDenseNet that extends the concept of skip connections in[18 19] by constructing direct connections from every layerto the corresponding previous layers to ensure maximum

gradient flow between layers In [20] DenseNet was provenas an accurate and efficient method for the natural imageclassification Yu et al [16] proposed the densely connectedvolumetric convolutional neural network (DenseVoxNet)for volumetric cardiac segmentation which is an extended3D version of DenseNet [20] DenseVoxNet utilizes twodense blocks followed by pooling layers e first blocklearns high-level feature maps and the second block learnslow-level feature maps the latter is followed by a poolinglayer that further reduces the resolution of the learned high-level feature maps in the first block Finally the high-resolution feature maps are restored by incorporatingsome deconvolution layers In DenseVoxNet early layers ofthe first block learn fine-scale features (ie high-level fea-tures) based on small receptive field while coarse-scalefeatures (ie low-level features) are learned by later layersof the second block with a larger receptive field In shortfine-scale and coarse-scale features are learned in early andlater layers respectively and this may reduce the networkability to learn multiscale contextual informationthroughout network thus leading to suboptimal perfor-mance [21]

In this study a novel method to overcome the above-mentioned problems in 3D volumetric segmentation ispresented We propose a 3D multiscale densely connectedconvolutional neural network (3D MSDenseNet) a volu-metric network that is an extension of the recently proposed2D multiscale dense networks (MSDNet) for the naturalimage classification [22] In summary we have employed 3DMSDenseNet for the segmentation of the colorectal tumorwith the following contributions

(1) A multiscale training scheme with parallel 3Ddensely interconnected convolutional layers for two-dimensional depth and coarser scales is used wherelow- and high-level features are generated from eachscale individually A diagonal propagation layout isincorporated to couple the depth features with thecoarser features from the first layer thus maintaininglocal and global (multiscale) contextual informationthroughout the network to improve segmentationresults efficiently

(2) e proposed network is based on volume-to-volume learning and interference which eradicatescomputation redundancy

(3) e method is validated on colorectal tumor seg-mentation in 3D MR images and it has attainedoutperformed segmentation results in comparisonwith previous baseline methods From the encour-aging results obtained withMR images the proposedmethod could be applied for further applications ofmedical imaging

2 Methods

Figure 2 represents an overview of our proposed meth-odology We have extended the characterization of themultiscale densely connected network to colorectal tumorsegmentation with 3D volume-to-volume learning







(a) (b) (c)


ConcatenationC


Upsampling

Softmax

Feature volume maps


by 3D level-set

Input volume


layer

Upsampling

Layer 3

C

C C

Layer l

C

C

Layer 2

C

C C

Layer 1

C CC

C

CScaled

(S2)

Depth (S1)

Final prediction

H1l (middot)

H2l (middot)

~

H2l (middot)

~H 2

l ([X11 X1

2 X1l ])

H 2l ([X1

1 X12 X1

l ])

X2l =



Identity










X2l

1113957H2l X1

1 X12 X1

l1113858 1113859( 1113857

H2l X1

1 X12 X1

l1113858 1113859( 1113857

⎡⎢⎣ ⎤⎥⎦ (3)




2l (middot) and H2




zφzt













DSC Sp Sg1113872 1113873 2TP


11138681113868111386811138681113868

11138681113868111386811138681113868

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (5)



recall TP

TP + FN

Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (6)



ASD Sp Sg1113872 1113873 1

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868times 1113944

pkisinSp

d pk Sg1113872 1113873 + 1113944pgisinSg

d pg Sp1113872 1113873⎛⎜⎝ ⎞⎟⎠

(7)
















5 Discussion



0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(b)


0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(d)





User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia







(a) (b) (c)


ConcatenationC


Upsampling

Softmax

Feature volume maps


by 3D level-set

Input volume


layer

Upsampling

Layer 3

C

C C

Layer l

C

C

Layer 2

C

C C

Layer 1

C CC

C

CScaled

(S2)

Depth (S1)

Final prediction

H1l (middot)

H2l (middot)

~

H2l (middot)

~H 2

l ([X11 X1

2 X1l ])

H 2l ([X1

1 X12 X1

l ])

X2l =



Identity










X2l

1113957H2l X1

1 X12 X1

l1113858 1113859( 1113857

H2l X1

1 X12 X1

l1113858 1113859( 1113857

⎡⎢⎣ ⎤⎥⎦ (3)




2l (middot) and H2




zφzt













DSC Sp Sg1113872 1113873 2TP


11138681113868111386811138681113868

11138681113868111386811138681113868

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (5)



recall TP

TP + FN

Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (6)



ASD Sp Sg1113872 1113873 1

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868times 1113944

pkisinSp

d pk Sg1113872 1113873 + 1113944pgisinSg

d pg Sp1113872 1113873⎛⎜⎝ ⎞⎟⎠

(7)
















5 Discussion



0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(b)


0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(d)





User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia









X2l

1113957H2l X1

1 X12 X1

l1113858 1113859( 1113857

H2l X1

1 X12 X1

l1113858 1113859( 1113857

⎡⎢⎣ ⎤⎥⎦ (3)




2l (middot) and H2




zφzt













DSC Sp Sg1113872 1113873 2TP


11138681113868111386811138681113868

11138681113868111386811138681113868

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (5)



recall TP

TP + FN

Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (6)



ASD Sp Sg1113872 1113873 1

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868times 1113944

pkisinSp

d pk Sg1113872 1113873 + 1113944pgisinSg

d pg Sp1113872 1113873⎛⎜⎝ ⎞⎟⎠

(7)
















5 Discussion



0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(b)


0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(d)





User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia









DSC Sp Sg1113872 1113873 2TP


11138681113868111386811138681113868

11138681113868111386811138681113868

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (5)



recall TP

TP + FN

Sp cap Sg

11138681113868111386811138681113868

11138681113868111386811138681113868

Sg

11138681113868111386811138681113868

11138681113868111386811138681113868 (6)



ASD Sp Sg1113872 1113873 1

Sp

11138681113868111386811138681113868

11138681113868111386811138681113868 + Sg

11138681113868111386811138681113868

11138681113868111386811138681113868times 1113944

pkisinSp

d pk Sg1113872 1113873 + 1113944pgisinSg

d pg Sp1113872 1113873⎛⎜⎝ ⎞⎟⎠

(7)
















5 Discussion



0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(b)


0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(d)





User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia














5 Discussion



0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12Lo

ss

(a)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(b)


0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss

(c)

0 50 100 150 200 250 300 350 400Training epochs

0

02

04

06

08

1

12

Loss


(d)





User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia




User box


Inpu

t vol

ume

512

times 51

2 times

69

Crop

ped

volu

me

195

times 11

4 times

61

3D F

CNN

s for

egro

und

prob

abili

ty

3D F

CNN

sse

gmen

tatio

n

3D F

CNN

s + 3

Dle

vel-s

et se

gmen

tatio

n

3D U

-Net

fore

grou

ndpr

obab

ility

3D U

-Net

segm

enta

tion

3D U

-Net

+ 3

D le

vel-

set s

egm

enta

tion

Den

seV

oxN

etfo

regr

ound

pro

babi

lity

Den

seV

oxN

etse

gmen

tatio

n

Den

seV

oxN

et +

3D

leve

l-set

segm

enta

tion

3D M

SDen

seN

etfo

regr

ound

pro

babi

lity

3D M

SDen

seN

etse

gmen

tatio

n

3D M

SDen

seN

et +

3D

leve

l-set

segm

enta

tion

Sagi

ttal

Axi

alCo

rona

l

(a)





(b)











6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia










6 Conclusion


Data Availability




References











































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia


























RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia


automatedsegmentationofcolorectaltumorin3dmriusing...

Documents