

Data science and cyberinfrastructure: critical enablers for accelerated development of hierarchical materials

Surya R. Kalidindi*

The slow pace of new/improved materials development and deployment has been identified as the main bottleneck in the innovation cycles of most emerging technologies. Much of the continuing discussion in the materials development community is therefore focused on the creation of novel materials innovation ecosystems designed to dramatically accelerate materials development efforts, while lowering the overall cost involved. In this paper, it is argued that the recent advances in data science can be leveraged suitably to address this challenge by effectively mediating between the seemingly disparate, inherently uncertain, multiscale and multimodal measurements and computations involved in the current materials’ development efforts. Proper utilisation of modern data science in the materials’ development efforts can lead to a new generation of data-driven decision support tools for guiding effort investment (for both measurements and computations) at various stages of the materials development. It should also be recognised that the success of such ecosystems is predicated on the creation and utilisation of integration platforms for promoting intimate, synchronous collaborations between cross-disciplinary and distributed team members (i.e. cyberinfrastructure). Indeed, data sciences and cyberinfrastructure form the two main pillars of the emerging new discipline broadly referred to as materials informatics (MI). This paper provides a summary of current capabilities in this emerging new field as they relate to the accelerated development of advanced hierarchical materials (the internal structure plays a dominant role in controlling overall properties/performance in these materials) and identifies specific directions of research that offer the most promising avenues.

Keywords: Materials informatics, Microstructure quantification, Process–structure–property linkages, Data science, Cyberinfrastructure, Metamodels, Spatial correlations, Reduced-order representations

Materials, Manufacturing, and Informatics

Materials with enhanced performance characteristics have served as critical enablers for the successful development of advanced technologies throughout human history and have contributed immensely to the prosperity and wellbeing of various nations. A majority of the materials employed in advanced technologies exhibit hierarchical internal structures with rich details at multiple length and/or structure scales (spanning from atomic to macroscale). Collectively, these features of the material internal structure are here simply referred to as the structure and constitute the central consideration in the development of new/improved hierarchical materials. Indeed, the existence of a causal relationship between the material structure and its properties is the central tenet in the field of materials science and engineering. It should be noted that the word structure is used very broadly in these statements (and in this paper) to include and refer to any of the details of the material internal structure (spanning all relevant length or structure scales involved).

Indeed, the mathematical description of the material internal structure in its entirety, in any selected material system, is unimaginably complex and demands very high dimensional representation. For example, most materials being explored for structural applications (e.g. Ti alloys in jet engines and advanced high strength steels, Mg alloys in lightweight automobiles, Al alloys in aerospace frames, and Zr alloys in nuclear industry) exhibit polycrystalline microstructures at the mesoscale [1–4]. As an example, Fig. 1 shows details of the mesoscale structure in such materials. A rigorous representation of the hierarchical structure in such materials should also include details at other relevant length/structure scales (e.g. point defects, dislocations, grain boundaries, phase boundaries). Although the above discussion was framed in the context of a crystalline material, similar considerations exist in most other material classes. For example, the hierarchy in polymer structures [5] includes details of monomers and their spatial arrangements into blocks and branches at the molecular or macromolecular level, micro-fibrils and crystallites at the nanoscale, and spherulites at the microscale. The hierarchy in most biological materials is indeed much richer. For example, the hierarchy in bone structure includes details of collagen molecules and mineral crystals, collagen fibrils, collagen fibre, lamella, osteons and macrostructure (e.g. cancellous or cortical) [6–8]. Furthermore, most materials of interest in advanced technologies actually tend to be composites comprising multiple material classes.

Georgia Institute of Technology, Atlanta, GA 30332, USA

*Corresponding author, email [email protected]

© 2015 Institute of Materials, Minerals and Mining and ASM International. Published by Maney for the Institute and ASM International. DOI 10.1179/1743280414Y.0000000043. International Materials Reviews 2015 VOL 60 NO 3 150

It is emphasised again that the discussion in this paper is exclusively focused on hierarchical materials. In other words, the simplest of these materials exhibits at least two distinct well separated length or structure scales (e.g. the macroscale and the microscale). It should also be noted that the description of the structure in such hierarchical materials implicitly includes a full description of the chemical compositions of all distinct microscale constituents (called local states) present in the material system, in addition to their relative spatial placement in the internal structure. In other words, the information included in the description of the material structure is orders of magnitude more detailed than the simple overall chemical composition typically used to identify or label a material system.

Based on the above description, it should be clear that a vast number of tiered spatial distributions have to be quantified to faithfully represent the complex hierarchical structure of advanced material systems. It is obvious that such an effort would result in an extremely large and unwieldy representation. Fortunately, the field of materials science and engineering has already empirically discovered that only certain salient features of the material structure dominate the macroscale performance characteristics of interest for any selected application. Therefore, the main challenge in the development of materials with enhanced properties reduces to identifying and tracking the salient structure features that are important to a specific engineering or technology application. In other words, the core knowledge needed to guide the materials’ development efforts can be sought and expressed as reduced-order process–structure–property (PSP) linkages that capture the roles of different unit manufacturing (or processing) steps on the salient structure features that control the property combinations (or performance characteristics) of interest. It is important to recognise that these linkages represent reduced-order models as they utilise reduced-order representations of the material structure. Historically, such efforts have been largely guided by the scientific approach that entails formulating a fundamental hypothesis and then validating it with carefully designed experiments conducted in highly controlled environments. Such science-driven approaches for establishing PSP linkages have been expensive and slow [9–11], because their focus has been to isolate and study each physical mechanism (i.e. cause) and its associated effect in a highly systematic manner.

From a data science perspective, one can formalise the discussion above in terms of the fundamental data transformations involved, as summarised in Fig. 2. Raw data related to materials phenomena of interest is usually generated by some combination of experiments, models, and simulations. Recent years have witnessed an explosion in the ability of materials experts to generate data from novel experiments and simulations. For example, the 3-D experimental dataset shown in Fig. 1 can now be generated using mostly automated protocols [12,13]. In spite of this automation, this technique incurs a substantial amount of time (of the order of several days). An exciting development in this field is the use of a femtosecond laser for fast serial sectioning of the sample [14], as opposed to the conventional mechanical approaches used in the earlier studies. This new technique has the potential to dramatically reduce the time required to obtain a 3-D structure dataset. It has also been demonstrated that a focused ion beam attached to a scanning electron microscope can be used for serial sectioning the samples and reconstructing a 3-D material structure dataset (e.g. Refs. 15 and 16). However, this technique is ideal only for studies of very small volumes of material (with length scales of the order of a few micrometres). While the approaches mentioned earlier are all destructive (they ablate the material to expose new surfaces of the sample), there are also a number of non-destructive techniques that rely on the use of X-rays. When the X-ray techniques are combined with computed tomography techniques, it is possible to produce reconstructions of a broad range of 3-D material datasets including porous structures (e.g. Refs. 17–19), mapping of defects (e.g. Refs. 20–22) and polycrystal microstructures (e.g. Refs. 23 and 24). In fact, it is now possible to obtain 4-D (three spatial dimensions and time) reconstructions using data gathered from high energy X-rays [25]. At the finest spatial resolution, it is also possible to obtain 3-D and 4-D structure datasets at the atomic scales using techniques such as transmission electron microscopy [26] and atom probe microscopy (e.g. Refs. 27 and 28). In parallel, there have also been tremendous advances in the ability to generate simulation datasets from computations at multiple length/structure scales (e.g. Refs. 29–43). The volume of this data (from both experiments and models) can be very large, ushering the materials community into the materials big data era.

Fig. 1 Mesoscale internal structure of beta-stabilised polycrystalline titanium containing 4300 crystals (or grains) taken from Ref. 4. This experimental dataset was generated by a three-dimensional (3-D) reconstruction that entailed the use of serial sectioning, optical microscopy with intermittent electron backscatter diffraction (EBSD), and image segmentation and processing algorithms. The sample size is 1.115 × 0.516 × 0.3 mm³ (1670 × 770 × 200 voxels). The 3-D crystal lattice orientation in each voxel is included in this experimental dataset. The colour key corresponds to the stereographic projection of the crystallographic orientation parallel to the Z-axis, shown in bottom left

The main purpose of the structure datasets is that they allow the materials specialists to extract trends on the evolution of selected salient structure features during a given manufacturing route and study how these details affect certain effective properties/performance characteristics of interest for the material. For example, considerable prior effort in the development of structural metals has been spent on correlating the average grain size in the final metal product to the various thermo-mechanical deformation histories applied during the manufacture of metal alloys. This is because the average grain size is generally observed to strongly influence the overall mechanical properties of the metal product in service (e.g. Refs. 44–49), although it is not the only factor influencing the final performance. However, this approach of salient structure parameter identification and exploration has provided tremendous new insights (higher value information) to improve the performance of many structural material systems of interest. In the data science formalism, one might characterise these higher value descriptions (identifying specific trends between selected parameters as opposed to comprehensive multivariate linkages) of PSP linkages as information (see Fig. 2). This is mainly because, at this stage, not all the dominant features in the PSP linkages of interest have been identified in a comprehensive manner. At the next higher level, one can aim to extract much more rigorous, reliable, and complete PSP linkages from all the available data; this information could then be characterised as materials knowledge. One of the central goals of the emerging field of materials informatics (MI) is to introduce novel data-driven approaches for mining materials knowledge from the large collections of experimental, modelling and simulation datasets available (and/or being produced) today. Furthermore, the comprehensive PSP linkages available at this stage should allow a rigorous quantification of the inherent uncertainty. At this level of knowledge, the available PSP linkages can be successfully employed in simulating manufacturing processes of interest and predicting performance of the final product. However, the main focus in the data transformations at the knowledge level continues to be in the forward direction (process→structure→properties). At the final stage of data transformation, effort would be focused on establishing invertible PSP linkages that allow customised process and materials design for targeted applications (i.e. address inverse problems). This highest level of the understanding of PSP linkages can then be characterised as wisdom. The primary focus in this paper will be on data analytics needed to extract materials knowledge from the ensembles of materials structure and performance datasets being produced by the materials experts, with an eye towards attaining wisdom in the future.

In order to realise the goals stated above for efficiently transforming materials data into knowledge and wisdom, and dramatically lowering the cost and time involved in materials development efforts, it is imperative to develop novel protocols that fully exploit the large data generation capabilities made possible through the recent advances in multiscale measurements [12–28] and simulations of materials phenomena (e.g. Refs. 34–42). The central challenge is that in spite of the many advances there remain a large number of unknowns or gaps in capturing the underlying physics (at the different length scales). These critical gaps hinder the development of fully predictive PSP linkages for most hierarchical materials of interest to advanced technologies. The only practical way forward for the foreseeable future is to formally treat the hierarchical material as a complex system [50], which by definition is not yet amenable to predictive models. If one embraces the premise that a certain degree of uncertainty is inevitable in the formulation of the desired PSP linkages for hierarchical material systems, the focus could then be shifted to managing the uncertainty (i.e. complexity). In other words, the effort could be focused on the design, development, and validation of decision support systems that will leverage the best available understanding (with its uncertainties) and provide objective guidance on future effort investment (e.g. what combination of experiments and simulations are needed to reduce the uncertainty).

Fig. 2 Schematic description of the envisioned transformations for materials data

Given the high cost of the multiscale measurements, it is also obvious that the desired protocols for establishing materials knowledge and wisdom (see Fig. 2) will have to rely on a limited number of experimental investigations. However, these experiments have to be specifically designed to efficiently cross-feed multiscale structure-sensitive materials models. The central considerations for these new protocols should be:

(i) model maturity

(ii) model interoperability

(iii) model inversion.

Briefly, model maturity quantifies the reliability (or the uncertainty) of the predictions of any given model over a prescribed window on the input ranges. The focus here is largely on multiscale physics-based models (these are critical for achieving adequate accuracy over sufficiently large windows on the input ranges) for predicting either the structure–property relationships or the manufacturing process–structure evolution relationships. Consequently, protocols are critically needed for robust evaluation of the model maturity over any selected range of initial structures and boundary conditions (defining either the manufacturing process conditions or the in service loading conditions). The main impediments for establishing these protocols are:

(i) lack of a broadly adopted framework for rigorous quantification of the material structure

(ii) lack of validated experimental protocols for direct measurement of the various materials’ parameters introduced in the multiscale models and/or the ‘at-scale’, full-field, measurements of response variables predicted by the multiscale models (needed for the critical validation of the models).

Model interoperability ensures that the distinct components of a hierarchical multiscale model chain that typically address specific materials phenomena at selected length/structure scales are able to exchange the high value information with the other components of the model chain seamlessly with manageable (quantifiable) loss of accuracy [50]. For example, in modelling the plastic response of polycrystalline metals [51], it is not yet clear what information about the dislocation structure needs to be communicated from dislocation dynamics simulations to crystal plasticity simulations. As a simple approach, one might decide to just communicate only the average dislocation density. However, if one is interested in understanding and predicting strain hardening and damage initiation/evolution, it would be necessary to communicate information on the higher moments of the dislocation field (or equivalently higher-order spatial correlations in the dislocation networks) to the crystal plasticity models operating at the next higher length/structure scale. The third key capability listed earlier, model inversion, is necessitated by the need to drive materials development efforts from considerations of performance requirements (i.e. invert the current ‘cause and effect’ approach to a transformative ‘goal-means’ approach articulated by Olson [42,52]). A major impediment in model inversion arises from the simple fact that most currently used approaches in computational materials modelling have not been designed with invertibility in mind. For example, numerical approaches such as the finite element methods or the finite volume methods have been designed to study effects of imposed loading or boundary conditions on a selected initial microstructure. They are completely ill-equipped for tackling inverse problems such as identifying the set of material structures that are expected to meet or exceed a specified set of property/performance requirements. Model invertibility in most cases needs formulation of simplified, but sufficiently accurate, metamodels (also referred to as surrogate models) that cover the desired space of material structures and loading/processing conditions. In general these approaches demand compact, simple (e.g. algebraic), and sufficiently accurate representations of the PSP linkages [1,42,50,52–54] to be of practical utility in providing critical decision support in the materials development efforts.
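The role of an algebraic metamodel in model inversion can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in chosen for illustration: `forward_model` plays the part of an expensive physics-based simulation, the scalar `x` plays the part of a single structure descriptor, and the quadratic least-squares fit is just one possible choice of surrogate, not a method prescribed in this paper.

```python
import numpy as np

def forward_model(x):
    """Hypothetical stand-in for an expensive forward simulation
    (structure descriptor x -> effective property)."""
    return 100.0 + 80.0 * x - 30.0 * x**2

# A handful of "expensive" forward evaluations over the window 0 <= x <= 1.
x_train = np.linspace(0.0, 1.0, 6)
y_train = forward_model(x_train)

# Algebraic metamodel (surrogate): a quadratic fitted by least squares.
coeffs = np.polyfit(x_train, y_train, deg=2)
surrogate = np.poly1d(coeffs)

# Inverse query: which candidate structures are predicted to meet or exceed
# a property requirement? The surrogate is cheap enough to scan densely.
candidates = np.linspace(0.0, 1.0, 1001)
target = 140.0
feasible = candidates[surrogate(candidates) >= target]

print("feasible descriptor range:", feasible.min(), "to", feasible.max())
```

The point of the sketch is the design choice rather than the numbers: once a compact surrogate is available, the inverse question reduces to cheap evaluations over the candidate space, whereas the forward solver alone would have to be rerun for every candidate.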

The above discussion should make clear the critical need and potential for the utilisation of modern data sciences (including advanced statistics, dimensionality reduction and formulation of metamodels) and cyberinfrastructure (including integration platforms, databases and customised tools for enhancement of collaborations among cross-disciplinary team members) in overcoming the impediments described above. These have been identified as the critical enablers for the emerging materials innovation ecosystems in many national and international strategic initiatives [9,55–60]. In fact, data sciences and cyberinfrastructure have already been successfully employed in a broad range of other application domains. Examples include recommendation systems (e.g. Amazon [61]), personal informatics (e.g. Ref. 62), drug discovery (e.g. Ref. 63), decision systems (e.g. Ref. 64), and healthcare (e.g. Ref. 65).

Data sciences and cyberinfrastructure are the foundational pillars of the emerging field broadly referred to as Materials Informatics (MI) [66–76]. This emerging new field has thus far focused largely on materials discovery through combinatorial chemistry and variations of crystal structures at a single length/structure scale. In this paper, the focus will remain on hierarchical materials, where microstructural features at different length/structure scales play important roles in controlling the macroscale properties/performance characteristics of interest. Consequently, major emphasis is placed on first identifying and then communicating the high value information among the constituent length scales for a hierarchical material system. Furthermore, because the governing physics at different length/structure scales vary dramatically, and because of the highly localised nature of the knowledge and expertise of such phenomena, realisation of the goals articulated earlier is critically dependent on the availability of suitably designed cyberinfrastructure that will facilitate and enhance cross-disciplinary collaborations.

Extensible Framework for Structure Quantification

The lack of an extensible framework for material structure quantification, which is broadly applicable to the wide range of hierarchical materials of interest to emerging advanced technologies, is the central impediment in ushering materials science and engineering into the big data age. Rigorous structure quantification is also foundational to the critically needed advances in development of novel data-driven protocols for model maturation, model interoperability, and model inversion. In spite of the central role structure plays in establishing core materials knowledge expressed as PSP linkages, it has eluded a broadly accepted quantitative definition. For example, although the American Society for Testing and Materials (ASTM) standards are widely adopted by the multiple stakeholders in the manufacturing value chain (including materials producers, product designers and original equipment manufacturers), there is no ASTM standard yet for a comprehensive quantification of the material structure. At best, the current standards only address quantification of very primitive structure measures such as the average grain size [77,78] in relatively simple material systems. Measures such as the average grain size should be considered primitive because it is easy to envision multiple hierarchical material structures that have the same exact values for such primitive measures while exhibiting dramatically different macroscale properties/performance characteristics.

Core knowledge needed for the development of advanced hierarchical materials is best archived, curated and visualised in the higher dimensional space of variables used to represent the material structure [54,67]. This is because structure evolution during processing can be represented as a distinct pathline in the structure space and each point in this space can be associated with a single value of property combinations of interest [79]. Therefore, it would be possible to visualise the salient PSP linkages in a suitably defined low-dimensional projection of the structure space. The central challenge therefore is to define a practically useful structure space. When the structure space is defined using very primitive measures, it would not be able to distinguish between structures that exhibit very distinct performance characteristics. On the other hand, if the structure space is defined to account for every minute detail of the structure (implicitly demanding a high dimensional representation), it would not be amenable to a comprehensive exploration (e.g. for the optimisation of performance characteristics of interest for a selected application). This is precisely where a data-driven approach offers many advantages. In a data-driven approach, the decision on exactly what constitutes the set of important salient features is not taken in a static manner; instead it is taken objectively based on the actual available data. It is continuously refined as more data becomes available. Therefore, the emerging interdisciplinary MI field focuses mainly on computational algorithms and tools designed to extract and curate the embedded materials knowledge in an objective (data-driven) and dynamic manner. This is accomplished using a combination of advanced statistics, applied mathematics and modern cyberinfrastructure.
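One way to picture a data-driven, low-dimensional projection of the structure space is principal component analysis (PCA) applied to an ensemble of structure descriptors. The sketch below is an assumption made for illustration: PCA is only one possible dimensionality reduction technique, the helper `pca_project` is a hypothetical name, and the synthetic feature matrix merely stands in for structure statistics measured on real datasets.

```python
import numpy as np

def pca_project(features, n_components):
    """Project each row of `features` (one structure per row) onto the
    leading principal components of the ensemble."""
    centered = features - features.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal directions.
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = sing**2 / np.sum(sing**2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)

# Synthetic ensemble: 50 structures, each described by 400 statistics that
# actually vary along only two hidden directions, plus small noise.
hidden = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 400))
features = hidden @ mixing + 0.01 * rng.normal(size=(50, 400))

scores, explained = pca_project(features, n_components=2)
print("projected shape:", scores.shape)
print("variance captured by two components:", explained.sum())
```

Because the principal directions are recomputed from whatever data is available, the projection changes as new structures are added, which mirrors the continuously refined definition of salient features described above.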

The above discussion should make clear that an extensible framework for material internal structure quantification is the central starting point in formulating a data-driven approach to hierarchical materials development. Only an extensible framework would permit automated and efficient evaluation of multiple choices one faces in this daunting task. Moreover, only an extensible framework will allow automated documentation of the novel integrated workflows that are yet to be explored and evaluated in pursuit of the grand challenges identified earlier. When such a framework is implemented on a broadly accessible cyberinfrastructure, it will allow identification of the best integrated workflows (integrating experiments and models, materials and manufacturing, etc.) based on the experience accumulated from the broader community. The desired requirements laid out above can be satisfied by seeking a digital signal representation of the material structure [80] as $m_s^h$, which denotes the probability that a specified spatial bin (or voxel) indexed by $s$ is physically occupied by a potential local state indexed by $h$. Since the values of $m_s^h$ are bounded between zero and one (in many cases it can be just binary [80]), it produces a generalised representation for a broad range of materials systems at different length/structure scales. The information on the different length scales is encoded into the properties associated with the spatial bins, while the information on the local state of the material (e.g. chemical composition, phase identifiers, tensorial representations of different defect configurations of interest) is encoded into the properties associated with the bins in the local state space. In addition to transforming the material structure into a versatile digital signal, this approach inherently treats the material structure as a stochastic process because of the probabilistic interpretation assigned to the variable $m_s^h$. The digital signal representation of structure offers many advantages including fast computation of spatial correlations [1,81,82], automated identification of salient structure features in large datasets [83], extraction of representative volume elements (RVEs) from an ensemble of datasets [3,15,84], reconstructions of structures from measured statistics [81,85–87], building of real-time searchable structure databases [67,88], and mining of high fidelity multiscale structure–performance–structure evolution correlations from physics-based models [89–92].
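As a minimal sketch of the digital signal representation, the binary case can be built from a segmented image in which every voxel carries an integer label for its local state. The function name `microstructure_function`, the array layout (local state index first, spatial indices after) and the tiny two-phase example are all assumptions made for illustration, not conventions taken from the cited work.

```python
import numpy as np

def microstructure_function(labels, n_states):
    """Return m[h, ...] = 1.0 where the voxel holds local state h, else 0.0."""
    labels = np.asarray(labels)
    # Reshape the state indices so they broadcast against the voxel grid.
    states = np.arange(n_states).reshape((n_states,) + (1,) * labels.ndim)
    return (labels[np.newaxis, ...] == states).astype(float)

# Hypothetical 2-D, two-phase structure: 0 = matrix, 1 = second phase.
labels = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]])
m = microstructure_function(labels, n_states=2)

# Probabilistic interpretation: in every voxel the values over all local
# states sum to one.
assert np.allclose(m.sum(axis=0), 1.0)

# Averaging over the spatial bins gives the volume fraction of each state.
volume_fractions = m.reshape(2, -1).mean(axis=1)
print(volume_fractions)
```

A non-binary variant would simply store fractional values in the same array, for example when a voxel straddles a phase boundary, as long as the values for each voxel still sum to one.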

Because of the absence of a natural origin from where one might start indexing the spatial bins, only the relative placement of local states in the material structure contains meaningful information. An extensible framework for rigorous quantification of spatial correlations in the material structure is available in the form of n-point spatial correlations (or n-point statistics) [1,67,81,82,93,94].

Although a number of other ad hoc measures of material structure are possible, only the n-point spatial correlations provide the most complete set of measures that are naturally organised by increasing amounts of structure information. For example, the most basic of the n-point statistics are the 1-point statistics and they reflect the probability density associated with finding a specific local state of interest at any randomly selected single point (or voxel) in the material structure. In other words, they essentially capture the information on volume fractions of the various distinct local states present in the material system. The next higher level of structure information is contained in the 2-point statistics, denoted as $f_r^{hh'}$, which capture the probability density associated with finding local states $h$ and $h'$ at the tail and head, respectively, of a prescribed vector $r$ randomly placed into the microstructure. Mathematically, these are expressed as [81,82]

$$ f_r^{hh'} = \frac{1}{S_r} \sum_{s=1}^{S_r} m_s^h \, m_{s+r}^{h'} \qquad (1) $$

Kalidindi Accelerated materials development through data science and cyberinfrastructure

154 International Materials Reviews 2015 VOL 60 NO 3


where $r$ indexes the bins in the space of vectors (generally the same binning scheme as that used for the spatial domain). In equation (1), $S_r$ denotes the number of spatial bins for which the bins indexed $s$ and $s+r$ both lie within the spatial domain of the material structure instantiation being studied. If assumptions of periodicity of the material structure are invoked (e.g. this is routinely done in evaluating the response of a selected structure using numerical approaches such as molecular dynamics (MD), dislocation dynamics, finite element models, and phase-field models), then $S_r = S$, where $S$ is the total number of spatial bins in the microstructure instantiation. It is also pointed out that computationally efficient schemes for computing the spatial correlations using discrete Fourier transforms (DFTs) have been developed and utilised successfully [81,82]. Although several of the prior studies have routinely assumed periodicity of the material structure in their computations, it is relatively simple to devise a padding scheme [95] that allows one to efficiently compute the spatial correlations using DFTs, even for the case when the assumptions of periodicity are not invoked.
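The DFT route to equation (1) can be sketched as follows. The function `two_point_stats` is a hypothetical helper written for illustration. In the non-periodic branch, zero-padding to twice the domain size is one simple choice assumed here, and it is not necessarily the padding scheme of Ref. 95. The sum over $s$ of $m_s^h m_{s+r}^{h'}$ is a cross-correlation, so its DFT is the product of the complex conjugate of the DFT of $m^h$ with the DFT of $m^{h'}$.

```python
import numpy as np

def two_point_stats(m_h, m_hp, periodic=True):
    """2-point statistics f_r^{hh'} of equation (1), computed with FFTs.

    m_h, m_hp : arrays of identical shape holding m_s^h and m_s^{h'}.
    The result is indexed by the vector r; negative components of r wrap
    around to the end of each axis (usual FFT convention).
    """
    m_h = np.asarray(m_h, dtype=float)
    m_hp = np.asarray(m_hp, dtype=float)
    axes = tuple(range(m_h.ndim))
    # Periodic case: FFT on the original grid. Otherwise zero-pad to twice the
    # size so that bins s + r falling outside the domain contribute nothing.
    shape = m_h.shape if periodic else tuple(2 * n for n in m_h.shape)

    def correlate(a, b):
        A = np.fft.rfftn(a, s=shape, axes=axes)
        B = np.fft.rfftn(b, s=shape, axes=axes)
        return np.fft.irfftn(np.conj(A) * B, s=shape, axes=axes)

    numerator = correlate(m_h, m_hp)           # sum over s of m_s^h * m_{s+r}^{h'}
    ones = np.ones(m_h.shape)
    counts = np.rint(correlate(ones, ones))    # S_r: number of valid bins for each r
    stats = np.zeros(shape)
    np.divide(numerator, counts, out=stats, where=counts > 0)
    return stats

# Hypothetical 1-D binary structure; autocorrelation of one local state
m1 = np.array([1, 0, 1, 1, 0])
f_periodic = two_point_stats(m1, m1, periodic=True)
f_nonperiodic = two_point_stats(m1, m1, periodic=False)
```

For the zero vector the autocorrelation reduces to the volume fraction of the local state, which provides a quick sanity check on any implementation.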

It should be noted that there is a tremendous leap in the amount of structure information contained in the 2-point statistics compared to the 1-point statistics. Higher-order correlations (3-point and higher) are defined in a completely analogous manner. The relationships between these microstructure measures and several of the classically defined ones are summarised in several books [53,93]. An implicit benefit of treating the material structure in a statistical framework is that it naturally leads to a quantification of the variance associated with the structure [30,53,96–100]. The variance in structure can then be combined appropriately with the other uncertainties in the process (for example, those associated with the measurements and those associated with the models used to predict overall properties or performance characteristics of interest in an engineering application) to arrive at the overall variance in the component performance. Lack of tight variances on the performance characteristics of the final product is often cited as one of the main reasons for the inability to scale a process from the laboratory scale to the manufacturing environment. As these variances can be traced to variances in material structure (produced by variances in processing), it is imperative to track the variances in the material structure using a practical approach. Once again data-driven approaches provide a way forward to addressing this challenging task [88,96].

The strongest support for the choice of n-point spatial correlations as the most appropriate measures of material structure comes from the pioneering work of Kröner [101], who has taught us that the effective properties of composite material systems can be conveniently expressed as a series sum with the structure details entering this series explicitly in the form of n-point spatial correlations. These composite theories have been generalised to a broad range of materials phenomena and have been summarised in several books [53,93,102]. There are also several reports in the literature where they have been successfully applied to estimate effective properties (both linear and non-linear) of a broad range of materials with complex structures [103–109]. Physically, the n-point spatial correlations are very effective in rigorously quantifying the local neighbourhoods in the complex internal structure of most advanced materials. As the local neighbourhoods control the local response, it is only logical that the n-point spatial correlations are the ideal measures of the material structure in formulating PSP linkages of interest in designing high performance engineering components.

Reduced-order representations of microstructure

For most structural material systems of interest in advanced technologies, the set of n-point statistics is an extremely large unwieldy set even for $n = 2$. Rigorous analyses and mining of these datasets are only possible with the application of data science tools. For example, it was recently demonstrated that techniques such as principal component analysis (PCA) [110–112] can be used to obtain objective low dimensional representations of the 2-point statistics [67,96]. Principal component analysis provides a linear transformation of high dimensional data in a new orthogonal frame where the axes are ordered according to the observed variance among the elements of the dataset. Consequently, a truncated PCA representation provides an objective (data-driven) reduced-order representation of the original data. It is emphasised here that although PCA dimensionality reduction techniques have been explored in materials problems in prior literature [69,113], they have only recently been employed on 2-point spatial correlations of microstructure in attempts to successfully extract high fidelity structure–property linkages [67,88,96,114].

As an example, let $\{ f_r \mid r = 1, 2, \ldots, R \}$ denote the truncated set of independent 2-point statistics [82] of interest in a specific application. Let $i = 1, 2, \ldots, I$ enumerate the elements of an ensemble of material structures being studied. It is generally expected that $I \le R$. In such situations, PCA identifies a maximum of $(I-1)$ orthogonal directions in the R-dimensional space that are arranged by decreasing levels of variance in the given ensemble of structures. Mathematically, the PCA representation of any member of the selected ensemble (of structures), labelled by superscript $(k)$, can be expressed as

$$ f_r^{(k)} = \sum_{i=1}^{\min((I-1),R)} \alpha_i^{(k)} \varphi_{ir} + \bar{f}_r \qquad (2) $$

where $\bar{f}_r$ is simply the averaged 2-point statistics for the entire ensemble, and $\alpha_i^{(k)}$ (referred to as PC weights) provide an objective representation of the $(k)$th structure in the new orthogonal reference frame identified by $\varphi_{ir}$ (from PCA). Another important output from the PCA is the significance of each principal component $b_i$ obtained in the eigenvalue decomposition performed as a part of the PCA [110–112]. The values of $b_i$ provide important measures of the inherent variance among the members of the ensemble of structures [96]. More importantly, by retaining only the components associated with the most significant eigenvalues, it is often possible to obtain an objective reduced-order representation of the structure with only a handful of parameters. Mathematically, this reduced-order representation can be expressed as

$$ f_r^{(k)} \approx \sum_{i=1}^{R^*} \alpha_i^{(k)} \varphi_{ir} + \bar{f}_r \qquad (3) $$


where $R^* \ll \min((I-1), R)$. Selection of $R^*$ will depend on the specific properties that need to be correlated to the structure metrics. Note also that the concepts described above can be easily extended to include higher-order statistics of the structure (e.g. 3-point spatial correlations).
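The decomposition in equations (2) and (3) can be sketched with a singular value decomposition of the mean-centred statistics. This is a minimal illustration under stated assumptions: the function names and the synthetic ensemble are hypothetical, and the significance of each principal component is taken here to be the variance of the ensemble along that direction (the eigenvalues of the sample covariance matrix).

```python
import numpy as np

def pca_of_statistics(F):
    """PCA of an ensemble of 2-point statistics.

    F : array of shape (I, R); row k holds the R statistics f_r^{(k)}.
    Returns the ensemble mean (R,), the PC weights alpha (I, n), the basis
    phi (n, R) and the variance b (n,) along each direction, n = min(I - 1, R).
    """
    F = np.asarray(F, dtype=float)
    n_structures, n_stats = F.shape
    f_bar = F.mean(axis=0)
    # Thin SVD of the centred data; rows of Vt are the orthonormal directions
    U, sing, Vt = np.linalg.svd(F - f_bar, full_matrices=False)
    n = min(n_structures - 1, n_stats)
    alpha = (U * sing)[:, :n]
    phi = Vt[:n]
    b = sing[:n] ** 2 / (n_structures - 1)
    return f_bar, alpha, phi, b

def reconstruct(f_bar, alpha, phi, r_star):
    """Equation (3): keep only the first r_star principal components."""
    return alpha[:, :r_star] @ phi[:r_star] + f_bar

# Synthetic ensemble (assumption): 8 structures, 50 statistics, lying close
# to a two-dimensional subspace
rng = np.random.default_rng(0)
F = 0.5 + rng.normal(size=(8, 2)) @ rng.normal(size=(2, 50))
F += 1e-3 * rng.normal(size=F.shape)

f_bar, alpha, phi, b = pca_of_statistics(F)
F_reduced = reconstruct(f_bar, alpha, phi, r_star=2)
```

Because the values in `b` are returned in decreasing order, inspecting how quickly they decay is one way of choosing the truncation level $R^*$ for a given application.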

The PCA representations of the n-point statistics have been successfully used in automated and efficient classification of various ensembles of structures [67,88]. An example is reproduced here in Fig. 3. Although only the first three dimensions are plotted in Fig. 3 (i.e. $R^* = 3$), it should be noted that this approach yields data-driven reduced-order representations for structure ensembles to arbitrary truncation levels selected by the user. As noted earlier, PCA provides guidance regarding the significance of each principal component ($b_i$) through which the user can make an objective decision regarding the acceptable truncation level for a specific application.

One of the benefits of the PCA representations shown in Fig. 3 is that it also quantifies the inherent variance in a given class of structures. For example, it is clear from Fig. 3 that the structures in HT3 exhibited the highest variance, whereas those in HT2 produced the lowest variance, among the five heat treatments studied. Although quantitative values of the variance were not reported in this specific study, they were explored in great detail in a subsequent study that used the same foundational concepts [96].

In the examples presented above, the local state was defined at the continuum scale and identified the specific phase found in the micrograph. However, the same methodology can be applied to a broad range of other material structures at other length scales. In a recent paper, this approach was successfully applied to quantify the semi-crystalline polymer structure datasets produced by MD simulations [115].

Structure measurements and reconstructions

The discussion above raises an important question: exactly what should we be measuring when we desire to extract the important PSP linkages needed for materials development efforts? The conventional approaches in materials science and engineering are generally focused on mapping contiguous volumes of the material internal structure in two or three dimensions at various length scales of interest. If indeed only a finite set of spatial correlations is needed in formulating PSP linkages of interest (as suggested by the PCA example presented earlier), it should be possible to develop customised protocols that focus exclusively on the important statistics and produce the required information in a cost-effective manner. This is especially true when the characterisation technique involves probing the material structure voxel-by-voxel and each measurement incurs a significant cost (e.g. measurement of crystal lattice orientations by electron back-scattered diffraction [116] and measurement of local mechanical properties using nanoindentation [117–119]). For example, Adams and co-workers [53,100,120] have demonstrated that it is much easier to recover 2-point and 3-point spatial correlations in three dimensions in polycrystalline samples, when compared to the effort involved in measuring the material structure in 3-D contiguous volumes in the same class of samples [4,14,121–123]. These authors have also demonstrated that it is often possible to recover distribution functions quantifying structure in 3-D using information gathered on 2-D sections using theories from stereology [124,125]. While these prior studies demonstrate tremendous potential for dramatically reducing the cost incurred in structure quantification, they are still very much in a nascent stage of development. Much future work is needed to further refine and critically validate these approaches and their ability to produce robust and reliable PSP linkages for a broad range of materials being developed for emerging technologies.

Fig. 3 Visualisation of an ensemble of material structures, taken from Ref. 67. Each point in the reduced-order three-dimensional PCA space represents a micrograph (examples shown on left) and each coloured volume represents a structure class. The size of each coloured region reflects the variance within the class. The axes in the 3-D plot correspond to the $\alpha_i$ in equation (3). The colour key for the different heat treatments is as follows: HT1 = Red, HT2 = Blue, HT3 = Green, HT4 = Cyan, HT5 = Magenta

After deciding what should be measured, the next question to address is how much data are needed. The goal of structure measurement, in general, should be to quantify not only the expected values of the structure measures of interest, but also their variance. It is well known that control of the structure variance is the best means to control the variance in the properties/performance of the final product. When using the framework of the n-point statistics along with the PCA representations in an orthonormal frame (in the space of statistics) described earlier, the variance can be related to the eigenvalues computed as a part of the PCA decomposition [88,96]. Roughly speaking, the variance can be mathematically related to the volumes of the regions occupied by the members of the ensemble of structures extracted from the sample (or multiple samples subjected to nominally the same processing history), as depicted in Fig. 3. Consequently, it is possible to establish a data-driven process that will objectively decide how much data are adequate to reliably estimate (to within a set accuracy limit) the distribution of the selected structure measures in any given sample.

In addition to the amount of data, one also needs to decide on a scan size in structure measurements. Within the framework of n-point statistics, the relevant length to consider in deciding on the scan size is the coherence length [53,93], defined as the length beyond which the n-point statistics (obtained on an ensemble of structures taken from a given sample) are completely uncorrelated. This length therefore depends on the specific sample being studied. For example, in a perfectly disordered structure, the coherence length is of the order of the individual spatial bin size. However, perfectly disordered structures are seldom realised in practice. As one does not know the coherence length in a given sample a priori, one needs a few preliminary measurements (for example, these could be long line scans) to establish the coherence length and then ensure that the scan size is larger than the coherence length of the structure in the given sample. Generally speaking, if the scan sizes are of the order of the coherence length, one needs to acquire a sufficiently large number of scans in order to establish reliably the variance in the desired subset of spatial correlations, as discussed earlier. The general practice in the field, however, has been to obtain very large scans (as large as practically feasible within available resources) and use a small number of these large scans instead of a large number of smaller scans from different locations in the physical sample. This practice is tantamount to sub-dividing the large scan into smaller regions and treating each smaller scan as an independent measurement (although in reality it is not!). From a statistics viewpoint, the preferred practice would be to obtain a large number of adequately sized scans (each approximately twice the coherence length) from randomly selected regions in the physical sample.
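One way a preliminary line scan might be used to estimate the coherence length is sketched below. Several simplifications are assumed for illustration only: the scan is binary, the 2-point autocorrelation of an uncorrelated structure equals the square of the volume fraction, and the coherence length is taken as the first lag at which the measured statistic comes within a user-chosen tolerance `tol` of that limit. The function name and the synthetic scans are hypothetical.

```python
import numpy as np

def coherence_length(line_scan, tol=0.02, max_lag=None):
    """First lag (in bins) at which the 2-point autocorrelation of a binary
    line scan is within tol of its uncorrelated limit f * f."""
    x = np.asarray(line_scan, dtype=float)
    max_lag = x.size // 2 if max_lag is None else max_lag
    f = x.mean()                                   # volume fraction (1-point statistic)
    for lag in range(1, max_lag + 1):
        two_point = np.mean(x[:-lag] * x[lag:])    # non-periodic estimate of f_r
        if abs(two_point - f * f) < tol:
            return lag
    return None                                    # scan too short to decide

rng = np.random.default_rng(0)
# Perfectly disordered scan: every bin is independent of its neighbours
disordered = rng.integers(0, 2, size=20000)
# Coarser structure: features that are 10 bins long
coarse = np.repeat(rng.integers(0, 2, size=20000), 10)

length_disordered = coherence_length(disordered)
length_coarse = coherence_length(coarse)
```

For the disordered scan the estimate is a single bin, consistent with the statement above, while the coarser structure returns a value close to its feature size.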

The discussion above naturally leads to the oft-debated question of how one should produce an RVE of the material structure. In the present context, it is highly desired that the RVE reflects the expected values of the important structure measures. It is noted here that most commonly adopted definitions of RVE in current literature [102,126–138] focus largely on the convergence in the prediction of selected macroscale (effective) properties and do not explicitly consider whether or not the RVE has captured the structure details to sufficient accuracy. Incidentally, the classical definition of RVE provided by Hill [139] requires the RVEs to capture both the representative structure and its homogenised effective properties. The present discussion focuses first on the structure aspects and addresses the predictions of macroscale properties later. Within the framework of the n-point statistics presented earlier, in order to faithfully capture the material structure, the RVE should reflect as closely as possible the expected values of the salient set of n-point statistics. Given the very high dimensional representations of n-point statistics, the only practical approach to this task is through the use of reduced-order representations such as those described earlier. For example, looking at Fig. 3, the goal would be to construct an RVE for any of the ensemble of structures (from any one of the heat treatments) in such a manner that the n-point statistics of the RVE would correspond to the centre of the volume of interest shown in this figure. Herein lies the main challenge of constructing RVEs that faithfully capture the main features of an ensemble of measured structures – it is often not easy to construct such structures from a prescribed set of spatial correlations.

One trivial solution to the RVE construction described above is to think of the RVE as an equally weighted representation of all the members of the selected ensemble of structures. If one were to use this approach, each member of the ensemble would represent a statistical volume element (SVE) [98,99,140,141]. In fact, if one were to follow this approach, the size of the SVEs can be significantly smaller than that of the RVE. The use of a set of SVEs of smaller volumes (instead of a single RVE) offers tremendous computational savings, especially when the macroscale properties need to be evaluated using sophisticated physics-based numerical simulations. The main disadvantage of using the equally weighted set of SVEs is simply the fact that one typically needs a fairly large number of SVEs to approximate the RVE [132], especially when SVEs are selected to be of relatively small volumes.

An alternate approach was recently presented by Niezgoda et al. [84], who introduced the concept of weighted sets of statistical volume elements (WSVEs). In this approach, the identification of a WSVE is approached as an optimisation problem that searches through all weighted combinations of the available SVEs and minimises the difference between the spatial statistics of the constructed WSVE and the ensemble averaged spatial statistics from all available SVEs, while being subjected to the following constraints: (i) the number of SVEs used to build the WSVE is limited to the number prescribed by the user and (ii) the weights assigned to the individual members of the WSVE have to be positive and sum up to one. In other words, the WSVE approximates the RVE as a set of optimally selected and weighted SVEs (from the available ensemble of SVEs) with the weights essentially representing the volume fractions of the selected SVEs in the RVE (see Fig. 4). It was demonstrated that the WSVEs established using the concepts described above automatically approximated well the effective properties associated with the larger structure datasets, while providing major computational advantages because of the dramatic reduction in the sizes and numbers of the volume elements [3,15]. This is mainly because the WSVEs efficiently capture the spatial statistics in the ensemble of SVEs (or the RVE). The computational advantages of the WSVEs were particularly impressive when computationally expensive models (e.g. coupled multiscale models, crystal plasticity) were used to estimate the effective properties or performance associated with a given microstructure [3].
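The optimisation problem described above can be sketched for small ensembles as follows. This is not the algorithm of Ref. 84. The exhaustive search over subsets and the projected-gradient weight solver are substitutes chosen here for brevity, and all function names are hypothetical. The sketch also allows zero weights, whereas the text requires positive weights. Each SVE is represented by a vector of its spatial statistics.

```python
import itertools
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def fit_weights(A, target, n_iter=500):
    """Projected-gradient least squares: min ||A w - target||^2 on the simplex."""
    w = np.full(A.shape[1], 1.0 / A.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - target)
        w = project_to_simplex(w - step * grad)
    return w

def select_wsve(sve_stats, n_select):
    """sve_stats: (N, R) statistics of N SVEs. Returns (indices, weights, error)."""
    sve_stats = np.asarray(sve_stats, dtype=float)
    target = sve_stats.mean(axis=0)            # ensemble-averaged statistics
    best = None
    for idx in itertools.combinations(range(len(sve_stats)), n_select):
        A = sve_stats[list(idx)].T             # columns = statistics of chosen SVEs
        w = fit_weights(A, target)
        err = np.linalg.norm(A @ w - target)
        if best is None or err < best[2]:
            best = (idx, w, err)
    return best

# Hypothetical ensemble: 6 SVEs, each described by 40 statistics
rng = np.random.default_rng(1)
ensemble = rng.random((6, 40))
wsve_indices, wsve_weights, wsve_error = select_wsve(ensemble, n_select=3)
```

The exhaustive search scales combinatorially with the size of the ensemble, so a larger ensemble would call for a more economical search strategy than the one assumed here.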

Automated mining of process–structure–property linkages

The structure information gathered from protocols described above has very little intrinsic value. High value (both scientific and economic) is usually derived from these structure datasets when they can be associated with appropriate information on either the properties exhibited by them or the manufacturing processes employed to modulate them (into new structures with better final properties). This additional and crucial information is typically gathered through multiscale measurements and/or execution of physics-based numerical simulation tools. It should be noted that this task usually requires allocation of significant resources and time and therefore presents a major risk to those who undertake materials development activity. The ensuing risk from such effort and time consuming tasks can be mitigated to a large extent if suitable core knowledge is mined through such activities in automated, cost-effective ways and successfully transferred to subsequent related tasks. This can be accomplished through the mining and establishment of reliable PSP linkages that can be applied to a broad range of structures (much broader than those used to establish the linkages themselves).

As discussed earlier, PSP linkages needed for the development of advanced hierarchical materials are best archived in a suitably defined low-dimensional projection of the structure space [54,67,79]. In some cases, it is possible to establish such linkages using intuitive selection of structure measures (e.g. Hall–Petch relations [142,143]). However, given the large dimensional representations demanded by the complex structures in most hierarchical material systems, it is highly desirable to explore such relationships through data-driven approaches. These novel approaches offer many benefits: (i) they allow for automation in evaluating the multiple options one invariably faces in mining the salient PSP linkages of interest from the available experimental and simulation datasets; (ii) they often cast the PSP linkages as simple metamodels (also referred to as surrogate models) that require significantly lower computational cost (when compared to the physics-based multiscale models and experiments that generated the raw data used to establish these linkages) and are potentially invertible. This feature is of significant value to the engineering design/manufacturing stakeholders in the advanced technology sectors.

Fig. 4 Illustration of the construction of a weighted set of statistical volume elements (WSVE) comprised of three weighted optimally selected statistical volume elements (SVEs) for an experimentally characterised precipitate structure. The corresponding plots of 2-point statistics are shown on the right

The reduced-order representations of the spatial correlations in the microstructure (see equation (3)) are foundational to a new data-driven framework [67,114,144] for establishing reliable low-cost structure–(homogenised) property metamodels from ensembles of experimental and/or simulation datasets. Although the establishment of the PSP linkages in this manner incurs a one-time cost, it is expected that this effort will lead to major savings in future tasks where the low-cost metamodels can be substituted for the more expensive experiments and/or simulations. For illustration of this approach, let $\left( P^{(k)}, \alpha_1^{(k)}, \alpha_2^{(k)}, \ldots, \alpha_{R^*}^{(k)} \right)$ denote one data point for each microstructure, where $(k)$ indexes a specific microstructure in an ensemble of microstructures, $P^{(k)}$ denotes a specific macroscale property of interest established either from experiments or models, and $\alpha_i^{(k)}$ denote the reduced-order representation of the microstructure (see equation (3)). Consider a dataset with $K$ (i.e. $k = 1, 2, \ldots, K$) data points. The goal is to mine high fidelity structure–property linkages from such a dataset. In recently reported case studies [67,114,144], this was successfully accomplished using simple polynomial functions and ordinary least squares linear regression techniques [36]. In order to mine such simple linkages, one needs to define a suitable error associated with each data point $E^{(k)}$ and use it appropriately in the regression method. As an example, this can be accomplished as

$$ E^{(k)} = \left| \frac{P^{(k)} - f_p\left( \alpha_1^{(k)}, \alpha_2^{(k)}, \ldots, \alpha_{R^*}^{(k)} \right)}{P^{(k)}} \right| \qquad (4) $$

where $f_p\left( \alpha_1^{(k)}, \alpha_2^{(k)}, \ldots, \alpha_{R^*}^{(k)} \right)$ denotes a $p$th-order polynomial function. The polynomial coefficients can then be established using standard protocols of minimising the sum of the squares of the residuals in the entire dataset (including all $K$ data points). Note that the accuracy of the extracted polynomial linkage depends critically on the selection of both $p$ and $R^*$. Critical selection of these parameters is essential for the extraction of high-fidelity structure–property linkages. Although higher values of $p$ and $R^*$ will always produce a lower value of the regression error, they do not necessarily increase the fidelity of the extracted linkages. This is because the higher values of $p$ and $R^*$ may lead to over-fitting of the linkages and can produce erroneous estimates in any subsequent application of the linkages to new microstructures (those not included in the regression analyses). Leave-one-out cross-validation (LOOCV) represents one of the many ways to provide guidance for objective selection of the parameters $p$ and $R^*$, while avoiding over-fitting of the data. This technique involves the training of the polynomial fit $K$ times, while leaving one data point out of the training set each time. Applied over $K$ data points, LOOCV will quantify the contribution of each data point to the coefficients of a proposed polynomial fit. For an over-fitted polynomial, the exclusion of a single data point will cause significant change in the coefficients, whereas for a good fit this change will be negligible. In summary, in this approach, one makes a judicious compromise in choosing the best fit based on a thorough consideration of the error distributions from both the regression method and the cross-validation technique.
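The regression and cross-validation protocol described above can be sketched as follows. The sketch assumes that the error of each data point is the residual normalised by $P^{(k)}$, so that the least-squares problem is posed on relative residuals. It scores LOOCV by the prediction error on the left-out point, which is a common variant of tracking the change in the coefficients. The synthetic dataset, function names and parameter values are hypothetical.

```python
import itertools
import numpy as np

def polynomial_features(alpha, p):
    """All monomials of total degree <= p in the columns of alpha (K, R*)."""
    n_points, n_vars = alpha.shape
    columns = [np.ones(n_points)]
    for degree in range(1, p + 1):
        for combo in itertools.combinations_with_replacement(range(n_vars), degree):
            columns.append(np.prod(alpha[:, list(combo)], axis=1))
    return np.column_stack(columns)

def fit(alpha, P, p):
    """Least squares on relative residuals: each row is divided by P^(k)."""
    X = polynomial_features(alpha, p) / P[:, None]
    return np.linalg.lstsq(X, np.ones_like(P), rcond=None)[0]

def relative_error(alpha, P, coeffs, p):
    """Residual of each data point normalised by the property value."""
    return np.abs(P - polynomial_features(alpha, p) @ coeffs) / np.abs(P)

def loocv_error(alpha, P, p):
    """Mean relative error on the left-out point over K refits."""
    n_points = len(P)
    errors = []
    for k in range(n_points):
        keep = np.arange(n_points) != k
        coeffs = fit(alpha[keep], P[keep], p)
        errors.append(relative_error(alpha[k:k + 1], P[k:k + 1], coeffs, p)[0])
    return float(np.mean(errors))

# Synthetic dataset (assumption): K = 40 structures, R* = 2 PC weights, and a
# property that is quadratic in the PC weights plus a small amount of noise
rng = np.random.default_rng(0)
alpha = rng.uniform(-1.0, 1.0, size=(40, 2))
P = (10.0 + 2.0 * alpha[:, 0] - alpha[:, 1] + 3.0 * alpha[:, 0] ** 2
     + 1.5 * alpha[:, 0] * alpha[:, 1] + 0.01 * rng.normal(size=40))

train_sse = {}
loocv = {}
for p in (1, 2, 3, 4):
    coeffs = fit(alpha, P, p)
    train_sse[p] = float(np.sum(relative_error(alpha, P, coeffs, p) ** 2))
    loocv[p] = loocv_error(alpha, P, p)
```

The regression error in `train_sse` can only decrease as $p$ grows, whereas the cross-validation error in `loocv` stops improving once the polynomial order exceeds what the data support, and this contrast is what guides the choice of $p$ and $R^*$.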

The data-driven approach described above offers many advantages: (i) the process of establishing the PSP linkages can be largely automated with a comprehensive exploration of different error measures, different functions for capturing the linkages, and different techniques for quantifying the degree of over-fit; (ii) the established PSP linkages can often be dynamically modified with only an incremental effort (this requires cleverly designed algorithms) when additional data become available; (iii) the error distributions computed as a part of these protocols also quantify the inherent uncertainty of the mined PSP linkages. Figure 5 depicts an example from our recent work [144], where the focus was on establishing structure–property linkages that could guide the design of the optimal processing path in a class of steels with inclusions.

In another variation of the data science approach, the computational cost of solving the numerically stiff non-linear constitutive laws of crystal plasticity theory* was reduced by about two orders of magnitude [145–149]. This was accomplished through the use of a compact database of DFTs to efficiently reproduce the solutions from the physics-based model for the main functions of the crystal plasticity theory for any given crystal orientation subjected to an arbitrary deformation mode. As with the earlier example, a special advantage of the database approaches suggested here is that trade-offs can be made by the user in terms of the desired accuracy and computation speed in any simulation through the selection of the truncation levels in the metamodel (in the case of crystal plasticity simulations this is controlled by the number of dominant DFTs retained in the metamodel).

*Crystal plasticity theories are widely used to predict the plastic anisotropy of polycrystalline materials by accounting for the fundamental mechanism of plastic deformation at the scale of the constituent single crystals by taking into account the details of slip system geometry in each individual crystal.

The structure–property linkages described earlier are aimed at passing the salient information from lower length scales to the higher length scales. However, in certain situations, it becomes necessary to simulate coupled phenomena at two well-separated length scales. As an example, consider the simulation of a complex processing operation where different macroscale spatial locations in the workpiece experience different thermal histories (often an unavoidable consequence of the boundary conditions imposed at the macroscale). Consequently, strong variations in the material structure should be expected at different locations in the workpiece. In other words, it is not enough to track the evolution of a single representative material structure for the entire workpiece. The development of such structure heterogeneities can be expected to influence the macroscale simulation by altering the local effective properties at different locations in the workpiece. In such a situation, it is necessary to track independently material structures at multiple macroscale locations in the workpiece, and pass high value information in both directions (between the constituent length scales). Accomplishing this task within the currently employed computational frameworks requires executing a very large number of numerical simulations at the lower length scale within simulations executed at a higher length scale (e.g. multilevel finite element method [150]). This makes it extremely difficult, if not impossible, to address real-world hierarchical materials design and development problems using any of the currently employed computational strategies.

The challenge described above can be addressed with modest computational resources using a data science approach called materials knowledge systems (MKS) [90–92, 151–155]. In the MKS framework, the focus is on localisation (i.e. the opposite of homogenisation) relationships that capture the spatial distribution of the response field of interest (e.g. stress or strain rate fields) at the microscale (on an RVE) for an imposed loading condition at the macroscale. In this approach, the localisation relationships are expressed as calibrated metamodels, whose specific forms are inspired by rigorously established composite theories called statistical continuum theories [81, 101, 156–158]. More specifically, these localisation linkages are expressed as a simple algebraic series whose terms systematically capture the individual contributions from a hierarchy of local structure descriptors. Each term in this series expansion is expressed as a convolution of the appropriate local structure descriptor and a physics-capturing kernel. A salient feature of the MKS approach is that the physics-capturing kernels are calibrated to results from previously validated numerical models for the multiscale phenomena being studied (for example, in studies of stress or strain localisation in a composite materials system, the MKS linkages would be calibrated to results obtained from execution of validated micromechanical finite element models on a selected ensemble of material structures). The most impressive benefit of the MKS approach lies in the dramatic reduction of the computational cost, often by several orders of magnitude compared to numerical approaches typically employed in material structure design problems. In various preliminary demonstrations, the MKS methodology has been successfully applied to capturing thermo-elastic stress (or strain) distributions in composite RVEs [90, 92, 152], rigid-viscoplastic deformation fields in composite RVEs [89], and the evolution of the composition fields in spinodal decomposition of binary alloys [91].

Figure 5. Variation of the error from the regression analyses and the cross-validation for different truncation levels in the reduced-order quantification of the spatial correlations in a class of two-phase material structures and their linkage to macroscale yield properties [145]. Examination of these errors indicates that R* = 3 and p = 4 present the best compromise between a good fit and the risk of over-fitting. The plot on the bottom left shows the match between the original data (gathered from finite element simulations of the type shown on bottom right) and the predictions of the metamodel mined from the data. A total of 400 data points were generated using finite element model simulations on an ensemble of material structures with a range of precipitate volume fractions, precipitate shapes and size distributions to establish this structure–property linkage

Kalidindi Accelerated materials development through data science and cyberinfrastructure

160 International Materials Reviews 2015 VOL 60 NO 3

Let $\langle p \rangle$ denote a macroscale imposed variable (e.g. local stress, strain or strain rate tensors) that needs to be spatially distributed in the microstructure as $p_s$ for each spatial cell indexed by $s$. In the MKS case studies completed to date, the physical quantities of interest were chosen such that $\langle p \rangle$ is equal to the volume-averaged value of $p_s$ over the microscale. In other words, the response variable was selected such that it is conserved in going between the constituent length scales. The MKS localisation relationship is expressed as [92, 153]

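$$
p_s = \left( \sum_{h=1}^{H} \sum_{t=1}^{S} a_t^h \, m_{s+t}^h + \sum_{h=1}^{H} \sum_{h'=1}^{H} \sum_{t=1}^{S} \sum_{t'=1}^{S} a_{tt'}^{hh'} \, m_{s+t}^h \, m_{s+t+t'}^{h'} + \cdots \right) \langle p \rangle \tag{5}
$$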

where the kernels $a_t^h$ and $a_{tt'}^{hh'}$ are referred to as the first-order and second-order influence coefficients, respectively, and are independent of the microstructure descriptors $m_s^h$. For multiscale problems involving elasticity, these influence coefficients are fourth-rank tensors. The influence coefficients capture the contributions of various microstructure features in the neighbourhood of the spatial position $s$ to the local response field at that position. In this notation, $t$ enumerates the bins in the vector space used to define the neighbourhood of the spatial bin of interest [80], which has been tessellated using the same scheme that was used for the spatial domain of the material internal structure. It should be noted that the influence coefficients in the localisation relationship (equation (5)) are closely related to the well-known Green's functions [81].
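As an illustration of the first-order term of the localisation relationship, the following minimal sketch evaluates it on a 1-D periodic grid, once by explicit summation over $h$ and $t$ and once through DFTs. The two-phase structure, the scalar (rather than tensorial) response, the decaying kernel values and all function names are assumptions made purely for illustration; they are not taken from the cited MKS studies.

```python
import numpy as np

def localise_direct(m, a, p_avg):
    """First-order series term by explicit summation over h and t.

    m[h, s] is the microstructure signal, a[h, t] the influence coefficients,
    and p_avg the macroscale imposed value. The grid is 1-D and periodic.
    """
    H, S = m.shape
    p = np.zeros(S)
    for s in range(S):
        for t in range(S):
            # Contribution of the cell offset by t from cell s (periodic wrap).
            p[s] += a[:, t] @ m[:, (s + t) % S]
    return p * p_avg

def localise_dft(m, a, p_avg):
    """Same quantity evaluated with DFTs over the spatial index."""
    beta = np.fft.fft(a, axis=1)   # DFT of the influence coefficients over t
    M = np.fft.fft(m, axis=1)      # DFT of the microstructure signal over s
    P = np.sum(np.conj(beta) * M, axis=0)
    return np.fft.ifft(P).real * p_avg

# Hypothetical two-phase structure and made-up, decaying influence coefficients.
rng = np.random.default_rng(0)
H, S = 2, 16
phase = rng.integers(0, H, size=S)
m = np.eye(H)[phase].T                      # m[h, s] = 1 if cell s is phase h
distance = np.minimum(np.arange(S), S - np.arange(S))
a = rng.normal(size=(H, S)) * np.exp(-distance)

p_direct = localise_direct(m, a, p_avg=0.02)
p_dft = localise_dft(m, a, p_avg=0.02)
print(np.allclose(p_direct, p_dft))
```

The explicit double loop costs on the order of $S^2$ operations per local state, whereas the DFT route costs on the order of $S \log S$, which is one reason the series form lends itself to fast evaluation.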

The calibration of the first-order term in the MKS series is made possible by the fact that equation (5) takes a much simpler form when transformed into the DFT space, which can be expressed as

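$$
P_k = \left[ \sum_{h=1}^{H} \left( \beta_k^h \right)^* M_k^h \right] \langle p \rangle, \qquad \beta_k^h = \mathcal{F}_k\left( a_t^h \right), \qquad P_k = \mathcal{F}_k\left( p_s \right), \qquad M_k^h = \mathcal{F}_k\left( m_s^h \right) \tag{6}
$$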

where $\mathcal{F}_k(\cdot)$ denotes the DFT operation with respect to the spatial variables $s$ or $t$, and the superscript * denotes the complex conjugate. Note that the number of coupled first-order coefficients in equation (6) is only $H$, although the total number of first-order coefficients still remains $S \times H$. This simplification is a direct consequence of the well-known convolution properties of DFTs [159]. Because of this dramatic uncoupling of the influence coefficients into smaller sets, it becomes trivial to estimate the values of the influence coefficients $\beta_k^h$ (in the DFT space) using standard regression methods. It is emphasised here that establishing $\beta_k^h$ is a one-time (calibration) computational task for a selected composite material system and a selected physical phenomenon of interest (including a description of the boundary conditions).
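The frequency-by-frequency regression described above can be sketched as follows. This is only a toy illustration under stated assumptions: the "training" responses are generated from made-up kernels rather than from finite element simulations, the grid is 1-D and periodic, the response is a scalar, and the function names and the `rcond` cut-off are choices made here, not part of the published calibration procedure.

```python
import numpy as np

def calibrate_first_order(ms, ps, p_avg, rcond=1e-8):
    """Estimate conj(beta_k^h) by least squares, one frequency k at a time.

    ms has shape (N, H, S): N calibration structures, H local states, S cells.
    ps has shape (N, S): the corresponding response fields.
    """
    N, H, S = ms.shape
    M = np.fft.fft(ms, axis=2)
    P = np.fft.fft(ps, axis=1) / p_avg
    beta_conj = np.zeros((H, S), dtype=complex)
    for k in range(S):
        # N equations in only H unknowns for this frequency. rcond discards
        # near-zero singular values: indicator signals sum to one over h, so
        # the columns are linearly dependent for k != 0.
        beta_conj[:, k] = np.linalg.lstsq(M[:, :, k], P[:, k], rcond=rcond)[0]
    return beta_conj

def predict_first_order(m, beta_conj, p_avg):
    """Evaluate the first-order localisation for one structure m[h, s]."""
    M = np.fft.fft(m, axis=1)
    return np.fft.ifft(np.sum(beta_conj * M, axis=0)).real * p_avg

def random_structure(rng, S):
    """Hypothetical two-phase 1-D structure with a random volume fraction."""
    fraction = rng.uniform(0.2, 0.8)
    phase = (rng.random(S) < fraction).astype(int)
    return np.eye(2)[phase].T               # shape (2, S)

rng = np.random.default_rng(0)
H, S, N, p_avg = 2, 32, 30, 0.02

# Made-up "true" kernels that stand in for a validated numerical model.
distance = np.minimum(np.arange(S), S - np.arange(S))
a_true = rng.normal(size=(H, S)) * np.exp(-0.5 * distance)
beta_conj_true = np.conj(np.fft.fft(a_true, axis=1))

ms = np.stack([random_structure(rng, S) for _ in range(N)])
ps = np.stack([predict_first_order(m, beta_conj_true, p_avg) for m in ms])

beta_conj = calibrate_first_order(ms, ps, p_avg)

# Check the calibrated linkage on a structure that was not used for fitting.
m_new = random_structure(rng, S)
p_true = predict_first_order(m_new, beta_conj_true, p_avg)
p_pred = predict_first_order(m_new, beta_conj, p_avg)
print(np.max(np.abs(p_pred - p_true)))
```

Each regression involves only $H$ unknowns, so the cost of calibration grows linearly with the number of spatial frequencies. Because the indicator signals of all local states sum to one in every cell, the individual coefficients are not uniquely determined for $k \neq 0$; the sketch therefore compares predicted fields rather than the coefficients themselves.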

The calibration procedures for the influence coefficients have been discussed in detail in prior publications [90, 92]. Briefly, the influence coefficients were calibrated using digitally created microscale volume elements (MVEs) subjected to selected periodic boundary conditions in finite element simulations. In prior work, periodic boundary conditions were utilised [90, 92, 160–162], as they are particularly well suited for DFT representations. It should also be noted that the selection of the size of the MVE can have a significant influence on the calibrated values of the influence coefficients. As the influence coefficients are expected to decay to zero values for increasing values of $t$, the localisation captured by equation (5) is associated with a finite interaction zone or finite memory. In order to capture the spatial characteristics of localisation accurately, it is recommended that the MVE size used for generating the calibration datasets be at least twice the size of the interaction zone. Since the size of the interaction zone is not known a priori, a few trials are typically needed to establish a suitable MVE size for a given material system and a selected physical phenomenon. Finally, it is also important to ensure that the MVEs are large enough that the boundary conditions do not significantly impact the calibrated values of the influence functions.

The influence functions established on smaller spatial domains (MVEs) can be easily extended and applied to significantly larger spatial domains such as those needed to represent RVEs [92]. As the influence functions decay sharply with increasing $t$ (just like Green's functions), they can be extended to larger spatial domains by simply padding the functions with zeros. It was demonstrated [92] that the trivially extended influence coefficients accurately reproduced the microscale spatial distribution of the desired field on the larger MVEs with about the same accuracy that was realised for the smaller MVEs.
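A minimal sketch of the zero-padding step is given below, assuming a 1-D periodic grid on which the offset $t = 0$ is stored at index 0 and negative offsets wrap around to the end of the array (the usual DFT storage order). The function name and the example kernel values are hypothetical.

```python
import numpy as np

def pad_influence_coefficients(a_small, S_large):
    """Extend kernels a_small[h, t] to a larger periodic domain with zeros.

    The zeros are inserted at the large offsets (the middle of the array in
    DFT storage order), so values near t = 0 and t = -1, -2, ... keep their
    meaning on the larger grid.
    """
    H, S_small = a_small.shape
    centred = np.fft.fftshift(a_small, axes=1)      # move t = 0 to the centre
    left = S_large // 2 - S_small // 2
    right = (S_large - S_small) - left
    padded = np.pad(centred, ((0, 0), (left, right)))
    return np.fft.ifftshift(padded, axes=1)         # move t = 0 back to index 0

# Made-up kernel with support on the offsets t = 0, +1 and -1 only.
a_small = np.zeros((1, 8))
a_small[0, [0, 1, 7]] = [1.0, 0.5, 0.25]

a_large = pad_influence_coefficients(a_small, 20)
print(a_large[0, 0], a_large[0, 1], a_large[0, -1])
```

Padding in the centred layout, rather than appending zeros to the end of the array, is what keeps the negative offsets attached to the correct neighbours on the enlarged grid.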

Figure 6 demonstrates the accuracy of the MKS approach for predicting the local rigid-perfectly plastic response in an example material structure with two isotropic phases [89]. The error between the MKS predictions and the FEM analysis was quantified in each spatial bin, and the average error in the MKS predictions was noted to be only 2.2%. More importantly, the FE analysis, which used 93 × 93 × 93 3-D elements, could not be performed on a regular desktop PC. It was executed on an IBM e1350 supercomputing system (part of The Ohio Supercomputer Centre) and required 94 processor hours. In contrast, the MKS method took only 32 s on a regular laptop (2 GHz CPU and 2 GB RAM).
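For a rough sense of scale, the two timings quoted above can be compared directly. This is only an indicative ratio, since it sets processor time on a supercomputer against elapsed time on a laptop:

```latex
$$
\frac{94\ \text{processor hours} \times 3600\ \text{s h}^{-1}}{32\ \text{s}}
= \frac{338\,400\ \text{s}}{32\ \text{s}}
\approx 1.06 \times 10^{4}
$$
```

That is, a difference of roughly four orders of magnitude for this example.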

Integration and collaboration platforms

The data science tools described earlier are aimed at mining the low-dimensional representations of the important PSP linkages critically needed to dramatically accelerate the rate at which new materials are designed, developed, and deployed in new high performance products introduced in the market place. However, to fully realise these ambitious goals, it is imperative to develop and validate suitable protocols for effective integration of the core materials knowledge (i.e. PSP linkages) in manufacturing process simulation and product design tools. Historically, this integration has not been easy (see Fig. 7). There exists a fundamental disconnect between how knowledge is sought and expressed in the materials and manufacturing fields. Experts in materials science often express the knowledge they accumulate from their experiments and models as highly simplified PSP linkages. Their desire to seek simplified PSP linkages is largely a byproduct of the usage of simplified intuitive measures for the quantification of the complex hierarchical material structure. However, these PSP linkages are rarely cast in a form suitable for the formulation of the internal state variable theories used widely in the manufacturing process simulation tools (and likewise in product design tools) to describe the material constitutive response. This is because most internal state variable theories use sophisticated tensorial descriptors of the material internal state, which do not necessarily connect directly with the physical quantities measured and modelled by the materials experts. As a consequence of this fundamental disconnect in the practices in these two fields, integration of the materials knowledge into broadly used manufacturing simulation and product design tools continues to experience major hindrances.

Figure 6. Comparison of the contour maps of the local $\dot{\varepsilon}_{11}$ component of the strain rate tensor for a 3-D material structure predicted by the MKS approach and the conventional micromechanical finite element simulations. The middle section of the 3-D RVE used in the calculation is shown at the top (a), while the predicted strain rate contours by the FE method (b) and the MKS established in this work (c) for the same section are shown below. Both phases are assumed to exhibit isotropic plasticity with yield strengths of 200 and 250 MPa, respectively. The macroscopic applied strain rate is 0.02 s⁻¹

The data-driven approaches described in this paper offer an alternative that might address the challenge described above. The approaches described earlier are capable of organising the core materials knowledge (i.e. PSP linkages) as either low-cost metamodels or easily accessible databases that can be directly integrated into manufacturing simulation and product design tools (see Fig. 7). Preliminary examples of such integration have been demonstrated in recent work [145, 151] and have identified major computational advantages. In other words, data sciences can serve as an effective and direct integrator of the core materials knowledge into various components of the product design and manufacturing value chain. This would, however, be possible only through an intimate cross-disciplinary collaboration between materials experts, design/manufacturing experts, and data scientists. Because of the many barriers that currently exist between these fields (e.g. differences in approaches and terminology), it is imperative to design and build novel integration platforms (i.e. cyberinfrastructure) that are specifically designed to enhance and accelerate such collaborations. Some of the desired components of this supporting cyberinfrastructure include (i) automated protocols for capturing and tracking data provenance through its many adaptations by the collaboration team members, (ii) automated protocols for the identification of the salient aspects of the data (i.e. metadata) and sharing them with cross-disciplinary team members, (iii) community building of ontologies and domain lexicons that enable and promote meaningful exchange of ideas, data, tools, and knowledge between cross-disciplinary team members, and (iv) a code repository with versioning. In essence, the approach described here can be referred to as DC-MGI or DC-ICME, and encompasses a data science and cyberinfrastructure supported approach to practical realisation of the materials genome initiative (MGI) [9] and integrated computational materials engineering (ICME) [10] visions (see Fig. 8).
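To make component (i) more concrete, the sketch below shows one possible way of recording data provenance as a chain of linked records, each identified by a hash of its own contents. It is a hypothetical illustration only: the class, field and function names are invented here, and no particular platform or standard is implied.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class ProvenanceRecord:
    """One step in the history of a dataset (all field names are hypothetical)."""
    author: str
    action: str
    content_sha256: str                 # fingerprint of the data after this step
    parent_id: Optional[str] = None     # record this step was derived from
    metadata: dict = field(default_factory=dict)

    @property
    def record_id(self) -> str:
        # Hash of the record itself, so any later alteration is detectable.
        payload = json.dumps(
            {
                "author": self.author,
                "action": self.action,
                "content": self.content_sha256,
                "parent": self.parent_id,
                "metadata": self.metadata,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

def record_step(store, data, author, action, parent=None, **metadata):
    """Create a record for `data` (bytes) and file it in `store` (a dict)."""
    rec = ProvenanceRecord(
        author=author,
        action=action,
        content_sha256=hashlib.sha256(data).hexdigest(),
        parent_id=parent.record_id if parent else None,
        metadata=metadata,
    )
    store[rec.record_id] = rec
    return rec

def lineage(store, rec):
    """Return the sequence of actions from the original data to `rec`."""
    chain = [rec]
    while chain[-1].parent_id is not None:
        chain.append(store[chain[-1].parent_id])
    return [r.action for r in reversed(chain)]

store = {}
raw = record_step(store, b"raw scan", "materials expert", "measured",
                  instrument="hypothetical microscope")
cleaned = record_step(store, b"cleaned scan", "data scientist", "cleaned",
                      parent=raw)
print(lineage(store, cleaned))
```

Linking each record to its parent by identifier is what allows the full history of a dataset to be reconstructed after it has passed through several team members.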

Summary and outlook

This paper has summarised the current status of an emerging framework for accelerating the development of new/improved hierarchical materials on the foundations of data sciences and cyberinfrastructure, while fully leveraging the recent advances in both experimental and computational sciences. Although the initial results described above are very promising, it should be clear that they represent the very early stages of this nascent field. The framework described above needs several extensions before it can be applied to a large number of complex material systems explored in advanced technologies. In the results presented here, the local states of the materials were considered to be relatively simple and the materials phenomena explored were also relatively simple. Furthermore, most of the case studies completed to date considered mainly meso-length scales. It is therefore necessary to extend the framework and tools presented here to more realistic material systems where the material structure definitions demand the use of continuous state variables (e.g. polycrystalline materials where the local state description requires the specification of some combination of composition, phase identifier, and crystal lattice orientation) and span multiple length scales (from atomistic to the macroscale). Such extensions will in turn allow exploration of more complex materials phenomena encountered in typical manufacturing process routes (e.g. thermo-mechanical treatments) and in service conditions (e.g. fatigue).

Figure 7. Schematic depiction of the current and proposed protocols for integration of high value materials knowledge into manufacturing process and product design simulation tools

Figure 8. Schematic depiction of the DC-materials genome initiative/integrated computational materials engineering (DC-MGI/ICME) approach

As a simple example, consider a materials system where the local state in the material structure requires the description of the chemical composition (i.e. the structure description requires specification of the spatial distribution of the chemical composition). Let $c_s$ denote the average chemical composition in the spatial bin $s$ (suitably defined at the hierarchical length scale of interest). In prior work [91], this structure description was converted to a digital signal $m_s^h$ in a trivial manner by using the local state identifier $h$ to index the primitive binning of the local state space. In other words, the range of composition $c \in (C_a, C_b)$ was divided into a convenient number of bins, and each bin in this range was indexed with a different value of $h$ to allow the easy conversion to a versatile digital signal $m_s^h$ that can be readily used with the spatial statistics calculators (based on DFT methods [81, 82]). While this approach produced reasonable results, it is clearly not computationally efficient, especially when a large number of local state bins are needed to capture the complex underlying physics of the problem (such as the structure datasets produced from phase-field simulations). A better approach would be to explore new spectral representations of functions over the continuous local state spaces of interest as they are likely to produce computationally efficient representations of the structure field. As an example, for the chemical composition variable mentioned earlier, preliminary (not yet published) work has indicated that spectral representations using Legendre polynomials produce highly efficient and compact representations. In a similar vein, it was very recently demonstrated [155] that generalised spherical harmonics [163] serve as an excellent spectral basis for functions on the crystal lattice orientation space (needed in describing polycrystalline microstructures). If this is properly accomplished in a rigorous mathematical framework, it should be possible to obtain the most compact spectral representations of the n-point spatial correlations of a complex microstructure that are particularly suited to data-driven approaches, where the higher-order terms in the spectral series would be explored on an as-needed basis (as demanded by available data) with incremental non-redundant effort (i.e. the additional terms in the series do not change the values of the terms already included in the series).
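The potential gain in compactness can be illustrated with a small numerical comparison. The sketch below represents a smooth function of composition using the same number of terms in two ways: primitive binning (one constant per bin) and a truncated Legendre series. The composition range, the test function and the number of terms are all made up for illustration, and the sketch does not reproduce the unpublished work mentioned above.

```python
import numpy as np
from numpy.polynomial import legendre

C_A, C_B = 0.2, 0.8        # hypothetical composition range (C_a, C_b)
N_TERMS = 6                # same number of terms for both representations

def to_reference_interval(c):
    """Map composition in [C_A, C_B] to [-1, 1], where Legendre polynomials live."""
    return 2.0 * (c - C_A) / (C_B - C_A) - 1.0

def smooth_function(c):
    """Made-up smooth dependence on composition, used only as a test case."""
    return np.exp(-3.0 * c) + 0.5 * np.sin(4.0 * c)

c = np.linspace(C_A, C_B, 2001)
xi = to_reference_interval(c)
values = smooth_function(c)

# Primitive binning: one constant (the bin average) per local state bin h.
edges = np.linspace(C_A, C_B, N_TERMS + 1)
bin_index = np.clip(np.digitize(c, edges) - 1, 0, N_TERMS - 1)
bin_values = np.array([values[bin_index == h].mean() for h in range(N_TERMS)])
err_bins = np.max(np.abs(values - bin_values[bin_index]))

# Spectral representation: least-squares Legendre series with N_TERMS coefficients.
coeffs = legendre.legfit(xi, values, deg=N_TERMS - 1)
err_legendre = np.max(np.abs(values - legendre.legval(xi, coeffs)))

print(f"max error, {N_TERMS} bins:           {err_bins:.2e}")
print(f"max error, {N_TERMS} Legendre terms: {err_legendre:.2e}")
```

For smooth functions such as this one, the truncated series is far more accurate than binning with the same number of terms; the advantage would be smaller for functions with sharp jumps in composition.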

A second critically needed extension to the framework presented here for the computations of the n-point statistics may focus on the treatment of point cloud datasets such as those produced in MD simulations. As these datasets do not typically provide data on a uniform spatial grid, the techniques described here need further refinement to be computationally efficient for such datasets. One possible direction would be to develop efficient computational protocols to convert point cloud datasets into digital microstructure signals described on a uniform spatial grid. Another option is to explore the use of special algorithms that compute Fourier transforms efficiently for data on a non-uniform grid. It is also possible that sometimes the microstructure information cannot be expressed as point data. This might happen in describing complex defect structures (e.g. dislocation structures). Further enhancements to the framework are needed to address such situations.
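The first of these directions can be sketched very simply: the points are counted into the cells of a uniform grid, which yields a signal that DFT-based tools can consume. The example below is only a naive illustration, assuming a cubic periodic box and randomly placed points in place of real MD output; the function name and parameters are hypothetical, and a practical protocol would likely need smoother binning and per-species signals.

```python
import numpy as np

def point_cloud_to_grid(points, box_length, n_cells):
    """Count points into a uniform grid of n_cells per dimension.

    points has shape (n_points, dim); coordinates are wrapped into the
    periodic box [0, box_length) before binning.
    """
    dim = points.shape[1]
    edges = [np.linspace(0.0, box_length, n_cells + 1)] * dim
    counts, _ = np.histogramdd(points % box_length, bins=edges)
    return counts

# Randomly placed points stand in for atom positions from a simulation.
rng = np.random.default_rng(1)
points = rng.uniform(0.0, 10.0, size=(500, 3))

grid = point_cloud_to_grid(points, box_length=10.0, n_cells=8)
print(grid.shape, int(grid.sum()))
```

The choice of `n_cells` trades spatial resolution against noise: finer grids resolve more detail but leave many cells empty.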

The reconstruction of the microstructure from spatial correlations represents a major gap at this time. Although it is possible to reconstruct a specific image from a knowledge of the complete set of its 2-point statistics [81], there is hardly any reason or motivation to do so. Instead, the desire is to reconstruct RVEs from an ensemble of microstructures. Although this paper presented one approach to this problem, there is a clear need for much more future work in this direction. Furthermore, the more useful reconstructions of very high practical value are the reconstructions from partial datasets. For example, one often might have only a partial set of experimentally measured spatial correlations (e.g. 2-D scans on specific sections into the sample). Also, one might be interested in reconstructing microstructures from the reduced-order PCA representations to make physical connections between the PCs and specific microstructural features. All these problems are likely to be of high value to future work in ICME and MGI efforts.
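For readers less familiar with the quantities being reconstructed from, the following minimal sketch computes 2-point statistics of a periodic image with DFTs and checks one value against a direct average. The random two-phase image and the function name are assumptions for illustration; the sketch covers only the forward computation, not the reconstruction problem discussed above.

```python
import numpy as np

def two_point_statistics(m_h, m_hp):
    """f[r] = (1/S) * sum over s of m_h[s] * m_hp[s + r] on a periodic grid."""
    S = m_h.size
    F = np.conj(np.fft.fftn(m_h)) * np.fft.fftn(m_hp)
    return np.fft.ifftn(F).real / S

# Hypothetical 2-D two-phase image; phase1 is the indicator of one phase.
rng = np.random.default_rng(2)
phase1 = (rng.random((32, 32)) < 0.3).astype(float)

auto = two_point_statistics(phase1, phase1)

# Direct evaluation for a single vector r, as a check on the DFT route.
r = (3, 5)
direct = np.mean(phase1 * np.roll(phase1, shift=(-r[0], -r[1]), axis=(0, 1)))
print(np.isclose(auto[r], direct), np.isclose(auto[0, 0], phase1.mean()))
```

The value at the zero vector equals the volume fraction of the phase, which is a convenient sanity check on any implementation.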

As noted earlier, it is anticipated that most advanced materials used in emerging technologies will demand a tiered description to address the hierarchical material internal structure (spanning multiple length scales). In this paper, the assumption of well separated length scales was implicitly invoked, as is routinely done in working with most composite theories. In other words, it is assumed that the same overall philosophy can be applied repeatedly, as many times as needed, in describing materials whose structures exhibit salient features at multiple well separated length scales. In practice, there might be several situations where the separation of length scales is not achieved. In such situations, one is forced to employ the spatial resolution of the smaller length scale involved and extend the RVEs to obtain a statistically meaningful representation of the spatial correlations for the larger length scale. In such situations, RVEs can become extremely large. Furthermore, the use of the RVE concept itself can encounter additional limitations in practice. For example, in thin films or graded materials, the assumption of statistical homogeneity might fail. Another example would be in applications where the structure features of interest are rare occurrences (e.g. features responsible for fatigue damage initiation), where extremely large RVEs might be needed; these might even be as large as the entire sample.

Multiscale measurements play an important role in the realisation of the goals articulated in this paper as they provide the critical data needed to improve and validate the material structure-sensitive models (i.e. model maturity). In particular, new measurement protocols are critically needed for combinatorial synthesis and/or high throughput processing and characterisation (structure and response) aimed at rapid exploration of the multiscale PSP linkages in hierarchical materials. For example, traditional approaches that combine material structure characterisation and standard mechanical testing (using simple tension or simple compression) evaluate material responses one material structure at a time and therefore produce a relatively low volume of high quality data at a relatively high cost. However, this may not present the best strategy for accelerated development of new/improved materials. It might be more cost-effective to pursue testing protocols that allow high throughput material structure prototyping (e.g. single or double cone tests, Jominy bars) to be combined with fast quantification of structures (e.g. customised protocols for the direct measurement of salient spatial correlations in the structure) along with local evaluation of mechanical properties (e.g. indentation methods). Such new protocols, which can provide the critical data at the requisite speed, cost and accuracy needed to support objective decision making in the materials development efforts, present an exciting new direction for research in support of MGI and ICME.

As a final note, it is emphasised here that the data-driven approaches described here for establishing the materials core knowledge (i.e. PSP linkages or metamodels) are ideally suited for incorporation into multiscale robust design approaches such as the inductive design exploration method (IDEM) [40, 50, 164]. Implementation of IDEM requires formulation of PSP metamodels at various levels of material hierarchy, along with a rigorous quantification of the associated uncertainty. Integrating the PSP metamodels with the robust design framework of IDEM represents an exciting new direction for research that can provide a practical pathway for addressing the grand challenges described in this paper.

Acknowledgement

The author acknowledges funding from the Office of Naval Research (ONR) award N00014-11-1-0759 (Dr William M. Mullins, program manager). The author also acknowledges numerous discussions with colleagues Professor David McDowell and Dr Tony Fast on the various concepts presented and discussed in this paper.

References

1. D. T. Fullwood, S. R. Niezgoda, B. L. Adams and S. R. Kalidindi: ‘Microstructure sensitive design for performance optimization’, Prog. Mater. Sci., 2010, 55, (6), 477–562.
2. A. J. Schwartz, M. Kumar and B. L. Adams: ‘Electron backscatter diffraction in materials science’; 2000, New York, Kluwer Academic/Plenum Publishers.
3. S. M. Qidwai, D. M. Turner, S. R. Niezgoda, A. C. Lewis, A. B. Geltmacher, D. J. Rowenhorst and S. R. Kalidindi: ‘Estimating response of polycrystalline materials using sets of weighted statistical volume elements (WSVEs)’, Acta Mater., 2012, 60, 5284–5299.
4. D. J. Rowenhorst, A. C. Lewis and G. Spanos: ‘Three-dimensional analysis of grain topology and interface curvature in a β-titanium alloy’, Acta Mater., 2010, 58, (16), 5511–5519.
5. E. Baer, A. Hiltner and H. D. Keith: ‘Hierarchical structure in polymeric materials’, Science, 1987, 235, (4792), 1015–1022.
6. N. M. Hancox: ‘Biology of bone’; 1972, Cambridge, Cambridge University Press.
7. J. D. Currey: ‘The many adaptations of bone’, J. Biomech., 2003, 36, (10), 1487–1495.
8. J. D. Currey: ‘The mechanical adaptations of bones’; 1984, Princeton, Princeton University Press.
9. NSTC: ‘Materials genome initiative for global competitiveness’, National Science and Technology Council, 2011.
10. T. M. Pollock et al.: ‘Integrated computational materials engineering: a transformational discipline for improved competitiveness and national security’; 2008, Washington, DC, The National Academies Press.
11. D. L. McDowell and T. L. Story: ‘New Directions in Materials Design Science and Engineering (MDS&E)’, Report of a NSF DMR-sponsored workshop, The Georgia Center for Advanced, Atlanta, October 19–21, 1998.
12. J. E. Spowart: ‘Automated serial sectioning for 3-D analysis of microstructure’, Scr. Mater., 2006, 5, 5–10.
13. J. E. Spowart, H. M. Mullens and B. T. Puchala: ‘Collecting and analyzing microstructures in three dimensions: a fully automated approach’, J. Miner. Met. Mater., 2003, 55, (10), 35–37.
14. M. P. Echlin, A. Mottura, C. J. Torbet and T. M. Pollock: ‘A new TriBeam system for three-dimensional multimodal materials analysis’, Rev. Sci. Instrum., 2012, 83, (2), 023701.
15. E. A. Wargo, A. C. Hanna, A. Cecen, S. R. Kalidindi and E. C. Kumbur: ‘Selection of representative volume elements for pore-scale analysis of transport in fuel cell materials’, J. Power Sources, 2012, 197, 168–179.
16. P. G. Kotula, G. S. Rohrer and M. P. Marsh: ‘Focused ion beam and scanning electron microscopy for 3D materials characterization’, MRS Bull., 2014, 39, (04), 361–365.
17. J. Villanova, C. Peter, S. Heikki, L. Jerome, U.-V. Francois, L. Elisa, D. Gerard, B. Pierre, J. David, R. Denis, L. Aaron and M. Christophe: ‘Multi-scale 3D imaging of absorbing porous materials for solid oxide fuel cells’, J. Mater. Sci., 2014, 49, (16), 5626–5634.
18. M. Ebner, F. Geldmacher, F. Marone, M. Stampanoni and V. Wood: ‘X-ray tomography of porous, transition metal oxide based lithium ion battery electrodes’, Adv. Ener. Mater., 2013, 3, (7), 845–850.
19. O. Betz, U. Wegst, D. Weide, M. Heethoff, L. Helfen, W. K. Lee and P. Cloetens: ‘Imaging applications of synchrotron x-ray micro-tomography in biological morphology and biomaterial science. I. General aspects of the technique and its advantages in the analysis of arthropod structures’, J. Microsc., 2007, 227, (1), 51–71.
20. A. Stienon, A. Fazekas, J.-Y. Buffiere, A. Vincent, P. Daguier and F. Merch: ‘A new methodology based on X-ray micro-tomography to estimate stress concentrations around inclusions in high strength steels’, Mater. Sci. Eng. A, 2009, 513–514, 376–383.
21. H. Proudhon, J. Y. Buffiere and S. Fouvry: ‘Three-dimensional study of a fretting crack using synchrotron X-ray micro-tomography’, Eng. Fract. Mech., 2007, 74, (5), 782–793.
22. J. F. Bingert et al.: ‘High-energy diffraction microscopy characterization of spall damage’, in ‘Dynamic behavior of materials’, (eds. B. Song, D. Casem, and J. Kimberley), Vol. 1, 397–403; 2014, New York, Springer.
23. L. Wang, M. Li, J. Almer and T. Bieler: ‘Microstructural characterization of polycrystalline materials by synchrotron X-rays’, Front. Mater. Sci., 2013, 7, (2), 156–169.
24. R. Pokharel, J. Lind, A. K. Kanjarala, R. A. Lebensohn, S. F. Li, P. Kenesei, R. M. Suter and A. D. Rollett: ‘Polycrystal plasticity: comparison between grain-scale observations of deformation and simulations’, Annu. Rev. Condens. Matter Phys., 2014, 5, (1), 317–346.
25. E. B. Gulsoy, A. J. Shahani, J. W. Gibbs, J. L. Fife and P. W. Voorhees: ‘Four-dimensional morphological evolution of an aluminum silicon alloy using propagation-based phase contrast X-ray tomographic microscopy’, Mater. Trans., 2014, 55, (1), 161–164.
26. B. Barwick, H. S. Park, O. H. Kwon, J. S. Baskin and A. H. Zewail: ‘4D imaging of transient structures and morphologies in ultrafast electron microscopy’, Science, 2008, 322, (5905), 1227–1231.
27. M. K. Miller and R. G. Forbes: ‘Atom probe tomography’, Mater. Character., 2009, 60, (6), 461–469.
28. I. Arslan, E. A. Marquis, M. Homer, M. A. Hekmaty and N. C. Bartelt: ‘Towards better 3-D reconstructions by combining electron tomography and atom-probe tomography’, Ultramicroscopy, 2008, 108, (12), 1579–1585.
29. K. Kirane, S. Ghosh, M. Groeber and A. Bhattacharjee: ‘Grain level dwell fatigue crack nucleation model for Ti alloys using crystal plasticity finite element analysis’, J. Eng. Mater. Technol. Trans. ASME, 2009, 131, (2), 021003.
30. C. P. Przybyla and D. L. McDowell: ‘Simulated microstructure-sensitive extreme value probabilities for high cycle fatigue of duplex Ti-6Al-4V’, Int. J. Plast., 2011, 27, (12), 1871–1895. [Special Issue in Honor of Nobutada Ohno].


31. D. L. McDowell and F. P. E. Dunne: ‘Microstructure-sensitive

computational modeling of fatigue crack formation’, Int. J.

Fatigue, 2010, 32, (9), 1521–1542. [Special Issue on Emerging

Frontiers in Fatigue].

32. B. L. Wang, Y. H. Wen, J. Simmons and Y. Wang: ‘Systematic

approach to microstructure design of Ni-base alloys using

classical nucleation and growth relations coupled with phase field

modeling’, Metall. Mater. Trans. A., 2008, 39A, (5), 984–993.

33. Y. H. Wen, J. P. Simmons, C. Shen, C. Woodward and Y. Wang:

‘Phase-field modeling of bimodal particle size distributions during

continuous cooling’, Acta Mater., 2003, 51, (4), 1123–1132.

34. S. Ghosh, Z. Nowak and K. Lee: ‘Quantitative characterization

and modeling of composite microstructures by Voronoi cells’,

Acta Mater., 1997, 45, (6), 2215–2234.

35. S. Ghosh, K. Lee and S. Moorthy: ‘Multiple scale analysis of

heterogeneous elastic structures using homogenization theory and

voronoi cell finite element method’, Int. J. Solids Struct., 1995, 32,

(1), 27–62.

36. V. G. Kouznetsova, M. G. D. Geers and W. A. M. Brekelmans:

‘Multi-scale second-order computational homogenization of

multi-phase materials: a nested finite element solution strategy’,

Comput. Methods Appl. Mech. Eng., 2004, 193, (48–51), 5525–

5550.

37. V. Kouznetsova, M. G. D. Geers and W. A. M. Brekelmans:

‘Multi-scale constitutive modelling of heterogeneous materials

with a gradient-enhanced computational homogenization

scheme’, Int. J. Numer. Methods Eng., 2002, 54, (8), 1235–1260.

38. H. Kadowaki and W. K. Liu: ‘Bridging multi-scale method for

localization problems’, Comput. Methods Appl. Mech. Eng., 2004,

193, (30–32), 3267–3302.

39. D. L. McDowell, H.-J. Choi, J. H. Panchal, R. Austin, J. K. Allen

and F. Mistree: ‘Plasticity-related microstructure–property rela-

tions for materials design’, Key Eng. Mater., 2007, 34, (0–341),

21–30.

40. H. J. Choi, D. L. Mcdowell, J. K. Allen and F. Mistree: ‘An

inductive design exploration method for hierarchical systems

design under uncertainty’, Eng. Optim, 2008, 40, (4), 287–307.

41. D. J. Luscher, D. L. McDowell and C. A. Bronkhorst: ‘A second

gradient theoretical framework for hierarchical multiscale modeling

of materials’, Int. J. Plast., 2010, 26, (8), 1248–1275.

42. G. B. Olson: ‘Computational design of hierarchically structured

materials’, Science, 1997, 277, (5330), 1237–1242.

43. S. R. Kalidindi, A. Bhattacharyya and R. D. Doherty: ‘Detailed

analyses of grain-scale plastic deformation in columnar

polycrystalline aluminium using orientation image mapping and

crystal plasticity models’, Proc. R. Soc. Lond. A, 2004, 460,

(2047), 1935–1956.

44. S. H. Choi, J. K. Kim, B. J. Kim and Y. B. Park: ‘The effect of

grain size distribution on the shape of flow stress curves of Mg-

3Al-1Zn under uniaxial compression’, Mater. Sci. Eng. A, 2008,

488, (1–2), 458–467.

45. E. El-Danaf, S. R. Kalidindi and R. D. Doherty: ‘Influence of

grain size and stacking-fault energy on deformation twinning in

fcc metals’, Metall. Mater. Trans. A, 1999, 30, (5), 1223–1233.

46. D. M. Dimiduk, P. M. Hazzledine, T. A. Parthasarathy, M. G.

Mendiratta and S. Seshagiri: ‘The role of grain size and selected

microstructural parameters in strengthening fully lamellar TiAl

alloys’, Metall. Mater. Trans. A Phys. Metall. Mater. Sci., 1998,

29, (1), 37–47.

47. D. J. Morrison and J. C. Moosbrugger: ‘Effects of grain size on

cyclic plasticity and fatigue crack initiation in nickel’, Int. J.

Fatigue, 1997, 20, S51–S59.

48. A. Lasalmonie and J. L. Strudel: ‘Influence of grain size on the

mechanical behaviour of some high strength materials’, J. Mater.

Sci., 1986, 21, (6), 1837–1852.

49. D. Hull: ‘Effect of grain size and temperature on slip, twinning

and fracture in 3% silicon iron’, Acta Metall., 1961, 9, (3), 191–

204.

50. D. L. McDowell, J. H. Panchal, H.-J. Choi, C. C. Seepersad, J. K.

Allen and F. Mistree: ‘Integrated design of multiscale,

multifunctional materials and products’; 2009, Burlington, Elsevier.

51. D. L. McDowell: ‘A perspective on trends in multiscale plasticity’,

Int. J. Plast., 2010, 26, (9), 1280–1309. [Special issue in honor of

David L. McDowell].

52. G. B. Olson: ‘Pathways of discovery: designing a new material

world’, Science, 2000, 288, (5468), 993–998.

53. B. L. Adams, S. R. Kalidindi and D. Fullwood: ‘Microstructure

sensitive design for performance optimization’; 2012, Waltham,

Butterworth-Heinemann.

54. J. H. Panchal, S. R. Kalidindi and D. L. McDowell: ‘Key

computational modeling issues in integrated computational

materials engineering’, Comput. Aided Des., 2013, 45, (1), 4–25.

55. NSTC: A national strategic plan for advanced manufacturing,

National Science and Technology Council, Executive Office of the

President, February 2012.

56. Office of Science and Technology Policy: ‘Obama administration

unveils ‘Big Data’ initiative: announces $200 million in new R&D

investments’, Office of Science and Technology Policy,

Washington, DC, 20502; 2012.

57. J. Allison: ‘Integrated computational materials engineering: A

perspective on progress and future steps’, J. Miner. Met. Mater.

Soc., 2011, 63, (4), 15–18.

58. G. J. Schmitz and U. Prahl: ‘ICMEg – the Integrated

Computational Materials Engineering expert group – a new

European coordination action’, Integr. Mater. Manuf. Innovation,

2014, 3, (1), 2.

59. G. J. Schmitz and U. Prahl: ‘Integrative computational materials

engineering: concepts and applications of a modular simulation

platform’; 2012, Chichester, John Wiley & Sons.

60. The European Materials Modelling Council: [cited 2014 Aug 12],

Available at: http://emmc.info/index.html

61. G. Linden, B. Smith and J. York: ‘Amazon.com recommendations:

item-to-item collaborative filtering’, IEEE Internet Comput.,

2003, 7, (1), 76–80.

62. I. Li, A. Dey and J. Forlizzi: ‘A stage-based model of personal

informatics systems’, Proc. SIGCHI Conf. on ‘Human Factors in

Computing Systems’, 557–566; 2010, New York, NY, USA, ACM.

63. M. Hohman, K. Gregory, K. Chibale, P. J. Smith, S. Ekins and B.

Bunin: ‘Novel web-based tools combining chemistry informatics,

biology and social networks for drug discovery’, Drug Discov.

Today, 2009, 14, (5), 261–270.

64. J. M. Tien: ‘Toward a decision informatics paradigm: a real-time,

information-based approach to decision making’, IEEE Trans.

Syst. Man Cybern. Part C Appl. Rev., 2003, 33, (1), 102–113.

65. T. T. Wan: ‘Healthcare informatics research: from data to

evidence-based management’, J. Med. Syst., 2006, 30, (1), 3–7.

66. K. Rajan: ‘Materials informatics’, Mater. Today, 2005, 8, (10),

38–45.

67. S. R. Kalidindi, S. R. Niezgoda and A. A. Salem: ‘Microstructure

informatics using higher-order statistics and efficient data-mining

protocols’, JOM, 2011, 63, (4), 34–41.

68. D. Gorse and R. Lahana: ‘Functional diversity of compound

libraries’, Curr. Opin. Chem. Biol., 2000, 4, (3), 287–294.

69. S. Curtarolo, D. Morgan, K. Persson, J. Rodgers and G. Ceder:

‘Predicting crystal structures with data mining of quantum

calculations’, Phys. Rev. Lett., 2003, 91, (13), 135503.

70. G. Ceder: ‘Predicting properties from scratch’, Science, 1998, 280,

(5366), 1099–1100.

71. C. M. Breneman, L. C. Brinson, L. S. Schadler, B.

Natarajan, M. Krein, K. Wu, L. Morkowchuk, Y. Li, H. Deng

and H. Xu: ‘Stalking the materials genome: a data-driven

approach to the virtual design of nanostructured polymers’,

Adv. Funct. Mater., 2013, 23, (46), 5746–5752.

72. M. Krein, B. Natarajan, L. S. Schadler, L. C. Brinson, H. Deng,

D. Gai, Y. Li and C. M. Breneman: ‘Development of materials

informatics tools and infrastructure to enable high throughput

materials design’, MRS Online Proc. Libr., 2012, 1425,

mrsf11-1425-uu06-05, doi:10.1557/opl.2012.57.

73. D. Cebon and M. F. Ashby: ‘Engineering materials informatics’,

MRS Bull., 2006, 31, (12), 1004–1012.

74. S. R. Kalidindi: ‘Microstructure informatics’, in ‘Informatics for

materials science and engineering: data-driven discovery for

accelerated experimentation and application’, (ed. K. Rajan),

443–466; 2013, Butterworth-Heinemann.

75. L. Peurrung, K. Ferris and T. Osman: ‘The materials informatics

workshop: Theory and application’, JOM, 2007, 59, (3), 50–50.

76. Z.-K. Liu, L.-Q. Chen and K. Rajan: ‘Linking length scales via

materials informatics’, JOM, 2006, 58, (11), 42–50.

77. ASTM International: ‘E112 – 10: Standard test methods for

determining average grain size’; 2010, West Conshohocken, PA,

USA, ASTM International.

78. ASTM International: ‘E1181 – 02: Standard test methods for

characterizing duplex grain sizes’; 2008, West Conshohocken, PA,

USA, ASTM International.

79. J. B. Shaffer, M. Knezevic and S. R. Kalidindi: ‘Building texture

evolution networks for deformation processing of polycrystalline

fcc metals using spectral approaches: Applications to process

design for targeted performance’, Int. J. Plast., 2010, 26, (8),

1183–1194.

Kalidindi Accelerated materials development through data science and cyberinfrastructure

166 International Materials Reviews 2015 VOL 60 NO 3

80. B. L. Adams, X. Gao and S. R. Kalidindi: ‘Finite approximations

to the second-order properties closure in single phase

polycrystals’, Acta Mater., 2005, 53, (13), 3563–3577.

81. D. T. Fullwood, S. R. Niezgoda and S. R. Kalidindi:

‘Microstructure reconstructions from 2-point statistics using

phase-recovery algorithms’, Acta Mater., 2008, 56, (5), 942–948.

82. S. R. Niezgoda, D. T. Fullwood and S. R. Kalidindi: ‘Delineation

of the space of 2-point correlations in a composite material

system’, Acta Mater., 2008, 56, (18), 5285–5292.

83. S. R. Niezgoda and S. R. Kalidindi: ‘Applications of the phase-

coded generalized Hough transform to feature detection, analysis,

and segmentation of digital microstructures’, Comput. Mater.

Continua, 2009, 14, (2), 79–97.

84. S. R. Niezgoda, D. M. Turner, D. T. Fullwood and S. R.

Kalidindi: ‘Optimized structure based representative volume

element sets reflecting the ensemble-averaged 2-point statistics’,

Acta Mater., 2010, 58, (13), 4432–4445.

85. D. T. Fullwood, S. R. Kalidindi, S. R. Niezgoda, A. Fast and N.

Hampson: ‘Gradient-based microstructure reconstructions from

distributions using fast Fourier transforms’, Mater. Sci. Eng. A

Struct. Mater. Prop. Microstruct. Process., 2008, 494, (1–2), 68–

72.

86. B. Bochenek and R. Pyrz: ‘Reconstruction of random

microstructures: a stochastic optimization problem’, Comput. Mater.

Sci., 2004, 31, (1–2), 93–111.

87. A. P. Roberts: ‘Statistical reconstruction of three-dimensional

porous media from two-dimensional images’, Phys. Rev. E, 1997,

56, (3), 3203.

88. S. R. Niezgoda, A. K. Kanjarla and S. R. Kalidindi: ‘Novel

microstructure quantification framework for databasing,

visualization, and analysis of microstructure data’, Integr. Mater.

Manuf. Innovation, 2013, 2, 3.

89. S. R. Kalidindi: ‘Computationally-efficient fully-coupled

multi-scale modeling of materials phenomena using calibrated

localization linkages’, ISRN Mater. Sci., 2012, doi:10.5402/2012/305692.

90. T. Fast and S. R. Kalidindi: ‘Formulation and calibration of

higher-order elastic localization relationships using the MKS

approach’, Acta Mater., 2011, 59, 4595–4605.

91. T. Fast, S. R. Niezgoda and S. R. Kalidindi: ‘A new framework

for computationally efficient structure–structure evolution

linkages to facilitate high-fidelity scale bridging in multi-scale

materials models’, Acta Mater., 2011, 59, (2), 699–707.

92. G. Landi, S. R. Niezgoda and S. R. Kalidindi: ‘Multi-scale

modeling of elastic response of three-dimensional voxel-based

microstructure datasets using novel DFT-based knowledge

systems’, Acta Mater., 2010, 58, (7), 2716–2725.

93. S. Torquato: ‘Random heterogeneous materials’; 2002, New

York, Springer-Verlag.

94. W. F. Brown: ‘Solid mixture permittivities’, J. Chem. Phys., 1955,

23, (8), 1514–1517.

95. W. H. Press, S. A. Teukolsky, W. T. Vetterling and B. P.

Flannery: ‘Numerical recipes: the art of scientific computing’,

3rd edn; 2007, Cambridge University Press.

96. S. R. Niezgoda, Y. C. Yabansu and S. R. Kalidindi:

‘Understanding and visualizing microstructure and

microstructure variance as a stochastic process’, Acta Mater., 2011, 59,

6387–6400.

97. C. Przybyla, B. L. Adams and M. Miles: ‘A method for

determining property variance in polycrystalline materials’,

NUMIFORM; 2004, Ohio State University, Columbus, OH,

AIP Conference Proceedings.

98. C. P. Przybyla, R. Prasannavenkatesan, N. Salajegheh and D. L.

McDowell: ‘Microstructure-sensitive modeling of high cycle

fatigue’, Int. J. Fatigue, 2010, 32, (3), 512–525. [Special issue on

Fatigue of Materials: Competing Failure Modes and Variability in

Fatigue Life].

99. C. P. Przybyla and D. L. McDowell: ‘Microstructure-sensitive

extreme value probabilities for high cycle fatigue of Ni-base

superalloy IN100’, Int. J. Plast., 2010, 26, (3), 372–394.

100. X. Gao, C. P. Przybyla and B. L. Adams: ‘Methodology for

recovering and analyzing two-point pair correlation functions in

polycrystalline materials’, Metall. Mater. Trans. A, 2006, 37, (8),

2379–2387.

101. E. Kröner: ‘Statistical modelling’, in ‘Modelling small

deformations of polycrystals’, (eds. J. Gittus and J. Zarka), 229–291; 1986,

London, Elsevier Science Publishers.

102. G. W. Milton: ‘The theory of composites’, 1st edn, Cambridge

Monographs on Applied and Computational Mathematics, Vol. 6;

2002, Cambridge University Press.

103. D. T. Fullwood, B. L. Adams and S. R. Kalidindi: ‘A strong

contrast homogenization formulation for multi-phase anisotropic

materials’, J. Mech. Phys. Solids, 2008, 56, (6), 2287–2297.

104. B. L. Adams, H. Garmestani and G. Saheli: ‘Microstructure

design of a two phase composite using two-point correlation

functions’, J. Comput. Aided Mater. Des., 2004, 11, 103–115.

105. G. Saheli, H. Garmestani and B. L. Adams: ‘Microstructure

design of a two phase composite using two-point correlation

functions’, J. Comput. Aided Mater. Des., 2004, 11, (2–3), 103–

115.

106. H. Garmestani, S. Lin, B. L. Adams and S. Ahzi: ‘Statistical

continuum theory for large plastic deformation of polycrystalline

materials’, J. Mech. Phys. Solids, 2001, 49, (3), 589–607.

107. B. L. Adams: ‘Use of microstructural statistics in predicting

polycrystalline material properties’, Metall. Mater. Trans., 1999,

30A, 969.

108. B. L. Adams and T. Olson: ‘The mesostructure–properties linkage

in polycrystals’, Prog. Mater. Sci., 1998, 43, (1), 1–87.

109. M. J. Beran, T. A. Mason, B. L. Adams and T. Olson: ‘Bounding

elastic constants of an orthotropic polycrystal using

measurements of the microstructure’, J. Mech. Phys. Solids, 1996, 44, (9),

1543–1563.

110. N. Halko, P.-G. Martinsson, Y. Shkolnisky and M. Tygert: ‘An

algorithm for the principal component analysis of large data sets’,

SIAM J. Sci. Comput., 2011, 33, (5), 2580–2594.

111. V. Rokhlin, A. Szlam and M. Tygert: ‘A randomized algorithm

for principal component analysis’, SIAM J. Matrix Anal. Appl.,

2009, 31, (3), 1100–1124.

112. I. T. Jolliffe: ‘Principal component analysis: a beginner’s guide – I.

Introduction and application.’, Weather, 1990, 45, (10), 375–382.

113. C. Suh, A. Rajagopalan, X. Li and K. Rajan: ‘The application of

principal component analysis to materials science data’, Data Sci.

J., 2002, 1, 19–26.

114. A. Cecen, T. Fast, E. C. Kumbur and S. R. Kalidindi: ‘A

data-driven approach to establishing microstructure–property

relationships in porous transport layers of polymer electrolyte fuel cells’,

J. Power Sources, 2014, 245, 144–153.

115. X. Dong, D. L. McDowell, S. R. Kalidindi and K. I. Jacob:

‘Dependence of mechanical properties on crystal orientation of

semi-crystalline polyethylene structures’, Polymer, 2014, 55, (16),

4248–4257.

116. B. L. Adams, S. I. Wright and K. Kunze: ‘Orientation imaging:

the emergence of a new microscopy’, Metall. Trans. A, 1993, 24A,

(4), 819–831.

117. S. Pathak, J. Michler, K. Wasmer and S. R. Kalidindi: ‘Studying

grain boundary regions in polycrystalline materials using spherical

nano-indentation and orientation imaging microscopy’, J. Mater.

Sci., 2012, 47, 815–823.

118. S. Pathak, D. Stojakovic and S. R. Kalidindi: ‘Measurement of

the local mechanical properties in polycrystalline samples using

spherical nanoindentation and orientation imaging microscopy’,

Acta Mater., 2009, 57, (10), 3020–3028.

119. S. R. Kalidindi and S. J. Vachhani: ‘Mechanical characterization

of grain boundaries using nanoindentation’, Curr. Opin. Solid

State Mater. Sci., 2014, 18, 196–204.

120. T. A. Mason and B. L. Adams: ‘Use of microstructural statistics

in predicting polycrystalline material properties’, Metall. Mater.

Trans. A, 1999, 30, (4), 969–979.

121. M. D. Uchic, M. A. Groeber and A. D. Rollett: ‘Automated serial

sectioning methods for rapid collection of 3-D microstructure

data’, JOM, 2011, 63, (3), 25–29.

122. W. Xu, M. Ferry, N. Mateescu, J. M. Cairney and F. J.

Humphreys: ‘Techniques for generating 3-D EBSD

microstructures by FIB tomography’, Mater. Charact., 2007, 58, (10), 961–

967.

123. S. Van Boxel, S. Schmidt, W. Ludwig, Y. B. Zhang, D. J. Jensen

and W. Pantleon: ‘Direct observation of grain boundary

migration during recrystallization within the bulk of a moderately

deformed aluminium single crystal’, Mater. Trans., 2014, 55, (01),

128–136.

124. B. L. Adams and D. P. Field: ‘Measurement and representation of

grain-boundary texture’, Metall. Trans. A, 1992, 23A, (9 pt 2),

2501–2513.

125. B. L. Adams: ‘Orientation imaging microscopy: application to the

measurement of grain boundary structure’, Mater. Sci. Eng. A,

1993, 166, (1–2), 59–66.

126. A. A. Gusev: ‘Representative volume element size for elastic

composites: a numerical study’, J. Mech. Phys. Solids, 1997, 45,

(9), 1449–1459.

127. S. Nemat-Nasser and M. Hori: ‘Micromechanics: overall proper-

ties of heterogeneous materials’, 2nd edn; 1999, Amsterdam,

Elsevier.

128. U. Hornung: ‘Homogenization and porous media,

Interdisciplinary Applied Mathematics Series’, Vol. 6; 1997,

Berlin, Springer.

129. A. Cherkaev: ‘Variational methods for structural optimization,

Applied Mathematical Sciences’, Vol. 140; 1991, New York,

Springer.

130. X. L. Chen and Y. J. Liu: ‘Square representative volume elements

for evaluating the effective material properties of carbon

nanotube-based composites’, Comput. Mater. Sci., 2004, 29, (1),

1–11.

131. W. J. Drugan and J. R. Willis: ‘A micromechanics-based nonlocal

constitutive equation and estimates of representative volume

element size for elastic composites’, J. Mech. Phys. Solids, 1996,

44, (4), 497–524.

132. T. Kanit, S. Forest, I. Galliet, V. Mounoury and D.

Jeulin: ‘Determination of the size of the representative volume

element for random composites: statistical and numerical

approach’, Int. J. Solids Struct., 2003, 40, (13–14), 3647–3679.

133. Z. Shan and A. M. Gokhale: ‘Representative volume element for

non-uniform micro-structure’, Comput. Mater. Sci., 2002, 24, (3),

361–379.

134. K. Sab: ‘On the homogenization and the simulation of random

materials’, Eur. J. Mech. A. Solids, 1992, 11, (5), 505–515.

135. M. Ostoja-Starzewski: ‘Microstructural randomness and scaling

in mechanics of materials’; 2008, Boca Raton, Chapman & Hall/

CRC.

136. E. Kröner: ‘Berechnung der elastischen Konstanten des

Vielkristalls aus den Konstanten des Einkristalls’, Zeitschrift für

Physik A Hadrons Nuclei, 1958, 151, (4), 504–518.

137. J. R. Willis: ‘Variational and related methods for the overall

properties of composite materials’, Adv. Appl. Mech., 1981, 21, 2–

78.

138. J. J. McCoy: ‘Macroscopic response of continua with random

microstructures’, in ‘Mechanics today’, (ed. S. Nemat-Nasser),

Vol. 6, 1–40; 1981, Oxford, Pergamon Press.

139. R. Hill: ‘Elastic properties of reinforced solids: some theoretical

principles’, J. Mech. Phys. Solids, 1963, 11, 357–372.

140. D. Jeulin and M. Ostoja-Starzewski: ‘Mechanics of random and

multiscale microstructures’; 2001, Wien, New York, Springer.

141. D. L. McDowell, S. Ghosh and S. R. Kalidindi: ‘Representation

and computational structure–property relations of random

media’, JOM, 2011, 63, (3), 45–51.

142. N. J. Petch: ‘Cleavage strength of polycrystals’, J. Iron Steel Inst.,

1953, 174, (Part 1), 25–28.

143. E. O. Hall: ‘The deformation and ageing of mild steel III.

Discussion of results’, Proc. Phys. Soc. Sect. B, 1951, 64, 747–753.

144. A. Gupta et al.: ‘Structure–property linkages for non-metallic

inclusions/steel composite system using a data science approach’,

Acta Mater., 2014, in preparation.

145. H. F. Al-Harbi and S. R. Kalidindi: ‘Crystal plasticity finite

element simulations using a database of discrete Fourier

transforms’, Int. J. Plast., 2014, doi:10.1016/j.ijplas.2014.04.006.

146. M. Knezevic, H. F. Al-Harbi and S. R. Kalidindi: ‘Crystal

plasticity simulations using discrete Fourier transforms’, Acta

Mater., 2009, 57, (6), 1777–1784.

147. M. Knezevic, S. R. Kalidindi and D. Fullwood: ‘Computationally

efficient database and spectral interpolation for fully plastic

Taylor-type crystal plasticity calculations of face-centered cubic

polycrystals’, Int. J. Plast., 2008, 24, (7), 1264–1276.

148. S. R. Kalidindi, H. K. Duvvuru and M. Knezevic: ‘Spectral

calibration of crystal plasticity models’, Acta Mater., 2006, 54, (7),

1795–1804.

149. H. F. Al-Harbi, M. Knezevic and S. R. Kalidindi: ‘Spectral

approaches for the fast computation of yield surfaces and first-

order plastic property closures for polycrystalline materials with

cubic-triclinic textures’, Comput. Mater. Continua, 2010, 15, (2),

153–172.

150. F. Feyel: ‘A multilevel finite element method (FE2) to describe the

response of highly non-linear structures using generalized

continua’, Comput. Methods Appl. Mech. Eng., 2003, 192, (28),

3233–3244.

151. H. F. Al-Harbi, G. Landi and S. R. Kalidindi: ‘Multi-scale

modeling of the elastic response of a structural component made

from a composite material using the materials knowledge system’,

Modell. Simul. Mater. Sci. Eng., 2012, 20, (5), 055001.

152. G. Landi and S. R. Kalidindi: ‘Thermo-elastic localization

relationships for multi-phase composites’, Comput. Mater.

Continua, 2010, 16, (3), 273–293.

153. S. R. Kalidindi, S. R. Niezgoda, G. Landi, S. Vachhani and T.

Fast: ‘A novel framework for building materials knowledge

systems’, Comput. Mater. Continua, 2010, 17, (2), 103–125.

154. G. Landi, S. R. Niezgoda and S. R. Kalidindi: ‘Multi-scale

modeling of elastic response of three-dimensional voxel-based

microstructure datasets using novel DFT-based knowledge

systems’, Acta Mater., 2010, 58, (7), 2716–2725.

155. Y. C. Yabansu, D. K. Patel and S. R. Kalidindi: ‘Calibrated

localization relationships for elastic response of polycrystalline

aggregates’, Acta Mater., 2014, 81, 151–160.

156. E. Kröner: ‘Bounds for effective elastic moduli of disordered

materials’, J. Mech. Phys. Solids, 1977, 25, (2), 137–155.

157. M. Binci, D. Fullwood and S. R. Kalidindi: ‘A new spectral

framework for establishing localization relationships for elastic

behavior of composites and their calibration to finite-element

models’, Acta Mater., 2008, 56, (10), 2272–2282.

158. S. R. Kalidindi, M. Binci, D. Fullwood and B. L. Adams: ‘Elastic

properties closures using second-order homogenization theories:

Case studies in composites of two isotropic constituents’, Acta

Mater., 2006, 54, (11), 3117–3126.

159. A. V. Oppenheim, R. W. Schafer and J. R. Buck: ‘Discrete time

signal processing’; 1999, Englewood Cliffs, NJ, Prentice Hall.

160. B. S. Anglin, R. A. Lebensohn and A. D. Rollett: ‘Validation of a

numerical method based on Fast Fourier Transforms for

heterogeneous thermoelastic materials by comparison with

analytical solutions’, Comput. Mater. Sci., 2014, 87, 209–217.

161. R. A. Lebensohn, A. K. Kanjarla and P. Eisenlohr: ‘An elasto-

viscoplastic formulation based on fast Fourier transforms for the

prediction of micromechanical fields in polycrystalline materials’,

Int. J. Plast., 2012, 32–33, 59–69.

162. H. Moulinec and P. Suquet: ‘A numerical method for computing

the overall response of nonlinear composites with complex

microstructure’, Comput. Methods Appl. Mech. Eng., 1998, 157,

(1–2), 69–94.

163. H.-J. Bunge: ‘Texture analysis in materials science. Mathematical

Methods’; 1993, Göttingen, Cuvillier Verlag.

164. H. J. Choi, J. K. Allen, D. Rosen, D. L. McDowell and F. Mistree:

‘An inductive design exploration method for robust multiscale

materials design’, J. Mech. Des., 2008, 130, (3), 031402.
