

Advanced Engineering Informatics 24 (2010) 243–250

Contents lists available at ScienceDirect

Advanced Engineering Informatics

journal homepage: www.elsevier.com/locate/aei

Putting the crowd to work in a knowledge-based factory

J.R. Corney a,*, C. Torres-Sánchez a, A.P. Jagadeesan a, X.T. Yan a, W.C. Regli b, H. Medellin c

a University of Strathclyde, Department of Design, Manufacture and Engineering Management, 75 Montrose St., Glasgow G1 1XJ, UK
b Drexel University, Department of Computer Science, 3201 Arch Street, Philadelphia, PA 19104, USA
c Area Mecanica y Electrica, Universidad Autonoma de San Luis Potosi, San Luis Potosi, Mexico

Article info

Article history: Received 13 May 2010; Received in revised form 22 May 2010; Accepted 26 May 2010

Keywords: Crowdsourcing; Machine learning; Geometric reasoning; Canonical views; Shape similarity; 2D nesting

1474-0346/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.aei.2010.05.011

* Corresponding author. Tel.: +44 (0) 141 548 2254; fax: +44 (0) 141 548 4870. E-mail addresses: [email protected] (J.R. Corney), [email protected] (C. Torres-Sánchez), [email protected] (A.P. Jagadeesan), [email protected] (X.T. Yan), [email protected] (W.C. Regli), [email protected] (H. Medellin).

Abstract

Although researchers have developed numerous computational approaches to reasoning and knowledge representation, their implementations are always limited to specific applications (e.g. assembly planning, fault diagnosis or production scheduling) for which bespoke knowledge bases or algorithms have been created.

However, "cloud computing" has made irrelevant both the physical location and the internal processes used by machine intelligence. In other words, the Internet encourages functional processes to be treated as 'black boxes' whose users need only be concerned with posing the right question and interpreting the response. The system asking the questions does not need to know how answers are generated, only that they are available in an appropriate time frame.

This paper proposes that Crowdsourcing could provide on-line, 'black-box' reasoning capabilities that could far exceed those of current AI technologies (i.e. genetic algorithms, neural nets, case-based reasoning) in terms of flexibility and scope. This paper describes how Crowdsourcing has been deployed in three different reasoning scenarios to carry out industrial tasks that involve significant amounts of tacit (i.e. unformalised) knowledge. The first study reports the application of Crowdsourcing to identify canonical views of 3D CAD models. The qualitative results suggest that the anonymous Internet workforce has a good comprehension of 3D geometry. Having established this basic competence, the second experiment assesses the Crowd's ability to judge the similarity of 3D components. Comparison of the results with published benchmarks shows a high degree of correspondence. Lastly, the performance of the Internet labourers is quantified in a 2D nesting task, where it is found to be superior to reported computational algorithms. In all these cases results were returned within a couple of hours, and the paper concludes that there is potential for broad application of Crowdsourcing to geometric problem solving in CAD/CAM.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

We present Crowdsourcing as a tool to facilitate machine intelligence in a knowledge-based factory. Our approach acknowledges the outstanding capacity of the human brain, but rather than trying to understand, or mimic, the complexity of the cognitive processes, we propose that human intelligence is employed directly at critical steps where machine intelligence cannot match human performance.

A direct consequence of this methodology is that rather than systems requiring, say, rule-bases, inference engines, fitness functions or databases of "case" examples, the focus becomes the construction of queries for the crowd and the development of statistical methods for the aggregation of their responses. However



although this scenario (i.e. where Crowdsourcing provides the reasoning functions traditionally carried out by AI subsystems) will change the nature of individual software components, the overall architecture of problem solving systems will, in many cases, be unchanged. But Crowdsourcing offers the opportunity to do more than simply provide a neat on-line interface to human reasoning and judgement. It offers the opportunity to discover effective problem solving strategies. By definition, "machine intelligence" systems are created by programmers who have encoded problem solving strategies in the software. So while emergent behaviours are occasionally observed, we would argue that for the most part AI systems solve problems exactly as they have been designed to. And this is both their limitation and their strength.

Crowdsourcing, in contrast, solves problems, cleans up data, classifies content, selects options, creates new content, and performs many other tasks using strategies that appear 'opaque' to the user. But because of the digital nature of the activity, there is an opportunity to record, observe and assess the problem solving strategies of many individuals in a way that would be extremely difficult to


do in any other scenario. To illustrate this, Section 3 of the paper presents the early results of Crowdsourced part nesting, where not only do the results improve upon those generated by commercial CAM systems but they could, potentially, also provide insights into how automated systems could be improved.

Finally, crucial to any industrial use of Crowdsourced "human intelligence" is the constant availability of a sufficient quantity, and quality, of on-line workers. Regardless of the task undertaken, our results suggest that the Internet is now sufficiently large and well distributed globally that commercial Crowdsourcing sites can easily provide results in only a few hours on a 24/7 basis.

This paper is divided into five sections: this introduction continues with a brief overview of machine reasoning in the knowledge-based factory and a description of the emerging technology of Internet Crowdsourcing. The next section (Section 2) describes how Crowdsourcing has been used to provide reasoning for a 3D content based retrieval system whose overall architecture is analogous to "traditional" AI applications for machine vision or speech recognition. The following section (Section 3) illustrates the opportunity for Crowdsourcing to contribute insights into problem solving strategies (in this case 2D shape nesting) as well as results. Finally, a discussion (Section 4) summarises the potential applications and the challenges of Crowdsourcing industrial tasks, before conclusions are drawn in Section 5.

1.1. Machine cognition in the knowledge-based factory

Machine intelligence has been used to support industrial processes that range from computer vision for robotics to creative design. Duffy [1] presents six main machine learning techniques: agent-based learning, analogical reasoning, induction methods, genetic algorithms, knowledge compilation, and neural networks. A common element of all these methods is the need to "inform" or "teach" the system using databases of examples.

For example, analogical reasoning (i.e. finding solutions to problems based on retrieving knowledge from previous experiences), induction methods (i.e. where knowledge is generated by the amalgamation of similar data and its analysis to obtain a classification), genetic algorithms (i.e. where new concepts are generated by the cross-over or 'mutation' of previous ones), knowledge compilation (i.e. simplifying knowledge into a more fundamental form so it can be reused in other situations), and neural networks (where a machine executes a learning mechanism similar to that of a human brain by training on example data) are all strategies that could also be employed in conjunction with Crowdsourced databases of examples to create a true 'knowledge-based' factory.

Currently, when embedded in an application, AI technologies (such as those listed above) are rarely able to work on raw data (e.g. documents, audio or image files, etc.). More typically the data is analysed to identify features, or characteristics, which form the 'language' of the reasoning system (Fig. 1).

In a machine vision system, the features might be "edges" identified in a JPEG image; in speech recognition software, signal processing is used to identify phonetic patterns; and in 3D CAM, depressions are identified on geometric CAD models prior to process planning. This model is unchanged by Crowdsourcing technology, and the following sections demonstrate how micro-outsourcing to the Internet can provide the functionality for both the 'Feature Recognition' and 'Reasoning' stages that are characteristic of so many problem solving architectures.
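The two-stage structure described above can be sketched as a chain of small functions. The function names and the toy word-count "features" below are illustrative assumptions, not the paper's implementation; the point is only that the reasoning stage operates on the feature language, never on the raw data.

```python
# Illustrative sketch of the classic AI pipeline of Fig. 1:
# Data -> Feature Recognition -> Reasoning -> Result.
# The feature extractor and the reasoning rule are toy assumptions.

def recognise_features(data: str) -> dict:
    """Reduce raw data to the 'language' of the reasoning system."""
    return {"length": len(data), "words": len(data.split())}

def reason(features: dict) -> str:
    """Apply a (here trivial) reasoning step over the features."""
    return "long" if features["words"] > 3 else "short"

def pipeline(data: str) -> str:
    return reason(recognise_features(data))

print(pipeline("a simple five word example"))  # -> "long"
```

In the Crowdsourced variant discussed in the following sections, either stage can be replaced by a call out to human workers without changing this overall shape.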

Data → Feature Recognition → Reasoning → Result

Fig. 1. Classic AI architecture.

1.2. Crowdsourcing technology

The term 'crowdsourcing' was coined by Jeff Howe in 2006 as "the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call" [2]. These activities are executed by people who do not necessarily know each other, and who interact with the company, the 'requester', via virtual tools and an internet connection. They become 'the workers': they can access tasks, execute them, upload the results and receive various forms of payment using any web browser. This is a labour market open 24/7, with a diverse workforce available to perform tasks quickly and cheaply.

The crowdsourcing platform used in the investigations reported here was Amazon's mTurk (www.mturk.com), which was selected because of the large number of workers available, although it should be noted that there are several alternatives (e.g. HumanGrid: http://www.humangrid.de and Crowdflower: crowdflower.com).

As shown in Fig. 2, the 'requesters' both design and post tasks for the Crowd to work on. In mTurk, tasks given to the 'workers' are called 'HITs' (Human Intelligence Tasks). Requesters can test workers before allowing them to accept tasks, and so establish a baseline performance level for prospective workers. Requesters can also accept, or reject, the results submitted by the workers, and this decision impacts on the worker's reputation within the mTurk system. Payments for completed tasks can be redeemed as 'Amazon.com' gift certificates or alternatively transferred to a worker's bank account. Details of the mTurk interface design, how an API is used to create and post HITs, and a description of the workers' characteristics are beyond the scope of this paper but can be found (along with further details of the experimental results) in [3,4]. With each result submitted by a worker, the requester receives an answer that includes various information about how the task was processed. One element of this data is a unique "workerID" allowing the requester to distinguish between individual workers. Using this "workerID" it is possible to analyse how many different HITs each worker completed.
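The per-worker analysis mentioned above amounts to a simple tally over the returned assignment records. The record layout below is a simplifying assumption for illustration, not mTurk's actual response schema; only the presence of a unique workerID per submission is taken from the text.

```python
from collections import Counter

# Each submitted assignment carries a unique workerID; tallying these
# shows how many HITs each worker completed. The field names here
# (workerID, hitID, answer) are assumed for illustration.
results = [
    {"workerID": "W1", "hitID": "H1", "answer": "view 9"},
    {"workerID": "W2", "hitID": "H1", "answer": "view 3"},
    {"workerID": "W1", "hitID": "H2", "answer": "view 7"},
]

hits_per_worker = Counter(r["workerID"] for r in results)
print(hits_per_worker)  # Counter({'W1': 2, 'W2': 1})
```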

A definitive classification of Crowdsourcing tasks has not yet been established; however, Corney et al. [5] suggest three possible categorisations based upon: the nature of the task (creation, evaluation and organisation tasks), the nature of the crowd ('expert', 'most people' and 'vast majority') and the nature of the payment (voluntary contribution, rewarded at a flat rate and rewarded with a prize). Similarly, Crowdsourcing practitioners such as Chaordix (from Cambrian House [6]) describe Crowdsourcing models as a Contest (i.e. individuals submit ideas and the winner is selected by the company, 'the requester'), a Collaboration (i.e. individuals submit their ideas or results, the crowd evolves the ideas and picks a winner), and Moderated (i.e. individuals submit their ideas, the crowd evolves those ideas, a panel set by 'the requesters' selects the finalists and the crowd votes on a winner). In the last few years academics across many different disciplines have started reporting the use of Internet Crowdsourcing to support a range of research projects, e.g. social network motivators [7], relevance of evaluations and queries [8,9], and accuracy in judgement and evaluations [10]. Despite this activity few industrial applications of Crowdsourcing have been reported, and this gap in the literature motivated the authors to undertake the studies into 3D search and 2D part nesting reported in the following sections.

2. 3D Search case study

At present, 3D models (e.g. engineering drawings) are indexed by alphanumeric 'part-numbers' with a format unique to each individual organisation. Although this indexing system works well

Fig. 2. Schematic of Amazon's mTurk system for Crowdsourcing tasks.

3D models → Canonical Views → 3D Similarity Identification → Search results

Fig. 3. Schematic of the Crowdsourcing technique for shape search.


in the context of on-going maintenance and development of individual parts, it offers little scope for 'data-mining' (i.e. exploration) of an organisation's inventory of designs. In addition to the sourcing of parts, the application of a 3D similarity matching algorithm to large collections of parts would allow many other efficiency gains, such as the following (described in [11,12]):

• Cost estimation for machined parts. For some manufacturing domains, such as rapid prototyping, reasonably accurate estimates of cost can be achieved by estimating the volume or weight of the part. But for other manufacturing domains, such as machining, cost estimation depends on the geometric details of the object, and automated procedures are not available for accurate cost estimation. However, the cost of manufacturing a new part can be estimated by finding previously manufactured parts that are similar in shape to the new part. If a sufficiently similar part can be found in the database of previously manufactured objects, then the cost of the new part can be estimated by suitably modifying the actual cost of the previously manufactured similar part.

• Part family formation. In many manufacturing domains, such as sheet metal bending, machine tools can be set up to produce more than one type of part without requiring a setup or tool change [13]. However, parts need to be shape compatible in order for them to share common tools and setups. Therefore, in order to find common tools and setups, geometrically similar parts need to be grouped into families. Shared tools and setups can be used to manufacture objects in the same family, resulting in significant cost savings.

• Reduction in part proliferation by reusing previously designed parts. Reusing stored design/manufacturing information would result in a faster and more efficient design process. While designing a new part the designer can refer to existing designs and utilize previously used components.
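The cost-estimation idea in the first bullet reduces to a nearest-neighbour lookup followed by a scaling of the retrieved cost. The sketch below is a hypothetical illustration: the similarity score, the volume-ratio scaling rule, and all part data are assumptions, not the paper's method.

```python
# Hypothetical sketch of similarity-based cost estimation: retrieve
# the most similar previously manufactured part and scale its actual
# cost. The similarity measure and scaling rule are toy assumptions.

def estimate_cost(new_part, database, similarity):
    """Return the archived cost of the best match, scaled by volume ratio."""
    best = max(database, key=lambda old: similarity(new_part, old))
    return best["cost"] * (new_part["volume"] / best["volume"])

# Toy similarity: negated absolute volume difference (larger is closer).
def similarity(a, b):
    return -abs(a["volume"] - b["volume"])

archive = [
    {"name": "bracket", "volume": 10.0, "cost": 40.0},
    {"name": "flange",  "volume": 25.0, "cost": 75.0},
]
new = {"name": "bracket-v2", "volume": 12.0}
print(estimate_cost(new, archive, similarity))  # scales bracket's cost by 12/10
```

In the Crowdsourced setting discussed below, the `similarity` function would be replaced by aggregated human judgements rather than a computed score.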

Although the advantages of 3D content based retrieval are compelling, it has proved hard to reproduce human perceptions of 3D similarity algorithmically [14]. A typical computational approach to content based retrieval would be to process the 3D models to identify certain characteristic features (ideally quantitative properties) against which candidate models would be matched; the closer the match, the more similar the part. But researchers have struggled to determine quantifiable "features" that robustly reflect the similarity seen by human eyes [15,16].

In contrast, a Crowdsourcing approach does not have to explicitly identify or compute any features, but simply allows the human workers to visually compare shapes within a web browser. Although simple in principle, early prototypes of the HIT made two important implementation details clear:

(1) Using 2D images of 3D models allowed the maximum number of components to be displayed (and compared) in a web browser window at one time.

(2) 2D images had to be carefully selected for the details of the shape to be clearly seen.

Consequently, before the similarity of 3D parts can be Crowdsourced, the visibility of their most important features has to be assured by identifying a characteristic 2D image of each of the models. However, a "good view" (i.e. canonical view) of a 3D object is not a property that can be robustly determined algorithmically. Therefore, before assessing the similarity of 3D parts, canonical views of the parts had to be Crowdsourced. The overall process is shown schematically in Fig. 3.

2.1. Crowdsourcing canonical viewpoints

One of the important results established by psychophysical studies of image understanding in the 1980s was the realisation that people recognise objects more easily if they have a certain orientation. Using the mTurk API, a HIT to determine the canonical view of a 3D object was designed and implemented (Fig. 4). The task showed an animation of an object rotating alternately about two axes on the top left of the page, and to the right of this image lay a grid of 12 images. Ideally, interactive 3D models would have been presented to the workers, but the implementation route is

Fig. 4. The canonical viewpoint HIT.


dictated by the capabilities of the mTurk API, which has to deliver tasks robustly to any web browser, across variable Internet speeds, and run on any PC. Consequently 3D models were not considered a viable option, hence the animated 2D images. Each worker was asked to select the three most representative views, in order. Each HIT consisted of five such tasks (i.e. five different shapes) arranged vertically on a web page. A set of 20 HITs consisted of the same five shapes being presented (in randomly varied order) 20 times to different workers.
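Because each worker returns an ordered top-3 of the 12 candidate views, the responses can be aggregated by a weighted tally. The 3/2/1 (Borda-style) weighting and the ballot data below are assumptions for illustration; the paper itself reports only the raw selection counts.

```python
from collections import defaultdict

# Each worker's answer is an ordered top-3 of the candidate views.
# Aggregate ranked picks with a simple Borda-style weighting
# (first choice = 3 points, second = 2, third = 1); this weighting
# is an assumed example, not the paper's stated method.
ballots = [
    [9, 5, 2],   # worker 1: view 9 first, then view 5, then view 2
    [9, 2, 7],
    [5, 9, 2],
]

scores = defaultdict(int)
for ballot in ballots:
    for rank, view in enumerate(ballot):
        scores[view] += 3 - rank

canonical = max(scores, key=scores.get)
print(canonical, dict(scores))  # 9 {9: 8, 5: 5, 2: 4, 7: 1}
```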

All answers were returned within 1 h 21 min 53 s; 10% of the responses were rejected due to technical problems relating to the type of browser being used. This was a relatively constrained task involving selection from a limited number of alternatives. The workers were paid $0.15 per HIT, and all the results for the first of the sets were returned in 1 h 21 min. Subsequent rounds were completed in less time (set 2: 23 min 45 s; set 3: 41 min 22 s; set 4: 42 min 41 s).

Fig. 5 shows the results for one of the components whose viewpoints were assessed by a HIT. The numbers in brackets are the number of mTurk workers who selected each view as most representative of the component's shape.

Unlike the other Crowdsourced tasks reported here, there is no published benchmark data for quantifying the "correctness" of the views selected. However, it is possible to compare the relative responses (shown in brackets under the images) against one's own judgement as engineers (in the authors' case). In this context the selection of 'Option 9', with nine votes, as the most characteristic view corresponded with the judgement of the CAD/CAM researchers polled internally. Importantly, for the investigation reported in the next section, the study established that the mTurk workers had 'at least' as good a judgement of 3D shapes as the authors. This result was typical; all the viewpoint HITs posted returned a reasonable assessment of the object's shape.

Fig. 5. Typical results for the canonical views HIT (in brackets, the frequency of selection).

2.2. Crowdsourcing 3D shape similarity

Having established the viability of Crowdsourcing "good" viewpoints of 3D models, mTurk workers were given a HIT with a pool of still images of 3D models (Fig. 6). The workers were requested to "put similar looking models together into groups" by clicking on their images and adding them to families of similar shapes displayed in rows. The selected shapes appeared below the main pool of images, in rows associated with the different family groups identified. The workers were asked to continue this process until there were no images left in the pool. The results were aggregated using the method described in [4] and displayed using a dendrogram (Fig. 7).

Using the "solid of revolution" class of parts from the Engineering Shape Benchmark (ESB) [17], 479 parts were presented to workers as a single HIT. 15 HITs were posted; all of them were accepted (i.e. allocated to individual workers on mTurk) in 15 min 02 s and returned in 1 h 49 min 40 s. As there were a large number of parts to cluster, workers had to scroll the HIT page up and down many times. In order to encourage good results, every worker who produced a credible response was paid $12.00. People who did exceptional work (and identified many families of similar shapes) were rewarded with a bonus of $6.00.
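One simple way to turn many workers' groupings into a pairwise similarity matrix is co-occurrence counting: two parts are scored as similar in proportion to how many workers placed them in the same family. The paper's actual aggregation is the method of [4]; the count below is a simplified stand-in, and the part labels and groupings are made up.

```python
from itertools import combinations

# Simplified co-occurrence aggregation of workers' shape groupings.
# groupings[i] is the list of families produced by worker i.
groupings = [
    [{"A", "B"}, {"C", "D"}],      # worker 1's families
    [{"A", "B", "C"}, {"D"}],      # worker 2's families
]

parts = ["A", "B", "C", "D"]
cooccur = {pair: 0 for pair in combinations(parts, 2)}
for families in groupings:
    for family in families:
        for pair in combinations(sorted(family), 2):
            cooccur[pair] += 1

print(cooccur[("A", "B")])  # 2 -- both workers grouped A with B
print(cooccur[("C", "D")])  # 1 -- only worker 1 did
```

A matrix of this kind can then be fed to standard hierarchical clustering to produce a dendrogram like the one in Fig. 7.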


Fig. 6. The shape similarity HIT.


A correspondence between the part families identified in the ESB and those sorted by the mTurk workers is described in [18] and summarised here in Table 1.

From this it can be seen that validating the results of the Crowdsourced similarity clusters against the ESB's published groupings produces a correlation of better than 70%. Furthermore, examination of the differences, detailed in [13], shows they arise from inherently ambiguous parts, whose correct assignment is arguable. Overall

Fig. 7. Shape similarity matrix constructed by aggregation of responses and plotted on a dendrogram (truncated version).

the results provided good correspondence to the reported classifications for the ESB.
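A per-class similarity score like those in Table 1 can be computed by matching each benchmark class to the crowd cluster that overlaps it most and reporting the matched fraction. This purity-style measure and the toy data are assumptions for illustration; the actual comparison procedure is described in [18].

```python
# Illustrative "best overlap" agreement between a crowd clustering
# and benchmark classes. Class names and memberships are made up.
benchmark = {"revolute": {"A", "B", "C"}, "prismatic": {"D", "E"}}
crowd = [{"A", "B"}, {"C", "D", "E"}]

def agreement(benchmark, crowd):
    """Percentage of benchmark members captured by best-matching clusters."""
    matched = sum(max(len(cls & cluster) for cluster in crowd)
                  for cls in benchmark.values())
    total = sum(len(cls) for cls in benchmark.values())
    return 100.0 * matched / total

print(round(agreement(benchmark, crowd), 1))  # 80.0
```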

3. Crowdsourcing 2D part nesting

Other AI reasoning applications, such as planning, require very different approaches from those used in classification or recognition problems. Typically, networks of constraints are constructed


Table 1. Percentage difference in clusters: "ESB data" vs "Cluster-HIT".

ESB benchmark data          Approximate similarity between clusters (ESB vs HIT) (%)
Flat-thinwall components    71.5
Rectangular-cubic prism     85.8
Solid of revolution         80.6

Fig. 8. The ‘2D-packing’ HIT.

Fig. 9. Result of a 'Packing' HIT. This mTurk worker achieved an 89.41% packing efficiency in 1 h 6 min 58 s.

1 Mumford-Valenzuela benchmark data (2001): http://dip.sun.ac.za/~vuuren/repositories/levelpaper/WebpageData/nice1.25.xls.


and the "intelligence", or problem solving strategy, is embedded in the algorithm used to navigate this graph [19]. Crowdsourcing offers the possibility of solving planning, and other combinatorially explosive, problems using distributed human labour. To investigate this possibility, we created an experiment that asked mTurk workers to pack 2D shapes into the smallest possible area.

Numerous manufacturing applications need to arrange variable numbers of arbitrary shapes into limited areas or volumes. For example, in the stamping of sheet metal, material can represent 75% or more of the total cost; consequently even small inefficiencies in material utilization can lead to a loss of profit [13]. Computation of a theoretically optimum solution for 2D part nesting is known to be NP-complete, and consequently numerous "good" and "near optimum" solutions are used in practice. However, the ability to improve on even a good solution can often have significant economic benefits (e.g. less waste, higher productivity).
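For strip packing, the efficiency figures quoted later in this section follow the conventional definition: total part area divided by the area of the strip actually used. The numbers below are made up purely to illustrate the calculation.

```python
# Packing efficiency for 2D strip nesting: ratio of total part area
# to the used strip area (width x used length), as a percentage.

def packing_efficiency(part_areas, strip_width, used_length):
    return 100.0 * sum(part_areas) / (strip_width * used_length)

areas = [12.0, 8.5, 20.0, 4.5]               # areas of the nested parts
print(packing_efficiency(areas, 10.0, 5.0))  # 45/50 of the strip -> 90.0
```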

The packing problem occurs in many different applications, from container transportation to the 2D stamping of complex profiles on CNC machines [20,21]. The task tests the crowd's ability to interactively optimize a problem with many complex interactions (and provides a numerical measure of success) by asking workers to create the most compact layout of a set of profiles. The packing HIT is different from the other examples (i.e. canonical views and shape similarity) in that it is not looking for an average or consensus solution, but rather it is seeking the best amongst many attempts. Like many design tasks, results cannot be averaged or aggregated.
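Because nesting results cannot be averaged, aggregation here reduces to best-of-many selection over the scored submissions. The record structure below is an assumed illustration.

```python
# Best-of-many selection for a design-style task: keep the single
# highest-scoring submission rather than any consensus. The
# submission records are illustrative assumptions.
submissions = [
    {"workerID": "W1", "efficiency": 83.2},
    {"workerID": "W2", "efficiency": 89.41},
    {"workerID": "W3", "efficiency": 78.9},
]

best = max(submissions, key=lambda s: s["efficiency"])
print(best["workerID"], best["efficiency"])  # W2 89.41
```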

This HIT page (Fig. 8) was developed using "Flash" software. The workers were given the task of packing objects of different shapes into a rectangular space on the webpage, while ensuring that the overall length of the space used was as small as possible. The shapes of the objects used in the HIT belong to the Mumford-Valenzuela benchmark data set.1 The green shapes could be dragged into the space provided within the red boundaries.

Table 2. Performance of Crowdsourcing against the Albano benchmark.

Efficiency (%)      Used length         Working time (s)    Best result in the literature (Gomes 2006)   Efficiency improvement (%)
Best     Average    Best     Average    Best     Average    Efficiency (%)    Time (s)                   Best     Average
89.41    83.57      177.0    189.8      4018     3522       87.43             2257                       2.26     −4.41


The workers were able to rotate and nudge the objects to allow denser packing. The length of the space used was displayed dynamically while the objects were packed, so the worker could manipulate their positions and orientations in order to get an efficient packing. Once they were satisfied with the packing they could upload the results using the "Submit" button on the HIT page.

In this case, workers were paid $3, with a bonus of $0.5 per 0.5% of improvement on the 'best result' (the value was displayed at the start of the job), and spent between 26 min 20 s and 1 h 24 min on the task. The work submitted was scored and classified according to the density of the packing, in %, and the time taken to return the task (Fig. 9 shows an example). There was not a strong correlation between the outcome of this experiment and the educational background of the 'workers', their age, time spent at mTurk, or even the country where they lived. From a set of ten HITs, two workers produced results that improved on the best known arrangement.
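The payment rule above ($3 base plus $0.50 per 0.5% of improvement) can be written down directly. How partial steps are rounded is not stated in the paper, so flooring to whole 0.5% steps is assumed here.

```python
# Payment rule sketch: $3 base fee plus $0.50 for every full 0.5% of
# efficiency improvement over the previous best. Flooring of partial
# steps is an assumption; the paper does not specify the rounding.

def payment(efficiency, previous_best, base=3.0, bonus=0.5, step=0.5):
    improvement = max(0.0, efficiency - previous_best)
    return base + bonus * int(improvement / step)

print(payment(89.41, 87.43))  # 1.98% better -> three full 0.5% steps -> 4.5
```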

In order to quantify the effectiveness and performance of the Crowdsourcing approach for two-dimensional strip packing, several experiments were conducted using the Albano data set [22], which comes from the textile industry and comprises a total of 24 items.

Table 2 summarises our initial results from the experiments with the Albano packing problem using the proposed Crowdsourcing approach. Table 2 also gives a comparison with the best results available from the literature, which correspond to the simulated annealing hybrid algorithm (SAHA) [23]. From these results it can be observed that the Crowdsourcing approach, although 78% slower, improved on the best algorithmic result published in the literature for this data set.
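The improvement and slowdown figures in Table 2 follow directly from the raw numbers as relative percentage changes against the SAHA result:

```python
# Reproducing Table 2's derived figures as relative percentage changes
# against the SAHA benchmark (efficiency 87.43%, time 2257 s).

def pct_change(new, ref):
    return 100.0 * (new - ref) / ref

print(round(pct_change(89.41, 87.43), 2))  # best-efficiency improvement: 2.26
print(round(pct_change(83.57, 87.43), 2))  # average-efficiency change: -4.41
print(round(pct_change(4018, 2257)))       # best-time slowdown: about 78
```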

4. Discussion

In this paper, we have shown how Crowdsourcing can be used to carry out a variety of geometric reasoning tasks. This is a powerful testament to the flexibility of the process presented here. Although the design and coding required to implement the HITs was not trivial, it was considerably easier than the development of new algorithms for machine cognition and learning. The crucial aspect was to formulate the right question and carefully consider the instructions provided to the mTurk workers.

In a knowledge-based factory scenario, the definition of ‘‘right”, or appropriate, questions to be crowdsourced dovetails with some of the machine learning techniques described in Section 1.1. The formulation of appropriate questions could be channelled towards harnessing the crowd’s judgement and knowledge, so that the system can react to any scenario in which a problem arises, with the most appropriate solution in each case provided by the crowd.

The standing problem for Crowdsourcing activities lies in the difficulty of finding the ‘‘right” crowd, especially if the task to be solved requires a ‘special talent’. In the work reported here there was no pre-qualification task to determine the skill levels of workers, so the results reported were generated by anyone who accepted the task. For a knowledge-based factory setting, this might be a drawback when tasks require highly specialised judgement. However, for those other situations where less specific, more ‘common sense’ or tacit knowledge is required, outsourcing a job to generate a cluster of proposed solutions from the crowd might be viable. But regardless of the complexity of the task, the quantification of the Crowd’s performance will help give confidence that the ‘‘right” workers are being employed, and it is clear that the academic literature is starting to provide exemplars of how Crowdsourced results can be statistically analysed [24].
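One simple way to quantify the Crowd’s performance is to screen each batch of submissions statistically. The sketch below is our own illustration, not a method from the paper: it flags workers whose packing efficiency is not unusually poor relative to the batch, using a minimal z-score filter with an assumed threshold.

```python
from statistics import mean, stdev

def reliable_workers(scores, z_threshold=-1.0):
    """Return the workers whose score is within z_threshold standard
    deviations of the batch mean (a hypothetical per-batch screen;
    the threshold value is an assumption)."""
    mu, sigma = mean(scores.values()), stdev(scores.values())
    return {w for w, s in scores.items()
            if sigma == 0 or (s - mu) / sigma >= z_threshold}

# Illustrative packing efficiencies (%) for four hypothetical workers
batch = {"w1": 89.4, "w2": 83.6, "w3": 62.0, "w4": 85.1}
print(sorted(reliable_workers(batch)))  # → ['w1', 'w2', 'w4']
```

In a production setting such a screen could feed back into worker qualification, so that future HITs are offered preferentially to workers with a track record of good results.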

How to embed knowledge-based resources in a company’s systems (e.g. work flow) remains a research question. In addition to recognising the value of having ‘‘humans in the loop”, enterprises wishing to exploit Crowdsourcing will also have to tackle the issue of confidentiality. How ‘‘safe” is it to let various component details be processed outside the factory? One suspects that, given the volumes of traffic through systems like mTurk, there is little real danger, but establishing this to the satisfaction of a company board will be challenging.

Further work in this area includes the exploitation of Crowdsourcing in the gathering of solution strategies and examples, rather than just recording the end result. Indeed, understanding the problem-solving process used by humans is often a first step towards teaching machines how to learn and acquire the knowledge to support the process [25]. Any database of knowledge will be better the more complete it is and the greater the variety of solution examples it contains. Crowdsourcing could not only be used to generate the large pool of ‘solutions’ needed but also help retrieve (i.e. identify) the best solutions for a given problem.
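The idea of pooling crowd-generated solutions and retrieving the best for a given problem can be sketched as a scored store. The class and method names below are our own; the scores shown are the efficiencies from Table 2.

```python
import heapq

class SolutionPool:
    """A minimal store for crowd-generated solutions, keyed by score,
    so the best can be retrieved for a given problem (a sketch of the
    idea in the text; this is not an implementation from the paper)."""
    def __init__(self):
        self._pools = {}  # problem id -> heap of (-score, solution)

    def add(self, problem, score, solution):
        heapq.heappush(self._pools.setdefault(problem, []), (-score, solution))

    def best(self, problem, k=1):
        # Return the k highest-scoring solutions without removing them
        return [s for _, s in heapq.nsmallest(k, self._pools.get(problem, []))]

pool = SolutionPool()
pool.add("albano-nesting", 87.43, "SAHA layout")
pool.add("albano-nesting", 89.41, "crowd worker layout")
print(pool.best("albano-nesting"))  # → ['crowd worker layout']
```

A store of this kind would let the retrieval step itself be crowdsourced: workers could be asked to rank the stored solutions, with the aggregate ranking replacing the numeric score.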

There is also considerable scope for further academic investigation of the reward schemes used to motivate the Crowd. In our work generous bonuses were offered to encourage quality work, but we did not explore the limits or variations of payment schemes. The relationship between payment levels, speed of response and quality of results will require large-scale, sustained trials. The authors are currently exploring funding opportunities to carry out this sort of investigation.

5. Conclusions

Examples of crowdsourced work for various mechanical CAD/CAM applications have been presented in this paper. Beyond simply establishing that the approach produced surprisingly good results, we learnt that it was important to present the ‘‘right question” to the crowd. Therefore, each sophisticated job was simplified into several steps to ensure clarity and comprehension by the workers. Interestingly, as with Crowdsourcing applications in other domains, we have shown that preparation work (e.g. best view selection) for a more complex task (shape similarity classification) can be successfully Crowdsourced, assisting the cognitive activity in an integrated bottom-up fashion. We believe the work on 2D part nesting shows that there is scope for the broad application of Crowdsourcing to the solution of NP-complete problems in CAD/CAM.

In the long term, the methods presented here could also be used for building up the databases of ‘solutions and decisions’ that machine intelligence requires. In other words, an Internet crowd could be used for the generation of ‘‘cases”, by exposing it to the decision-making situations the system will encounter. Once analysed and amalgamated, these could be stored and embedded in the system’s knowledge bases, from which they can be pulled and put into action when necessary. In this way the crowd, a ‘‘knowledge network”, becomes the solution provider.


References

[1] A.H.B. Duffy, The ‘‘What” and ‘‘How” of learning in design, IEEE Expert: Intelligent Systems and Their Applications 12 (3) (1997) 71–76.

[2] J. Howe, The rise of Crowdsourcing, Wired, issue 14.06, http://www.wired.com/wired/archive/14.06/crowds_pr.html (accessed 17.9.2009).

[3] P. Jagadeesan, J. Wenzel, et al., Geometric reasoning via Internet CrowdSourcing, in: 2009 SIAM/ACM Joint Conference on Geometric & Physical Modeling, San Francisco, CA, 2009.

[4] P. Jagadeesan, J. Wenzel, et al., Validation of Purdue engineering shape benchmark clusters by Crowdsourcing, in: International Conference on Product Lifecycle Management, Bath, UK, 2009.

[5] J.R. Corney, C. Torres-Sanchez, A. Jagadeesan, R. Prasanna, C. William, Outsourcing labour to the cloud, International Journal of Innovation and Sustainable Development 4 (4) (2010) 294–313.

[6] CambrianHouse, www.cambrianhouse.com/ (accessed 8.9.2009).

[7] D.C. Brabham, Moving the crowd at iStockphoto: the composition of the crowd and motivations for participation in a Crowdsourcing application, First Monday 13 (6) (2008).

[8] O. Alonso, S. Mizzaro, Relevance criteria for e-commerce: a Crowdsourcing-based experimental analysis, in: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, Boston, MA, USA, 2009, pp. 760–761.

[9] V. Kostakos, Is the crowd’s wisdom biased? A quantitative analysis of three online communities, in: International Symposium on Social Intelligence and Networking (SIN09), Vancouver, Canada, 2009.

[10] A. Kittur, E.H. Chi, et al., Crowdsourcing user studies with Mechanical Turk, in: Proceedings of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems, ACM, Florence, Italy, 2008.

[11] S.C.F. Chan, V.T.Y. Ng, et al., A solid modeling library for the World Wide Web, Computer Networks and ISDN Systems 30 (20–21) (1998) 1853–1863.

[12] H. Rea, J. Corney, et al., Commercial and business issues in the e-sourcing and reuse of mechanical components, The International Journal of Advanced Manufacturing Technology 30 (9) (2006) 952–958.

[13] U. Alva, S.K. Gupta, Automated design of sheet metal punches for bending multiple parts in a single setup, Robotics and Computer-Integrated Manufacturing 17 (1–2) (2001) 33–47.

[14] J.W.H. Tangelder, R.C. Veltkamp, A survey of content based 3D shape retrieval methods, Multimedia Tools and Applications 39 (3) (2008) 441–471.

[15] A. Golovinskiy, W. Matusik, et al., A statistical model for synthesis of detailed facial geometry, in: ACM SIGGRAPH 2006 Papers, ACM, Boston, Massachusetts, 2006, pp. 1025–1034.

[16] J. Wang, Y. He, H. Tian, H. Cai, Retrieving 3D CAD model by freehand sketches for design reuse, Advanced Engineering Informatics 22 (3) (2008) 385–392.

[17] S. Jayanti, Y. Kalyanaraman, et al., Developing an engineering shape benchmark for CAD models, Computer-Aided Design 38 (9) (2006) 939–953.

[18] A.P. Jagadeesan, J. Wenzel, et al., Fast human classification of 3D object benchmarks, in: I. Pratikakis, M. Spagnuolo, T. Theoharis, R. Veltkamp (Eds.), Proceedings of the EUROGRAPHICS Workshop on 3D Object Retrieval, 2010.

[19] J. Mula, R. Poler, J.P. García-Sabater, F.C. Lario, Models for production planning under uncertainty: a review, International Journal of Production Economics 103 (1) (2006) 271–285.

[20] J. Puchinger, G.R. Raidl, Models and algorithms for three-stage two-dimensional bin packing, European Journal of Operational Research 183 (3) (2007) 1304–1327.

[21] S.Q. Xie, G.G. Wang, Y. Liu, Nesting of two-dimensional irregular parts: an integrated approach, International Journal of Computer Integrated Manufacturing 20 (8) (2007) 741–756.

[22] A. Albano, G. Sapuppo, Optimal allocation of two-dimensional irregular shapes using heuristic search methods, IEEE Transactions on Systems, Man and Cybernetics SMC-10 (1980) 242–248.

[23] A.M. Gomes, J.F. Oliveira, Solving irregular strip packing problems by hybridising simulated annealing and linear programming, European Journal of Operational Research 171 (2006) 811–829.

[24] C. Callison-Burch, Fast, cheap, and creative: evaluating translation quality using Amazon’s Mechanical Turk, in: Proceedings of EMNLP 2009, ACL and AFNLP, 2009, pp. 286–295.

[25] R. Sung, J.M. Ritchie, H.J. Rea, J. Corney, Automated design knowledge capture and representation in single-user CAD environments, Journal of Engineering Design, available online 19 February 2010.