
Research Article

Real-Time Large Crowd Rendering with Efficient Character and Instance Management on GPU

Yangzi Dong and Chao Peng

Department of Computer Science, University of Alabama in Huntsville, Huntsville, AL 35899, USA

Correspondence should be addressed to Yangzi Dong: yangzi.dong@uah.edu

Received 20 October 2018; Revised 26 January 2019; Accepted 3 March 2019; Published 26 March 2019

Academic Editor: Ali Arya

Copyright © 2019 Yangzi Dong and Chao Peng. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Achieving the efficient rendering of a large animated crowd with realistic visual appearance is a challenging task when players interact with a complex game scene. We present a real-time crowd rendering system that efficiently manages multiple types of character data on the GPU and integrates seamlessly with level-of-detail and visibility culling techniques. The character data, including vertices, triangles, vertex normals, texture coordinates, skeletons, and skinning weights, are stored as either buffer objects or textures in accordance with their access requirements at the rendering stage. Our system preserves the view-dependent visual appearance of individual character instances in the crowd and is executed with a fine-grained parallelization scheme. We compare our approach with existing crowd rendering techniques. The experimental results show that our approach achieves better rendering performance and visual quality. Our approach is able to render a large crowd composed of tens of thousands of animated instances in real time by managing each type of character data in a single buffer object.

1. Introduction

Crowd rendering is an important form of visual effects. In video games, thousands of computer-articulated polygonal characters with a variety of appearances can be generated to inhabit a virtual scene like a village, a city, or a forest. Movements of the crowd are usually programmed through a crowd simulator [1-4] with given goals. To achieve a realistic visual approximation of the crowd, each character is usually tessellated with tessellation algorithms [5], which increase the character's mesh complexity to a sufficient level so that fine geometric details and smooth mesh deformations can be preserved in the virtual scene. As a result, the virtual scene may end up with a composition of millions, or even hundreds of millions, of vertices and triangles. Rasterizing such a massive amount of vertices and triangles into pixels incurs a high computational cost. Also, the amount of memory required to store them may exceed the storage capacity of the graphics hardware. Thus, in the production of video games [6-9], advanced crowd rendering technologies are needed in order to increase the rendering speed and reduce memory consumption while preserving the crowd's visual fidelity.

To improve the diversity of character appearances in a crowd, a common method is to duplicate a character's mesh many times and then assign each duplicate a different texture and a varied animation. Some advanced methods allow developers to modify the shape proportions of the duplicates and then retarget rigs and animations to the modified meshes [10, 11] or synthesize new motions [12, 13]. With the support of hardware-accelerated geometry-instancing and pseudo-instancing techniques [9, 14-16], the multiple data of a character, including vertices, triangles, textures, skeletons, skinning weights, and animations, can be cached in the memory of a graphics processing unit (GPU). Each time the virtual scene needs to be rendered, the renderer alters and assembles those data dynamically without the need to fetch them from CPU main memory. However, storing the duplicates on the GPU consumes a large amount of memory and limits the number of instances that can be rendered. Furthermore, even though the instancing technique reduces the CPU-GPU

Hindawi International Journal of Computer Games Technology, Volume 2019, Article ID 1792304, 15 pages. https://doi.org/10.1155/2019/1792304


communication overhead, it may suffer from the lack of dynamic mesh adaptation (e.g., continuous level-of-detail).

In this work, we present a rendering system which achieves a real-time rendering rate for a crowd composed of tens of thousands of animated characters. The system ensures full utilization of GPU memory and computational power through integration with continuous level-of-detail (LOD) and View-Frustum Culling techniques. The size of memory allocated for each character is adjusted dynamically in response to the change of levels of detail as the camera's viewing parameters change. The scene of the crowd may end up with more than one hundred million triangles. Different from existing instancing techniques, our approach is capable of rendering all different characters through a single buffer object for each type of data. The system encapsulates the multiple data of each unique source character into buffer objects and textures, which can then be accessed quickly by shader programs on the GPU as well as maintained efficiently by a general-purpose GPU programming framework.

The rest of the paper is organized as follows. Section 2 reviews previous work on crowd simulation and crowd rendering. Section 3 gives an overview of our system's rendering pipeline. In Section 4, we describe the fundamentals of continuous LOD and animation techniques and discuss their parallelization on the GPU. Section 5 describes how to process and store the source character's multiple data and how to manage instances on the GPU. Section 6 presents our experimental results and compares our approach with existing crowd rendering techniques. We conclude our work in Section 7.

2. Related Work

Simulation and rendering are the two primary computing components in a crowd application. They are often tightly integrated as an entity to enable a special form of in situ visualization, which in general means data is rendered and displayed by a renderer in real time while a simulation is running and generating new data [17-19]. One example is the work presented by Hernandez et al. [20], which simulated a wandering crowd behavior and visualized it using animated 3D virtual characters on GPU clusters. Another example is the work presented by Perez et al. [21], which simulated and visualized crowds in a virtual city. In this section, we first briefly review some previous work contributing to crowd simulation. Then, more related to our work, we focus on acceleration techniques contributing to crowd rendering, including level-of-detail (LOD), visibility culling, and instancing techniques.

A crowd simulator uses macroscopic algorithms (e.g., continuum crowds [22], aggregate dynamics [23], vector fields [24], and navigation fields [25]) or microscopic algorithms (e.g., morphable crowds [26] and socially plausible behaviors [27]) to create crowd motions and interactions. Outcomes of the simulator are usually a successive sequence of time frames, and each frame contains arrays of positions and orientations in the 3D virtual environment. Each pair of position and orientation information defines the global status of a character at a given time frame. McKenzie et al. [28] developed a crowd simulator to generate noncombatant civilian behaviors, which is interoperable with a simulation of modern military operations. Zhou et al. [29] classified the existing crowd modeling and simulation technologies based on the size and time scale of the simulated crowds and evaluated them based on their flexibility, extensibility, execution efficiency, and scalability. Zhang et al. [30] presented a unified interaction framework on the GPU to simulate the behavior of a crowd at interactive frame rates in a fine-grained parallel fashion. Malinowski et al. [31] were able to perform large-scale simulations that resulted in tens of thousands of simulated agents.

Visualizing a large number of simulated agents using animated characters is a challenging computing task and worth an in-depth study. Beacco et al. [9] surveyed previous approaches for real-time crowd rendering. They reviewed and examined existing acceleration techniques and pointed out that LOD techniques have been used widely in order to achieve high rendering performance, where a far-away character can be represented with a coarse version of the character as the alternative for rendering. A well-known approach is using discrete LOD representations, which are a set of offline-simplified versions of a mesh. At the rendering stage, the renderer selects a desired version and renders it without any additional processing cost at runtime. However, discrete LODs require too much memory for storing all simplified versions of the mesh. Also, as mentioned by Cleju and Saupe [32], discrete LODs could cause "popping" visual artifacts because of the unsmooth shape transitions between simplified versions. Dobbyn et al. [33] introduced a hybrid rendering approach that combines image-based and geometry-based rendering techniques. They evaluated the rendering quality and performance in an urban crowd simulation. In their approach, the characters in the distance were rendered with image-based LOD representations, and they were switched to geometric representations when they came within a closer distance. Although the visual quality seemed better than using discrete LOD representations, popping artifacts also occurred when the renderer switched content between image representations and geometric representations. Ulicny et al. [34] presented an authoring tool to create crowd scenes of thousands of characters. To provide users immediate visual feedback, they used low-poly meshes for source characters. The meshes were kept in OpenGL's display lists on the GPU for fast rendering.

Characters in a crowd are polygonal meshes. The mesh is rigged by a skeleton. Rotations of the skeleton's joints transform surrounding vertices, and subsequently the mesh can be deformed to create animations. While LOD techniques for simplifying general polygonal meshes have been studied maturely (e.g., progressive meshes [35], quadric error metrics [36]), not many existing works have studied how to simplify animated characters. Landreneau and Schaefer [37] presented mesh simplification criteria to preserve the deforming features of animations on simplified versions of the mesh. Their approach was developed based on quadric error metrics, and they added simplification criteria with the consideration of vertices' skinning weights from the joints of the skeleton and


the shape deviation between the character's rest pose and a deformed shape in animations. Their approach produced more accurate animations for dynamically simplified characters than many other LOD-based approaches, but it incurred a higher computational cost, so it may be challenging to integrate their approach into a real-time application. Willmott [38] presented a rapid algorithm to simplify animated characters. The algorithm was developed based on the idea of vertex clustering. The author mentioned the possibility of implementing the algorithm on the GPU. However, in comparison to the algorithm with progressive meshes, it did not produce well-simplified characters that preserve the fine features of character appearance. Feng et al. [39] employed triangular-char geometry images to preserve the features of both static and animated characters. Their approach achieved high rendering performance by implementing geometry images with multiple resolutions on the GPU. In their experiments, they demonstrated a real-time rendering rate for a crowd composed of 15.3 million triangles. However, there could be a potential LOD adaptation issue if the geometry images become excessively large. Peng et al. [8] proposed a GPU-based LOD-enabled system to render crowds, along with a novel texture-preserving algorithm for simplified versions of the character. They employed a continuous LOD technique to refine or reduce mesh details progressively at runtime. However, their approach was based on the simulation of single virtual humans; instantiating and rendering multiple types of characters were not possible in their system. Savoy et al. [40] presented a web-based crowd rendering system that employed discrete LOD and instancing techniques.

Visibility culling is another type of acceleration technique for crowd rendering. With visibility culling, a renderer is able to reject a character from the rendering pipeline if it is outside the view frustum or blocked by other characters or objects. Visibility culling techniques do not cause any loss of visual fidelity on visible characters. Tecchia et al. [41] performed efficient occlusion culling for a highly populated scene. They subdivided the virtual environment map into a 2D grid and used it to build a KD-tree of the virtual environment. The large and static objects in the virtual environment, such as buildings, were used as occluders. Then, an occlusion tree was built at each frame and merged with the KD-tree. Barczak et al. [42] integrated GPU-accelerated View-Frustum Culling and occlusion culling techniques into a crowd rendering system. The system used a vertex shader to test whether or not the bounding sphere of a character intersects with the view frustum. A hierarchical Z-buffer image was built dynamically in a vertex shader in order to perform occlusion culling. Hernandez and Rudomin [43] combined View-Frustum Culling and LOD Selection; desired detail levels were assigned only to the characters inside the view frustum.

Instancing techniques have been commonly used for crowd rendering. Their execution is accelerated by GPUs with graphics APIs such as DirectX and OpenGL. Zelsnack [44] presented coding details of the pseudo-instancing technique using the OpenGL shading language (GLSL). The pseudo-instancing technique requires per-instance calls sent to and executed on the GPU. Carucci [45] introduced the geometry-instancing technique, which renders all vertices and triangles of a crowd scene through a geometry shader using one call. Millan and Rudomin [14] used the pseudo-instancing technique for rendering full-detail characters which were closer to the camera. The far-away characters with low details were rendered using impostors (an image-based approach). Ashraf and Zhou [46] used a hardware-accelerated method through programmable shaders to animate crowds. Klein et al. [47] presented an approach to render configurable instances of 3D characters for the web. They improved XML3D to store 3D content in a more efficient way in order to support an instancing-based rendering mechanism. However, their approach lacked support for multiple character assets.

3. System Overview

Our crowd rendering system first preprocesses the source characters and then performs runtime tasks on the GPU. Figure 1 illustrates an overview of our system. Our system integrates View-Frustum Culling and continuous LOD techniques.

At the preprocessing stage, a fine-to-coarse progressive mesh simplification algorithm is applied to every source character. In accordance with the edge-collapsing criteria [8, 35], the simplification algorithm selects edges and collapses them by merging adjacent vertices iteratively, and then removes the triangles containing the collapsed edges. The edge-collapsing operations are stored as data arrays on the GPU. Vertices and triangles can be recovered by splitting the collapsed edges and are restored with respect to the order of applying coarse-to-fine splitting operations. Vertex normal vectors are used in our system to determine a proper shading effect for the crowd. A bounding sphere is computed for each source character. It tightly encloses all vertices in all frames of the character's animation. The bounding sphere will be used during the runtime to test an instance against the view frustum. Note that bounding spheres may be of different sizes because the sizes of the source characters may be different. Other data, including textures, UVs, skeletons, skinning weights, and animations, are packed into textures. They can be accessed quickly by shader programs and the general-purpose GPU programming framework during the runtime.
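As a small illustration of the bounding-sphere preprocessing described above, the sphere enclosing all vertices across all animation frames can be computed with a centroid-based fit. This is only a sketch with hypothetical names: the paper does not specify its sphere-fitting method, and the centroid-based sphere used here is a safe bound rather than the minimal enclosing sphere.

```python
import math

def animation_bounding_sphere(frames):
    """Compute a sphere enclosing every vertex in every animation frame.

    `frames` is a list of frames; each frame is a list of (x, y, z) vertex
    positions. The sphere is centered at the centroid of all positions,
    with radius reaching the farthest vertex -- conservative, not minimal.
    """
    pts = [p for frame in frames for p in frame]
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    cz = sum(p[2] for p in pts) / n
    radius = max(math.dist((cx, cy, cz), p) for p in pts)
    return (cx, cy, cz), radius
```

Because the sphere covers all frames of the animation, it never has to be recomputed at runtime when the instance switches animation frames.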

The runtime pipeline of our system is executed on the GPU through five parallel processing components. We use an instance ID in shader programs to track the index of each instance, which corresponds to the occurrence of a source character at a global location and orientation in the virtual scene. A unique source character ID is assigned to each source character, which is used by an instance to index back to the source character it is instantiated from. We assume that the desired number of instances is provided by users as a parameter in the system configuration. The global positions and orientations of instances simulated by a crowd simulator are passed into our system as input. They determine where the instances should occur in the virtual scene. The component of View-Frustum Culling determines the visibility of instances. An instance is considered to be visible if its bounding sphere is inside or intersects with the

Figure 1: The overview of our system. Source character preprocessing on the CPU (vertices/triangles, vertex normals, textures/UVs, skeletons/skinning weights, and animation; fine-to-coarse simplification producing coarse-to-fine reordered primitives, edge collapses, and bounding spheres) feeds buffer objects and textures in the GPU's global memory, which drive the runtime pipeline on the GPU (the parallel processing components View-Frustum Culling, LOD Selection, LOD Mesh Generation, Animating Meshes, and Rendering), constrained by the primitive budget and the global positions and orientations of instances.

view frustum. The component of LOD Selection determines the desired detail level of the instances. It is executed with an instance-level parallelization. A detail level is represented as the numbers of selected vertices and triangles, which are assembled from the vertex and triangle repositories. The component of LOD Mesh Generation produces LOD meshes using the selected vertices and triangles. The size of GPU memory may not be enough to store all the instances at their finest levels. Thus, a configuration parameter called the primitive budget is passed into the runtime pipeline as a global constraint to ensure the generated LOD meshes fit into the GPU memory. The component of Animating Meshes attaches skeletons and animations to the simplified versions (LOD meshes) of the instances. At the end of the pipeline, the Rendering component rasterizes the LOD meshes along with the appropriate textures and UVs and displays the result on the screen.
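The View-Frustum Culling component amounts to testing each instance's bounding sphere against the six frustum planes. A minimal sketch (hypothetical names; it assumes the planes are given as normalized plane equations with inward-facing normals, and plane extraction from the camera matrices is not shown):

```python
def sphere_in_frustum(center, radius, planes):
    """Return True if a bounding sphere is inside or intersects the frustum.

    `planes` are six (a, b, c, d) tuples with inward-facing unit normals,
    so a point p is inside a plane when a*px + b*py + c*pz + d >= 0.
    The sphere is culled only if it lies entirely outside some plane.
    """
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False  # completely on the outside of this plane
    return True
```

On the GPU this test runs once per instance in parallel; instances that fail it receive a vertex budget of zero in the subsequent LOD Selection step.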

4. Fundamentals of LOD Selection and Character Animation

During the preprocessing step, the mesh of each source character is simplified by collapsing edges. As in existing work, the collapsing criteria in our approach preserve features at high-curvature regions [39] and avoid collapsing the edges on or crossing texture seams [8]. Edges are collapsed one by one. We utilized the same method presented in [48], which saves the collapsing operations into an array structure suitable for the GPU architecture. The index of each array element represents the source vertex, and the value of the element represents the target vertex it merges to. By using the array of edge collapses, the repositories of vertices and triangles are rearranged in an increasing order, so that at runtime the desired complexity of a mesh can be generated by selecting a successive sequence of vertices and triangles from the repositories. Then, the skeleton-based animations are applied to deform the simplified meshes. Figure 2 shows the different levels of detail of several source characters that are used in our work. In this section, we briefly review the techniques of LOD Selection and character animation.
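The edge-collapse array just described can be sketched as follows: `ecol[v]` stores the target vertex that `v` merges into, and, because the repositories are reordered, an LOD mesh with `v_num` vertices keeps exactly the indices below `v_num`. Triangle indices referencing removed vertices are remapped by chasing the collapse chain. This is a simplified, hypothetically named sketch of the idea in [48], not the authors' implementation:

```python
def resolve_vertex(v, ecol, v_num):
    """Follow the collapse chain until reaching a vertex kept at this LOD.

    `ecol[v]` is the vertex that v merges into; after reordering, only
    vertex indices smaller than v_num survive at the chosen detail level.
    """
    while v >= v_num:
        v = ecol[v]
    return v

def lod_triangles(triangles, ecol, v_num):
    """Remap a triangle list to a coarser level, dropping degenerate ones."""
    out = []
    for tri in triangles:
        a, b, c = (resolve_vertex(v, ecol, v_num) for v in tri)
        if a != b and b != c and a != c:  # collapsed edges create degenerates
            out.append((a, b, c))
    return out
```

In the actual system this remapping is what the parallel triangle reformation performs on the GPU, one thread per triangle; the serial loop above only illustrates the data dependency.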

4.1. LOD Selection. Let us denote K as the total number of instances in the virtual scene. A desired level of detail for an instance can be represented as the pair (vNum, tNum), where vNum is the desired number of vertices and tNum is the desired number of triangles. Given a value of vNum, the value of tNum can be retrieved from the prerecorded edge-collapsing information [48]. Thus, the goal of LOD Selection is to determine an appropriate value of vNum for each instance with consideration of the available GPU memory size and the instance's spatial relationship to the camera. If an instance is outside the view frustum, vNum is set to zero. For the instances inside the view frustum, we used the LOD Selection metric in [48] to compute vNum, as shown in

    vNum_i = N * w_i^(1/α) / Σ_{i=1}^{K} w_i^(1/α),  where w_i = (A_i / D_i^P)^(1/β),  β = α − 1.    (1)

Equation (1) is the same as the first-pass algorithm presented by Peng and Cao [48]. It originates from the model perception method presented by Funkhouser et al. [49] and was improved by Peng and Cao [48, 50] to accelerate the rendering of large CAD models. We found that (1) is also a suitable metric for large crowd rendering. In the equation, N refers to the total number of vertices that can be retained on the GPU, which is a user-specified value computed based on the available size of GPU memory. The value of N can be tuned to balance rendering performance and visual quality. w_i is the weight computed with the projected area of the bounding sphere of the i-th instance on the screen (A_i) and


Figure 2: Examples showing different levels of detail for seven source characters. From top to bottom, the numbers of triangles are as follows: (a) Alien: 5688, 1316, 601, and 319; (b) Bug: 6090, 1926, 734, and 434; (c) Daemon: 6652, 1286, 612, and 444; (d) Nasty: 6848, 1588, 720, and 375; (e) Ripper Dog: 4974, 1448, 606, and 309; (f) Spider: 5868, 1152, 436, and 257; (g) Titan: 6518, 1581, 681, and 362.

the distance to the camera (D_i). α is the perception parameter introduced by Funkhouser et al. [49]. In our work, the value of α is set to 3.
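The budget distribution of (1) can be sketched as follows. Function and parameter names are hypothetical; culled instances receive a zero budget, and the exponents follow the notation of (1) with β = α − 1:

```python
def select_lod_vertices(areas, dists, visible, N, alpha=3.0, P=1.0):
    """Distribute a global vertex budget N among K instances (cf. eq. (1)).

    areas[i]   -- projected screen area of instance i's bounding sphere (A_i)
    dists[i]   -- distance from instance i to the camera (D_i)
    visible[i] -- False if the instance was culled (then vNum_i = 0)
    """
    beta = alpha - 1.0
    w = [(a / d ** P) ** (1.0 / beta) if vis else 0.0
         for a, d, vis in zip(areas, dists, visible)]
    total = sum(wi ** (1.0 / alpha) for wi in w)
    if total == 0.0:
        return [0] * len(w)  # everything culled
    return [int(N * (wi ** (1.0 / alpha)) / total) for wi in w]
```

Note how the weights make vNum_i grow with on-screen area and shrink with camera distance, so nearby, large instances receive most of the shared budget N.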

With vNum and tNum, the successive sequences of vertices and triangles are retrieved from the vertex and triangle repositories of the source character. By applying the parallel triangle reformation algorithm [48, 50], the desired shape of the simplified mesh is generated using the selected vertices and triangles.

4.2. Animation. In order to create character animations, each LOD mesh has to be bound to a skeleton, along with skinning weights added to influence the movement of the mesh's vertices. As a result, the mesh is deformed by rotating the joints of the skeleton. As we mentioned earlier, each vertex may be influenced by a maximum of four joints. We want to note that the vertices forming the LOD mesh are a subset of the original vertices of the source character; no new vertex is introduced during the preprocess of mesh simplification. Because of this, we were able to use the original skinning weights to influence the LOD mesh. When the transformations defined in an animation frame are loaded on the joints, the final vertex position is computed by summing the weighted transformations of the skinning joints. Let us denote each of the four joints influencing a vertex v as Jnt_i, where i ∈ [0, 3]. The weight of Jnt_i on the vertex v is denoted as W_{Jnt_i}. Thus, the final position of the vertex v, denoted as P′_v, can be computed by using

    P′_v = G * Σ_{i=0}^{3} W_{Jnt_i} * T_{Jnt_i} * B_{Jnt_i}^{−1} * P_v,  where Σ_{i=0}^{3} W_{Jnt_i} = 1.    (2)

In (2), P_v is the vertex position at the time when the mesh is bound to the skeleton. When the mesh is first loaded, without the use of animation data, the mesh is placed in the initial binding pose. When using an animation, the inverse of the binding pose needs to be multiplied by an animated pose. This is reflected in the equation, where B_{Jnt_i}^{−1} is the inverse of the binding transformation of the joint Jnt_i, and T_{Jnt_i} represents the transformation of the joint Jnt_i from the current frame of the animation. G is the transformation representing the instance's global position and orientation. Note that the transformations B_{Jnt_i}^{−1}, T_{Jnt_i}, and G are represented in the form of a 4 × 4 matrix. The weight W_{Jnt_i} is a single float value, and the four weight values must sum to 1.
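Equation (2) is standard linear blend skinning. A minimal CPU-side sketch (hypothetical names; matrices are nested 4 × 4 lists, and in the real system this runs per vertex in a shader):

```python
def mat_mul(A, B):
    """4x4 matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transform(M, p):
    """Apply a 4x4 matrix to a 3D point (homogeneous w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(M[r][c] * v[c] for c in range(4)) for r in range(3))

def skin_vertex(p_bind, joints, G):
    """Linear blend skinning of one vertex (cf. eq. (2)).

    `joints` is a list of up to four (weight, T, B_inv) tuples, where T is
    the joint's current animation transform and B_inv the inverse of its
    binding-pose transform; the weights are assumed to sum to 1.
    """
    out = (0.0, 0.0, 0.0)
    for w, T, B_inv in joints:
        q = transform(mat_mul(G, mat_mul(T, B_inv)), p_bind)
        out = tuple(o + w * qi for o, qi in zip(out, q))
    return out
```

Each weighted term moves the bind-pose position out of binding space (B_inv), into the animated joint's space (T), and finally into world space (G), exactly mirroring the matrix order in (2).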

5. Source Character and Instance Management

Geometry-instancing and pseudo-instancing techniques are the primary solutions for rendering a large number of instances while allowing the instances to have different global transformations. The pseudo-instancing technique is used in OpenGL and calls instances' drawing functions one by one. The geometry-instancing technique has been included in Direct3D since version 9 and in OpenGL since version 3.3. It advances the pseudo-instancing technique in terms of reducing the number of drawing calls: it supports the use of a single drawing call for all instances of a mesh and therefore reduces the communication cost of sending call requests from the CPU to the GPU, which subsequently increases performance. As regards data storage on the GPU, buffer objects are used for shader programs to access and update data quickly. A buffer object is a continuous memory block on the GPU and allows the renderer to rasterize data in a retained mode. In particular, a vertex buffer object (VBO) stores vertices, and an index buffer object (IBO) stores indices of the vertices that form triangles or other polygonal types used in our system.


The geometry-instancing technique requires a single copy of the vertex data maintained in the VBO, a single copy of the triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are many instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit, due to the large number of vertices and triangles sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a VBO and an IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize it. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.
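To illustrate the single-VBO/IBO layout described above, the following sketch appends every source character's vertices and rebased triangle indices into one shared repository and records per-character base offsets, which is how an instance ID can index back into shared buffers. Names are hypothetical, and this is CPU-side bookkeeping only, with no actual graphics API calls:

```python
class BatchedCharacterBuffer:
    """CPU-side bookkeeping for one shared VBO and IBO (a sketch).

    All source characters live in a single vertex list and a single index
    list; per-character base offsets let an instance's source character ID
    map to the buffer range the renderer should draw.
    """
    def __init__(self):
        self.vertices = []
        self.indices = []
        self.base = {}  # char id -> (vertex offset, index offset, index count)

    def add_character(self, char_id, verts, tris):
        v_off, i_off = len(self.vertices), len(self.indices)
        self.vertices.extend(verts)
        # rebase indices so every character can share one index buffer
        self.indices.extend(v_off + v for tri in tris for v in tri)
        self.base[char_id] = (v_off, i_off, 3 * len(tris))

    def draw_range(self, char_id):
        """Index-buffer range a renderer would draw for one instance."""
        _, i_off, count = self.base[char_id]
        return i_off, count
```

Because each character occupies a contiguous range, per-instance LOD can shrink the drawn range at runtime instead of rebuilding the buffers, which is what distinguishes this layout from plain static batching.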

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeletons, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse into the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4 × 4 matrix. Each joint's binding pose transformation matrix takes four RGBA texels for storage. Each RGBA texel stores a row of the matrix, and each channel stores a single element of the matrix. In OpenGL, the matrices are stored in the texture in the GL_RGBA32F format, which uses a 32-bit floating-point type for each channel in one texel. Let us denote the total number of joints in a skeleton as K. Then, the total number of texels to store the entire skeleton is 4K.

We used the same format as for storing the skeleton to store an animation. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then, the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here, we want to note that each animation frame starts from a new row in the texture.

The skinning weights of each vertex are four values in the range of [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four data elements are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store the skinning weights. The first texel is used to store joint indices, and the second texel is used to store weights.
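As a concrete illustration, the packing layout above can be sketched on the CPU side as follows. This is our own minimal sketch in Python (the real system uploads the resulting arrays as GL_RGBA32F textures); the helper names are ours, not taken from the paper's code.

```python
def pack_skeleton(inv_bind_mats):
    """Pack K inverse binding-pose 4x4 matrices (each a list of 4 rows
    of 4 floats) into a flat list of RGBA texels: 4 texels per joint,
    one matrix row per texel, one element per channel -> 4*K texels."""
    texels = []
    for mat in inv_bind_mats:
        for row in mat:            # each matrix row becomes one RGBA texel
            texels.append(list(row))
    return texels

def pack_skin_weights(joint_ids, weights):
    """Pack per-vertex skinning data into 2 RGBA texels per vertex:
    texel 0 holds the 4 joint indices, texel 1 the 4 matching weights."""
    texels = []
    for ids, w in zip(joint_ids, weights):
        texels.append(list(ids))   # first texel: joint indices
        texels.append(list(w))     # second texel: weights
    return texels
```

An animation texture follows the same per-joint layout as pack_skeleton, repeated Q times (4KQ texels), with each frame starting on a new texture row.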

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into flattened 2D patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) → {(s_i, t_i)}. Inner vertices (those not on boundary edges) have a one-to-one texture mapping. In other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point, which are F(v0) = {u1, u3}, F(v3) = {u10, u12}, F(v4) = {u0, u4, u6}, and F(v7) = {u7, u9, u13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should associate with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process to duplicate boundary vertices and mapped each duplicated one to a unique texture point. By doing this, although the number of vertices is increased due to the cuttings on boundary edges, the number of triangles stays the same as the number of triangles in the original mesh. In our approach, we initialized two arrays to store UV information. One array stores texture coordinates; the other array stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process to duplicate boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z). Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage



Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles. (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) to cut the cube, and the vertices in red are boundary vertices that are mapped into multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and v_i in the parenthesis is the corresponding vertex in the cube.

RebuildMesh(Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts′, Tris′, Norms′)

(1)  checked[texCoordNum] ← false
(2)  for each triangle i of oriTriNum do
(3)    for each vertex j in the ith triangle do
(4)      texPoint_id ← TexInx[3 * i + j]
(5)      if checked[texPoint_id] = false then
(6)        checked[texPoint_id] ← true
(7)        vert_id ← Tris[3 * i + j]
(8)        for each coordinate k of the vertex do
(9)          Verts′[3 * texPoint_id + k] = Verts[3 * vert_id + k]
(10)         Norms′[3 * texPoint_id + k] = Norms[3 * vert_id + k]
(11)       end for
(12)     end if
(13)   end for
(14) end for
(15) Tris′ ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts′ is identical to the number of texture points texCoordNum, and the array of triangles (Tris′) is replaced by the array of indices of the texture points (TexInx).
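Algorithm 1 can be transcribed into a short, runnable sketch. The function and array names mirror the algorithm; the flat [x, y, z, ...] array layout is our assumption for illustration.

```python
def rebuild_mesh(verts, tris, norms, tex_inx, tex_coord_num):
    """UV-guided mesh rebuilding (Algorithm 1): give every texture point
    its own vertex, duplicating vertices that lie on boundary edges.

    verts/norms: flat [x, y, z, ...] arrays indexed by vertex id.
    tris: flat vertex-index triples; tex_inx: flat texture-point triples
    in the same triangle order. Returns (verts', tris', norms')."""
    verts2 = [0.0] * (3 * tex_coord_num)
    norms2 = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    for i in range(len(tris) // 3):          # each triangle
        for j in range(3):                   # each corner
            tex_point_id = tex_inx[3 * i + j]
            if not checked[tex_point_id]:
                checked[tex_point_id] = True
                vert_id = tris[3 * i + j]
                for k in range(3):           # copy position and normal
                    verts2[3 * tex_point_id + k] = verts[3 * vert_id + k]
                    norms2[3 * tex_point_id + k] = norms[3 * vert_id + k]
    # the texture-point index list becomes the new triangle list
    return verts2, list(tex_inx), norms2
```

Note how a vertex referenced by two different texture points is simply written twice, which is exactly the boundary-vertex duplication described above.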

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on

the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. Then, the framework interoperates with the GPU's shader programs and allows shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are assumed to be used in our approach, instances have to be rendered one-by-one, which is the same as the way of rendering animated instances in geometry-instancing and pseudo-instancing techniques. However, our approach needs



Figure 4: An example illustrating the data structures for storing vertices and triangles of three instances in the VBO and IBO, respectively. Those data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store data for all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the number of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, for example, each vNum[i] represents the offset of the vertex count prior to the ith instance, and the number of vertices for the ith instance is (vNum[i + 1] − vNum[i]).
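The effect of the prefix-sum pass can be illustrated with a minimal CPU-side sketch (the actual system runs this on the GPU with CUDA Thrust; the function name is ours):

```python
from itertools import accumulate

def instance_offsets(v_num):
    """Exclusive prefix sum over per-instance vertex counts.

    Given counts [c0, c1, c2], returns [0, c0, c0+c1, c0+c1+c2]:
    instance i's vertices occupy the range [off[i], off[i+1]) in the
    shared VBO, and its count is recovered as off[i+1] - off[i]."""
    return [0] + list(accumulate(v_num))

# e.g. three instances with 520, 310, and 245 selected vertices
off = instance_offsets([520, 310, 245])  # -> [0, 520, 830, 1075]
```

The same scan is applied to the tNum array to place each instance's triangles in the shared IBO.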

Algorithm 2 describes the vertex transformation process, executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 × 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat) computed from a weighted sum of matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input. It returns the values encoded in texels. The Sample() function

is usually provided in a shader programming framework. Different from the rendering of static models, animated characters change geometric shapes over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation. f_id is updated in the main loop during the execution of the program.
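Our reading of GetLoc() can be expressed as a small sketch. The modulo and division operators are implied by the packing layout of Section 5.1 (each item occupies dim consecutive texels, w/dim items per texture row), so treat the exact arithmetic as our reconstruction rather than the paper's verbatim code.

```python
def get_loc(w, h, item_id, dim):
    """Map an item id (vertex or joint) to normalized texel coordinates
    in a packed texture where each item occupies `dim` consecutive
    texels in a row. Returns (unit_x, unit_y, x, y): the per-texel step
    sizes and the coordinates of the item's first texel."""
    unit_x = 1.0 / w
    unit_y = 1.0 / h
    per_row = w // dim                     # items that fit in one row
    x = dim * (item_id % per_row) * unit_x # column of the first texel
    y = (item_id // per_row) * unit_y      # row of the item
    return unit_x, unit_y, x, y
```

For example, in an 8-texel-wide texture with dim = 4, item 3 lands in the second slot of the second row, so its first texel is at x = 0.5, y = 1/h.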

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti graphics card with 6 GB of memory. The rendering system is built with Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at the resolution of 2048 × 2048. We stored those source characters on the GPU. Also, all source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images. The mesh data requires 16.50 MB of memory, which is only 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is only distributed across visible instances.
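The per-instance visibility test is a standard sphere-versus-frustum check; a minimal sketch follows, where the inward-facing plane representation is our assumption rather than a detail taken from the system.

```python
def sphere_in_frustum(center, radius, planes):
    """Conservative sphere-vs-frustum visibility test.

    Each plane is (a, b, c, d) with a unit-length, inward-pointing
    normal, so points inside satisfy a*x + b*y + c*z + d >= 0. The
    sphere is culled only when it lies entirely behind some plane."""
    cx, cy, cz = center
    for a, b, c, d in planes:
        if a * cx + b * cy + c * cz + d < -radius:
            return False  # bounding sphere completely outside this plane
    return True
```

Instances that fail this test are skipped entirely, so the N budget is shared only among the survivors.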

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1000 frames. The entire


GetLoc(Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1/w
(2) unit_y ← 1/h
(3) x ← dim * (id mod (w/dim)) * unit_x
(4) y ← floor(id / (w/dim)) * unit_y

TransformVertices(Input: Verts′, invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts′′)

(1)  for each vertex v_i in Verts′ in parallel do
(2)    Matrix4×4 composedMat ← identity
(3)    unit_x, unit_y, x, y ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)    h ← skinWeights[c_id].height
(5)    Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)    Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)    for each j in 4 do
(8)      Matrix4×4 invBindMat
(9)      unit_x, unit_y, x, y ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)     for each k in 4 do
(11)       invBindMat.row[k] ← Sample(invBindPose[c_id], (k * unit_x + x), y)
(12)     end for
(13)     unit_x, unit_y, x, y ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)     Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)     Matrix4×4 animMat
(16)     for each k in 4 do
(17)       animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)     end for
(19)     composedMat += weights[j] * animMat * invBindMat
(20)   end for
(21)   Verts′′[i] = modelViewProjectionMat * gMat * composedMat * Verts′[i]
(22) end for

Algorithm 2: Transforming vertices of an instance in the vertex shader.

Table 1: The data information of the source characters used in our experiment. Note that the data information represents the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Names of Source Characters   # of Vertices   # of Triangles   # of Texture UVs   # of Joints
Alien                        2846            5688             3334               64
Bug                          3047            6090             3746               44
Creepy                       2483            4962             2851               67
Daemon                       3328            6652             4087               55
Devourer                     2975            5946             3494               62
Fat                          2555            5106             2960               50
Mutant                       2265            4526             2649               46
Nasty                        3426            6848             3990               45
Pangolin                     2762            5520             3257               49
Ripper Dog                   2489            4974             2982               48
Rock                         2594            5184             3103               62
Spider                       2936            5868             3374               38
Titan                        3261            6518             3867               45
Troll                        2481            4958             2939               55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures the total number of selected vertices and triangles is within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances were rendered at the full level of detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The

maximum and minimum numbers of instances inside the view frustum are 29967 and 5038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged numbers of triangles selected during the runtime. As we can see, the higher the value of N is, the more triangles are selected to generate simplified instances, and subsequently the better visual quality is obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum numbers of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path with a total of 30000 instances. The FPS value and the component execution times are averaged over 1000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N     FPS     # of Rendered Triangles (million)   Component Execution Times (millisecond)
              Min      Max      Avg               View-Frustum Culling   LOD Mesh     Animating Meshes
                                                  + LOD Selection        Generation   + Rendering
1     47.91   1.88     9.70     5.20              0.68                   2.87         26.06
5     43.30   6.12     10.14    7.50              0.65                   3.87         26.44
10    32.82   11.64    13.78    13.04             0.65                   6.27         29.40
15    23.00   17.88    20.73    19.54             0.71                   9.45         36.43
20    18.71   23.13    27.73    26.23             0.68                   12.52        42.85


Figure 6: The change of FPS over different values of N by using our approach. The FPS is averaged over the total of 1000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy a desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio is increased to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level. Thus, the execution time of this component does not change as the value of N increases. The component of LOD Mesh Generation is executed in parallel at a triangle level. Its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small. When N is small, many close-up instances end up at the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach

in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost in rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For a fair comparison, in our approach we ensured that all instances were inside the view frustum of the camera by setting a fixed position for the camera and fixed positions for all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method. When an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of points are approximated as sample points to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by using the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than the point-based technique as the number of instances increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.



Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20000. (a) shows the captured image of the rendering result of our approach with N = 5 million. (b) shows the captured image of the rendering result of the pseudo-instancing technique. (c) shows the captured image of the rendering result of the point-based technique. (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated from the pseudo-instancing technique represents the original quality. Our approach can achieve better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, the popping artifact appears when using the point-based technique. This is because the technique uses a limited number of detail levels, as in the technique of discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during the rendering.

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of the GPU's power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach is integrated seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we only used a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We also would like to explore possibilities to transplant our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capability and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental result of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.

14 International Journal of Computer Games Technology

[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing – ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. DeGyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czurylo, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12–14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM '07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



communication overhead, it may suffer from the lack of dynamic mesh adaptation (e.g., continuous level-of-detail).

In this work, we present a rendering system which achieves a real-time rendering rate for a crowd composed of tens of thousands of animated characters. The system ensures a full utilization of GPU memory and computational power through the integration with continuous level-of-detail (LOD) and View-Frustum Culling techniques. The size of memory allocated for each character is adjusted dynamically in response to the change of levels of detail as the camera's viewing parameters change. The scene of the crowd may end up with more than one hundred million triangles. Different from existing instancing techniques, our approach is capable of rendering all different characters through a single buffer object for each type of data. The system encapsulates the multiple data of each unique source character into buffer objects and textures, which can then be accessed quickly by shader programs on the GPU as well as maintained efficiently by a general-purpose GPU programming framework.
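To illustrate the single-buffer idea, the following is a minimal CPU-side sketch (not the paper's actual implementation): when every instance's data lives in one shared buffer, each instance only needs an offset into that buffer, which is an exclusive prefix sum over the per-instance element counts. The function name and layout here are hypothetical.

```python
from itertools import accumulate

def pack_offsets(vertex_counts):
    """Exclusive prefix sum over per-instance vertex counts: returns the
    starting offset of each instance inside one shared buffer, plus the
    total buffer size in elements."""
    offsets = [0] + list(accumulate(vertex_counts))[:-1]
    total = sum(vertex_counts)
    return offsets, total

# Three instances with LOD-dependent vertex counts sharing one buffer;
# instance i occupies [offsets[i], offsets[i] + vertex_counts[i]).
offsets, total = pack_offsets([120, 45, 300])  # → ([0, 120, 165], 465)
```

Because the counts change per frame as levels of detail change, recomputing these offsets each frame is what lets the buffer be repacked without per-instance allocations.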

The rest of the paper is organized as follows. Section 2 reviews previous work on crowd simulation and crowd rendering. Section 3 gives an overview of our system's rendering pipeline. In Section 4, we describe the fundamentals of continuous LOD and animation techniques and discuss their parallelization on the GPU. Section 5 describes how to process and store the source characters' multiple data and how to manage instances on the GPU. Section 6 presents our experimental results and compares our approach with existing crowd rendering techniques. We conclude our work in Section 7.

2. Related Work

Simulation and rendering are two primary computing components in a crowd application. They are often tightly integrated as an entity to enable a special form of in situ visualization, which in general means data is rendered and displayed by a renderer in real time while a simulation is running and generating new data [17–19]. One example is the work presented by Hernandez et al. [20], which simulated a wandering crowd behavior and visualized it using animated 3D virtual characters on GPU clusters. Another example is the work presented by Perez et al. [21], which simulated and visualized crowds in a virtual city. In this section, we first briefly review some previous work contributing to crowd simulation. Then, more related to our work, we focus on acceleration techniques contributing to crowd rendering, including level-of-detail (LOD), visibility culling, and instancing techniques.

A crowd simulator uses macroscopic algorithms (e.g., continuum crowds [22], aggregate dynamics [23], vector fields [24], and navigation fields [25]) or microscopic algorithms (e.g., morphable crowds [26] and socially plausible behaviors [27]) to create crowd motions and interactions. Outcomes of the simulator are usually a successive sequence of time frames, and each frame contains arrays of positions and orientations in the 3D virtual environment. Each pair of position and orientation information defines the global

status of a character at a given time frame. McKenzie et al. [28] developed a crowd simulator to generate noncombatant civilian behaviors, which is interoperable with a simulation of modern military operations. Zhou et al. [29] classified the existing crowd modeling and simulation technologies based on the size and time scale of simulated crowds and evaluated them based on their flexibility, extensibility, execution efficiency, and scalability. Zhang et al. [30] presented a unified interaction framework on the GPU to simulate the behavior of a crowd at interactive frame rates in a fine-grained parallel fashion. Malinowski et al. [31] were able to perform large-scale simulations that resulted in tens of thousands of simulated agents.

Visualizing a large number of simulated agents using animated characters is a challenging computing task and worth an in-depth study. Beacco et al. [9] surveyed previous approaches for real-time crowd rendering. They reviewed and examined existing acceleration techniques and pointed out that LOD techniques have been used widely in order to achieve high rendering performance, where a far-away character can be represented with a coarse version of the character as the alternative for rendering. A well-known approach is using discrete LOD representations, which are a set of offline simplified versions of a mesh. At the rendering stage, the renderer selects a desired version and renders it without any additional processing cost at runtime. However, discrete LODs require too much memory for storing all simplified versions of the mesh. Also, as mentioned by Cleju and Saupe [32], discrete LODs could cause "popping" visual artifacts because of the unsmooth shape transitions between simplified versions. Dobbyn et al. [33] introduced a hybrid rendering approach that combines image-based and geometry-based rendering techniques. They evaluated the rendering quality and performance in an urban crowd simulation. In their approach, the characters in the distance were rendered with image-based LOD representations, and they were switched to geometric representations when they were within a closer distance. Although the visual quality seemed better than using discrete LOD representations, popping artifacts also occurred when the renderer switched content between image representations and geometric representations. Ulicny et al. [34] presented an authoring tool to create crowd scenes of thousands of characters. To provide users immediate visual feedback, they used low-poly meshes for source characters. The meshes were kept in OpenGL's display lists on the GPU for fast rendering.

Characters in a crowd are polygonal meshes. The mesh is rigged by a skeleton. Rotations of the skeleton's joints transform surrounding vertices, and subsequently the mesh can be deformed to create animations. While LOD techniques for simplifying general polygonal meshes have been studied maturely (e.g., progressive meshes [35] and quadric error metrics [36]), not many existing works have studied how to simplify animated characters. Landreneau and Schaefer [37] presented mesh simplification criteria to preserve deforming features of animations on simplified versions of the mesh. Their approach was developed based on quadric error metrics, and they added simplification criteria with the consideration of vertices' skinning weights from the joints of the skeleton and


the shape deviation between the character's rest pose and a deformed shape in animations. Their approach produced more accurate animations for dynamically simplified characters than many other LOD-based approaches, but it caused a higher computational cost, so it may be challenging to integrate their approach into a real-time application. Willmott [38] presented a rapid algorithm to simplify animated characters. The algorithm was developed based on the idea of vertex clustering. The author mentioned the possibility of implementing the algorithm on the GPU. However, in comparison to the algorithm with progressive meshes, it did not produce well-simplified characters to preserve fine features of character appearance. Feng et al. [39] employed triangular geometry images to preserve the features of both static and animated characters. Their approach achieved high rendering performance by implementing geometry images with multiresolutions on the GPU. In their experiment, they demonstrated a real-time rendering rate for a crowd composed of 15.3 million triangles. However, there could be a potential LOD adaptation issue if the geometry images become excessively large. Peng et al. [8] proposed a GPU-based LOD-enabled system to render crowds along with a novel texture-preserving algorithm on simplified versions of the character. They employed a continuous LOD technique to refine or reduce mesh details progressively during the runtime. However, their approach was based on the simulation of single virtual humans. Instantiating and rendering multiple types of characters were not possible in their system. Savoy et al. [40] presented a web-based crowd rendering system that employed discrete LOD and instancing techniques.

Visibility culling is another type of acceleration technique for crowd rendering. With visibility culling, a renderer is able to reject a character from the rendering pipeline if it is outside the view frustum or blocked by other characters or objects. Visibility culling techniques do not cause any loss of visual fidelity on visible characters. Tecchia et al. [41] performed efficient occlusion culling for a highly populated scene. They subdivided the virtual environmental map into a 2D grid and used it to build a KD-tree of the virtual environment. The large and static objects in the virtual environment, such as buildings, were used as occluders. Then, an occlusion tree was built at each frame and merged with the KD-tree. Barczak et al. [42] integrated GPU-accelerated View-Frustum Culling and occlusion culling techniques into a crowd rendering system. The system used a vertex shader to test whether or not the bounding sphere of a character intersects with the view frustum. A hierarchical Z-buffer image was built dynamically in a vertex shader in order to perform occlusion culling. Hernandez and Rudomin [43] combined View-Frustum Culling and LOD Selection; desired detail levels were assigned only to the characters inside the view frustum.

Instancing techniques have been commonly used for crowd rendering. Their execution is accelerated by GPUs with graphics APIs such as DirectX and OpenGL. Zelsnack [44] presented coding details of the pseudo-instancing technique using the OpenGL shading language (GLSL). The pseudo-instancing technique requires per-instance calls sent to and

executed on the GPU. Carucci [45] introduced the geometry-instancing technique, which renders all vertices and triangles of a crowd scene through a geometry shader using one call. Millan and Rudomin [14] used the pseudo-instancing technique for rendering full-detail characters which were closer to the camera. The far-away characters with low details were rendered using impostors (an image-based approach). Ashraf and Zhou [46] used a hardware-accelerated method through programmable shaders to animate crowds. Klein et al. [47] presented an approach to render configurable instances of 3D characters for the web. They improved XML3D to store 3D content in a more efficient way in order to support an instancing-based rendering mechanism. However, their approach lacked support for multiple character assets.

3. System Overview

Our crowd rendering system first preprocesses source characters and then performs runtime tasks on the GPU. Figure 1 illustrates an overview of our system. Our system integrates View-Frustum Culling and continuous LOD techniques.

At the preprocessing stage, a fine-to-coarse progressive mesh simplification algorithm is applied to every source character. In accordance with the edge-collapsing criteria [8, 35], the simplification algorithm selects edges, collapses them by merging adjacent vertices iteratively, and then removes the triangles containing the collapsed edges. The edge-collapsing operations are stored as data arrays on the GPU. Vertices and triangles can be recovered by splitting the collapsed edges and are restored with respect to the order of applying coarse-to-fine splitting operations. Vertex normal vectors are used in our system to determine a proper shading effect for the crowd. A bounding sphere is computed for each source character. It tightly encloses all vertices in all frames of the character's animation. The bounding sphere will be used during the runtime to test an instance against the view frustum. Note that bounding spheres may be of different sizes because the sizes of source characters may be different. Other data, including textures, UVs, skeletons, skinning weights, and animations, are packed into textures. They can be accessed quickly by shader programs and the general-purpose GPU programming framework during the runtime.
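An animation-wide bounding sphere of the kind described above can be sketched as follows (a hypothetical helper, not the authors' code): take the centroid of all vertex positions across all animation frames as the center, and the farthest vertex distance as the radius. This is not the minimal enclosing sphere, but it is simple and guaranteed to contain every pose.

```python
import math

def animation_bounding_sphere(frames):
    """frames: list of animation frames; each frame is a list of (x, y, z)
    vertex positions. Returns (center, radius) enclosing every vertex of
    every frame (centroid-based bound, not the minimal sphere)."""
    pts = [p for frame in frames for p in frame]
    n = len(pts)
    cx = sum(p[0] for p in pts) / n
    cy = sum(p[1] for p in pts) / n
    cz = sum(p[2] for p in pts) / n
    r = max(math.dist((cx, cy, cz), p) for p in pts)
    return (cx, cy, cz), r

# Two frames of a toy 2-vertex "character".
center, radius = animation_bounding_sphere(
    [[(1, 0, 0), (-1, 0, 0)], [(0, 2, 0), (0, -2, 0)]])
```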

The runtime pipeline of our system is executed on the GPU through five parallel processing components. We use an instance ID in shader programs to track the index of each instance, which corresponds to the occurrence of a source character at a global location and orientation in the virtual scene. A unique source character ID is assigned to each source character, which is used by an instance to index back to the source character that it is instantiated from. We assume that the desired number of instances is provided by users as a parameter in the system configuration. The global positions and orientations of instances simulated from a crowd simulator are passed into our system as input. They determine where the instances should occur in the virtual scene. The component of View-Frustum Culling determines the visibility of instances. An instance will be considered to be visible if its bounding sphere is inside or intersects with the view frustum.

[Figure 1: The overview of our system. Source character preprocessing on the CPU: vertices and triangles, vertex normals, textures and UVs, skeletons and skinning weights, and animation undergo fine-to-coarse simplification, producing coarse-to-fine reordered primitives, edge collapses, and bounding spheres, which are stored as buffer objects and textures in the GPU's global memory. The runtime pipeline on the GPU chains five parallel processing components: View-Frustum Culling, LOD Selection, LOD Mesh Generation, Animating Meshes, and Rendering, with the primitive budget and the global positions and orientations of instances as inputs.]

The component of LOD Selection determines the desired detail level of the instances. It is executed with an instance-level parallelization. A detail level is represented as the numbers of vertices and triangles selected, which are assembled from the vertex and triangle repositories. The component of LOD Mesh Generation produces LOD meshes using the selected vertices and triangles. The size of GPU memory may not be enough to store all the instances at their finest levels. Thus, a configuration parameter called the primitive budget is passed into the runtime pipeline as a global constraint to ensure the generated LOD meshes fit into the GPU memory. The component of Animating Meshes attaches skeletons and animations to the simplified versions (LOD meshes) of the instances. At the end of the pipeline, the rendering component rasterizes the LOD meshes along with appropriate textures and UVs and displays the result on the screen.
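The bounding-sphere visibility test used by the View-Frustum Culling component can be illustrated with a small CPU-side sketch (assumed plane conventions; the paper performs this test in parallel on the GPU): with inward-facing frustum planes, a sphere is visible unless it lies entirely on the outside of some plane.

```python
def sphere_in_frustum(center, radius, planes):
    """planes: list of (a, b, c, d) with inward-facing normals, so a point
    p is inside a plane when a*px + b*py + c*pz + d >= 0. The sphere is
    culled only if it is completely outside at least one plane."""
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False  # entirely outside this plane: cull the instance
    return True  # inside or intersecting the frustum: keep the instance

# Toy "frustum" made of a single plane x >= 0.
planes = [(1.0, 0.0, 0.0, 0.0)]
sphere_in_frustum((-0.5, 0.0, 0.0), 1.0, planes)  # intersects: visible
```

A conservative test like this never rejects a visible instance; spheres that merely touch the frustum are kept and resolved by later pipeline stages.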

4. Fundamentals of LOD Selection and Character Animation

During the step of preprocessing, the mesh of each source character is simplified by collapsing edges. The same as in existing work, the collapsing criteria in our approach preserve features at high-curvature regions [39] and avoid collapsing the edges on or crossing texture seams [8]. Edges are collapsed one-by-one. We utilized the same method presented in [48], which saved collapsing operations into an array structure suitable for the GPU architecture. The index of each array element represents the source vertex, and the value of the element represents the target vertex it merges to. By using the array of edge collapsing, the repositories of vertices and triangles are rearranged in an increasing order, so that at runtime the desired complexity of a mesh can be generated by selecting a successive sequence of vertices and triangles from the repositories. Then, the skeleton-based animations

are applied to deform the simplified meshes. Figure 2 shows the different levels of detail of several source characters that are used in our work. In this section, we describe the techniques of LOD Selection and character animation.
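The edge-collapse array described above can be sketched as follows (a toy illustration with hypothetical data, assuming vertices are already reordered coarse-to-fine): collapse[v] gives the vertex that v merges into, a simplified mesh keeps only the first vNum vertices, and triangle indices are remapped through the chain of collapses, dropping triangles that degenerate.

```python
def resolve(collapse, v, v_num):
    # Follow collapse targets until reaching a vertex kept at this LOD.
    while v >= v_num:
        v = collapse[v]
    return v

def simplify_triangles(triangles, collapse, v_num):
    """Remap each triangle onto the first v_num vertices; drop triangles
    that degenerate (two corners merge into the same vertex)."""
    out = []
    for tri in triangles:
        a, b, c = (resolve(collapse, v, v_num) for v in tri)
        if a != b and b != c and a != c:
            out.append((a, b, c))
    return out

# Toy mesh: 4 vertices in coarse-to-fine order; vertex 2 merges into 0,
# vertex 3 merges into 1.
collapse = [0, 1, 0, 1]
triangles = [(0, 1, 2), (1, 3, 2)]
simplify_triangles(triangles, collapse, v_num=3)  # → [(0, 1, 2)]
```

Because the repositories are sorted coarse-to-fine, choosing a prefix of length vNum is all the runtime needs to do to change an instance's complexity.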

4.1. LOD Selection. Let K denote the total number of instances in the virtual scene. A desired level of detail for an instance can be represented as the pair (vNum, tNum), where vNum is the desired number of vertices and tNum is the desired number of triangles. Given a value of vNum, the value of tNum can be retrieved from the prerecorded edge-collapsing information [48]. Thus, the goal of LOD Selection is to determine an appropriate value of vNum for each instance with considerations of the available GPU memory size and the instance's spatial relationship to the camera. If an instance is outside the view frustum, vNum is set to zero. For the instances inside the view frustum, we used the LOD Selection metric in [48] to compute vNum, as shown in

\[ vNum_i = N \frac{w_i^{1/\alpha}}{\sum_{i=1}^{K} w_i^{1/\alpha}}, \quad \text{where } w_i = \frac{\beta A_i}{D_i^{\beta}}, \; \beta = \alpha - 1. \tag{1} \]
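The budget distribution in (1) can be sketched as follows (a CPU-side illustration; the paper computes this in parallel on the GPU). Weights of culled instances are zero, and the vertex budget N is split in proportion to w_i^{1/α}; the rounding policy here is an assumption, since the integer remainder must be handled somehow.

```python
def distribute_vertex_budget(weights, n_budget, alpha=3.0):
    """Implements vNum_i = N * w_i^(1/alpha) / sum_j w_j^(1/alpha).
    weights: per-instance LOD weights (0 for instances outside the view
    frustum). Rounds each share to the nearest integer."""
    scaled = [w ** (1.0 / alpha) for w in weights]
    total = sum(scaled)
    if total == 0.0:
        return [0] * len(weights)  # everything culled
    return [round(n_budget * s / total) for s in scaled]

# Four instances, one culled (weight 0); budget of 1000 vertices.
distribute_vertex_budget([8.0, 1.0, 0.0, 1.0], 1000)  # → [500, 250, 0, 250]
```

Note how the 1/α exponent compresses the weight range: with α = 3, an instance with eight times the weight receives only twice the vertices, which keeps far-away instances from being starved of detail.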

Equation (1) is the same as the first-pass algorithm presented by Peng and Cao [48]. It originates from the model perception method presented by Funkhouser et al. [49] and was improved by Peng and Cao [48, 50] to accelerate the rendering of large CAD models. We found that (1) is also a suitable metric for large crowd rendering. In the equation, N refers to the total number of vertices that can be retained on the GPU, which is a user-specified value computed based on the available size of GPU memory. The value of N can be tuned to balance the rendering performance and visual quality. w_i is the weight computed with the projected area of the bounding sphere of the i-th instance on the screen (A_i) and the distance to the camera (D_i). α is the perception parameter introduced by Funkhouser et al. [49]. In our work, the value of α is set to 3.

[Figure 2: Examples showing different levels of detail for seven source characters. From top to bottom, the numbers of triangles are as follows: (a) Alien: 5688, 1316, 601, and 319; (b) Bug: 6090, 1926, 734, and 434; (c) Daemon: 6652, 1286, 612, and 444; (d) Nasty: 6848, 1588, 720, and 375; (e) Ripper Dog: 4974, 1448, 606, and 309; (f) Spider: 5868, 1152, 436, and 257; (g) Titan: 6518, 1581, 681, and 362.]

With vNum and tNum, the successive sequences of vertices and triangles are retrieved from the vertex and triangle repositories of the source character. By applying the parallel triangle reformation algorithm [48, 50], the desired shape of the simplified mesh is generated using the selected vertices and triangles.

4.2. Animation. In order to create character animations, each LOD mesh has to be bound to a skeleton, along with skinning weights added to influence the movement of the mesh's vertices. As a result, the mesh will be deformed by rotating joints of the skeleton. As we mentioned earlier, each vertex may be influenced by a maximum of four joints. We want to note that the vertices forming the LOD mesh are a subset of the original vertices of the source character. There is not any new vertex introduced during the preprocess of mesh simplification. Because of this, we were able to use the original skinning weights to influence the LOD mesh. When transformations defined in an animation frame are loaded on the joints, the final vertex position will be computed by summing the weighted transformations of the skinning joints. Let us denote each of the four joints influencing a vertex v as Jnt_i, where i ∈ [0, 3]. The weight of Jnt_i on the vertex v is denoted as W_{Jnt_i}. Thus, the final position of the vertex v, denoted as P′_v, can be computed by using (2):

P′_v = G · Σ_{i=0}^{3} (W_{Jnt_i} · T_{Jnt_i} · B^{-1}_{Jnt_i}) · P_v,  where Σ_{i=0}^{3} W_{Jnt_i} = 1.   (2)

In (2), P_v is the vertex position at the time when the mesh is bound to the skeleton. When the mesh is first loaded, without the use of animation data, the mesh is placed in the initial binding pose. When using an animation, the inverse of the binding pose needs to be multiplied by an animated pose. This is reflected in the equation, where B^{-1}_{Jnt_i} is the inverse of the binding transformation of the joint Jnt_i, and T_{Jnt_i} represents the transformation of the joint Jnt_i from the current frame of the animation. G is the transformation representing the instance's global position and orientation. Note that the transformations B^{-1}_{Jnt_i}, T_{Jnt_i}, and G are represented in the form of 4×4 matrices. The weight W_{Jnt_i} is a single float value, and the four weight values must sum to 1.
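Equation (2) is plain matrix arithmetic and can be checked with a short script. The sketch below is ours (not the paper's shader code); it assumes row-major 4×4 matrices, column-vector positions, and an influence list of (weight, T, B^{-1}) triples per vertex:

```python
def mat_mul(A, B):
    """Multiply two 4x4 row-major matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(A, v):
    """Apply a 4x4 matrix to a homogeneous 4-vector."""
    return [sum(A[r][k] * v[k] for k in range(4)) for r in range(4)]

def skin_vertex(p_bind, influences, g_mat):
    """Linear blend skinning: P' = G * sum_i(W_i * T_i * invB_i) * P.

    influences: list of (weight, T, invB) per skinning joint; the
    weights are expected to sum to 1.
    """
    blended = [[0.0] * 4 for _ in range(4)]     # weighted sum of matrices
    for w, t_mat, inv_bind in influences:
        m = mat_mul(t_mat, inv_bind)
        for r in range(4):
            for c in range(4):
                blended[r][c] += w * m[r][c]
    return mat_vec(g_mat, mat_vec(blended, p_bind))
```

A convenient sanity check: when the animated pose equals the binding pose (T = B), each product T · B^{-1} is the identity, so the vertex stays at its bind position.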

5. Source Character and Instance Management

Geometry-instancing and pseudo-instancing techniques are the primary solutions for rendering a large number of instances while allowing the instances to have different global transformations. The pseudo-instancing technique is used in OpenGL and calls instances' drawing functions one-by-one. The geometry-instancing technique has been included in Direct3D since version 9 and in OpenGL since version 3.3. It advances the pseudo-instancing technique in terms of reducing the number of drawing calls: it supports the use of a single drawing call for all instances of a mesh and therefore reduces the communication cost of sending call requests from the CPU to the GPU, which subsequently increases the performance. As regards data storage on the GPU, buffer objects are used for shader programs to access and update data quickly. A buffer object is a continuous memory block on the GPU that allows the renderer to rasterize data in a retained mode. In particular, a vertex buffer object (VBO) stores vertices, and an index buffer object (IBO) stores indices of the vertices that form triangles or other polygonal types used in our system.


The geometry-instancing technique requires a single copy of vertex data maintained in the VBO, a single copy of triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are lots of instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit, due to the large amount of vertices and triangles sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a VBO and an IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize the memory. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeleton, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse into the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4×4 matrix. Each joint's binding pose transformation matrix takes four RGBA texels for storage: each RGBA texel stores a row of the matrix, and each channel stores a single element of the matrix. Using OpenGL, the matrices are stored in the GL_RGBA32F texture format, which provides a 32-bit floating-point type for each channel of a texel. Let us denote the total number of joints in a skeleton as K. Then, the total number of texels to store the entire skeleton is 4K.

We used the same format for storing an animation as for storing the skeleton. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then, the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here, we want to note that each animation frame starts from a new row in the texture.

The skinning weights of each vertex are four values in the range [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four data elements are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store the skinning weights: the first texel stores the joint indices, and the second texel stores the weights.
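The packing rules above reduce to simple index arithmetic. The helpers below are our illustration of the layout (the names are ours), assuming a texture that holds `texels_per_row` RGBA texels per row:

```python
def joint_matrix_texels(joint_id):
    """A joint's 4x4 matrix occupies 4 consecutive RGBA texels
    (one matrix row per texel, one element per channel)."""
    first = 4 * joint_id
    return list(range(first, first + 4))

def animation_frame_offset(frame_id, num_joints, texels_per_row):
    """Each frame stores 4*K texels and starts on a fresh texture row,
    so the frame offset is a whole number of rows."""
    rows_per_frame = -(-4 * num_joints // texels_per_row)  # ceiling division
    return frame_id * rows_per_frame * texels_per_row

def skin_weight_texels(vertex_id):
    """Two RGBA texels per vertex: joint indices first, then weights."""
    return 2 * vertex_id, 2 * vertex_id + 1
```

For example, with K = 64 joints a skeleton occupies 256 texels, and in a 512-texel-wide texture each animation frame begins one row (512 texels) after the previous one.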

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into 2D flattened patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) → (s_i, t_i). Inner vertices (those not on boundary edges) have a one-to-one texture mapping; in other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point: F(v0) = {u1, u3}, F(v3) = {u10, u12}, F(v4) = {u0, u4, u6}, and F(v7) = {u7, u9, u13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should associate with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process to duplicate the boundary vertices and mapped each duplicated one to a unique texture point. By doing this, although the number of vertices is increased due to the cuts along boundary edges, the number of triangles stays the same as in the original mesh. In our approach, we initialized two arrays to store UV information: one array stores texture coordinates, and the other stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process to duplicate boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z). Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage



Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles. (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) used to cut the cube, and the vertices in red are boundary vertices that are mapped to multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and the v_i in parentheses is the corresponding vertex in the cube.

RebuildMesh(Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts′, Tris′, Norms′)

(1)  checked[texCoordNum] ← false
(2)  for each triangle i of oriTriNum do
(3)    for each vertex j in the ith triangle do
(4)      texPoint_id ← TexInx[3*i + j]
(5)      if checked[texPoint_id] = false then
(6)        checked[texPoint_id] ← true
(7)        vert_id ← Tris[3*i + j]
(8)        for each coordinate k of the vertex do
(9)          Verts′[3*texPoint_id + k] ← Verts[3*vert_id + k]
(10)         Norms′[3*texPoint_id + k] ← Norms[3*vert_id + k]
(11)       end for
(12)     end if
(13)   end for
(14) end for
(15) Tris′ ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts′ is identical to the number of texture points (texCoordNum), and the array of triangles (Tris′) is replaced by the array of indices of the texture points (TexInx).
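Algorithm 1 can be transliterated almost directly into Python. This sketch is ours and assumes flat arrays (three floats per vertex or normal, three indices per triangle), mirroring the pseudocode's names:

```python
def rebuild_mesh(verts, norms, tris, tex_inx, tex_coord_num):
    """Duplicate boundary vertices so each texture point owns one vertex.

    verts/norms: flat [x0, y0, z0, x1, ...] arrays of the original mesh.
    tris: flat vertex-index triples; tex_inx: flat texture-point triples
    in the same triangle order. Returns (verts', norms', tris').
    """
    verts2 = [0.0] * (3 * tex_coord_num)
    norms2 = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    for i in range(len(tris) // 3):          # each triangle
        for j in range(3):                   # each corner
            tp = tex_inx[3 * i + j]
            if not checked[tp]:
                checked[tp] = True
                v = tris[3 * i + j]
                for k in range(3):           # copy x, y, z
                    verts2[3 * tp + k] = verts[3 * v + k]
                    norms2[3 * tp + k] = norms[3 * v + k]
    # the texture-point index list becomes the new triangle list
    return verts2, norms2, list(tex_inx)
```

In a small example with two triangles sharing a cut edge, a seam vertex referenced by two texture points is simply copied into both slots, which is exactly why the rebuilt vertex count equals texCoordNum.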

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows the shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one-by-one, which is the same as the way of rendering animated instances in geometry-instancing and pseudo-instancing techniques. However, our approach needs



Figure 4: An example illustrating the data structures for storing the vertices and triangles of three instances in the VBO and IBO, respectively. The data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store the data of all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD Selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures for storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the amount of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, each vNum[i] represents the offset of the vertex count prior to the ith instance, and the number of vertices for the ith instance is (vNum[i+1] − vNum[i]).
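The prefix-sum step can be mimicked on the CPU for clarity. The sketch below is a sequential stand-in for the Thrust scan (the GPU version runs in parallel), turning per-instance vNum counts into offsets into the shared VBO:

```python
from itertools import accumulate

def exclusive_scan(counts):
    """Sequential stand-in for thrust::exclusive_scan: turns per-instance
    vertex counts into start offsets into the shared buffer object."""
    return [0] + list(accumulate(counts))[:-1]

v_num = [520, 0, 134, 301]   # per-instance vNum (0 = culled instance)
offsets = exclusive_scan(v_num)
# offsets[i] is where instance i's vertices start in the VBO; instance i
# owns offsets[i+1] - offsets[i] vertices (append the total as a sentinel).
```

The same scan applied to the tNum array yields the triangle offsets for the IBO.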

Algorithm 2 describes the vertex transformation process, executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4×4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), the vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat), computed from a weighted sum of the matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input and returns the values encoded in the texels; the Sample() function is usually provided by a shader programming framework. Different from the rendering of static models, animated characters change geometric shape over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti 6 GB graphics card. The rendering system was built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at a resolution of 2048×2048. We stored those source characters on the GPU. Also, all source characters were preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images; the mesh data requires 16.50 MB, which is only 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is only distributed across visible instances.
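The per-instance visibility test is a standard sphere-versus-frustum check. The sketch below is a textbook formulation (not the paper's code), assuming the six frustum planes are given as (a, b, c, d) coefficients with normals pointing into the frustum:

```python
def sphere_visible(center, radius, planes):
    """Conservative sphere-vs-frustum test.

    planes: iterable of (a, b, c, d) with inward-facing normals, so a
    point inside the frustum satisfies a*x + b*y + c*z + d >= 0. The
    sphere is culled only when it lies fully behind some plane.
    """
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False
    return True
```

An instance whose bounding sphere fails this test is assigned vNum = 0 and receives none of the budget N.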

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1,000 frames. The entire


GetLoc(Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1/w
(2) unit_y ← 1/h
(3) x ← dim * (id mod (w/dim)) * unit_x
(4) y ← ⌊id / (w/dim)⌋ * unit_y

TransformVertices(Input: Verts′, invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts′′)

(1)  for each vertex v_i in Verts′ in parallel do
(2)    Matrix4×4 composedMat ← zero
(3)    (unit_x, unit_y, x, y) ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)    h ← skinWeights[c_id].height
(5)    Vector4 jntInx ← Sample(skinWeights[c_id], (0*unit_x + x, y))
(6)    Vector4 weights ← Sample(skinWeights[c_id], (1*unit_x + x, y))
(7)    for each j in 4 do
(8)      Matrix4×4 invBindMat
(9)      (unit_x, unit_y, x, y) ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)     for each k in 4 do
(11)       invBindMat.row[k] ← Sample(invBindPose[c_id], (k*unit_x + x, y))
(12)     end for
(13)     (unit_x, unit_y, x, y) ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)     Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)     Matrix4×4 animMat
(16)     for each k in 4 do
(17)       animMat.row[k] ← Sample(anim[c_id], (k*unit_x + x, Offset + y))
(18)     end for
(19)     composedMat += weights[j] * animMat * invBindMat
(20)   end for
(21)   Verts′′[i] ← modelViewProjectionMat * gMat * composedMat * Verts′[i]
(22) end for

Algorithm 2: Transforming vertices of an instance in the vertex shader.
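The texel-addressing function GetLoc boils down to a row-major lookup. The Python transliteration below is ours; it assumes row-major packing with w // dim elements per texture row (dim = 4 for a joint matrix, 2 for a vertex's skinning-weight pair):

```python
def get_loc(w, h, idx, dim):
    """Map element idx to the normalized coords of its first texel.

    Each element occupies `dim` consecutive texels, and w // dim
    elements fit on one row of a w x h texture. Returns the texel
    step sizes (unit_x, unit_y) and the element's (x, y) position.
    """
    unit_x, unit_y = 1.0 / w, 1.0 / h
    per_row = w // dim
    x = dim * (idx % per_row) * unit_x
    y = (idx // per_row) * unit_y
    return unit_x, unit_y, x, y
```

A shader would then fetch the element's k-th texel at (k * unit_x + x, y), matching lines (11) and (17) of Algorithm 2.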

Table 1: The data information of the source characters used in our experiment. Note that the data represent the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Names of Source Characters   # of Vertices   # of Triangles   # of Texture UVs   # of Joints
Alien                        2846            5688             3334               64
Bug                          3047            6090             3746               44
Creepy                       2483            4962             2851               67
Daemon                       3328            6652             4087               55
Devourer                     2975            5946             3494               62
Fat                          2555            5106             2960               50
Mutant                       2265            4526             2649               46
Nasty                        3426            6848             3990               45
Pangolin                     2762            5520             3257               49
Ripper Dog                   2489            4974             2982               48
Rock                         2594            5184             3103               62
Spider                       2936            5868             3374               38
Titan                        3261            6518             3867               45
Troll                        2481            4958             2939               55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures the total number of selected vertices and triangles stays within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances were rendered at the level of full detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum; the maximum and minimum numbers of instances inside the view frustum were 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged numbers of triangles selected during the runtime. As we can see, the higher the value of N, the more triangles are selected to generate simplified instances, and subsequently the better the visual quality obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum numbers of triangles is significant. For example, when N is equal to 5 million, the ratio (average/maximum) is 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path and a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N           FPS     # of Rendered Triangles (million)   Component Execution Times (millisecond)
(million)           Min     Max     Avg                 View-Frustum Culling   LOD Mesh     Animating Meshes
                                                        + LOD Selection        Generation   + Rendering
1           47.91   1.88    9.70    5.20                0.68                   2.87         26.06
5           43.30   6.12    10.14   7.50                0.65                   3.87         26.44
10          32.82   11.64   13.78   13.04               0.65                   6.27         29.40
15          23.00   17.88   20.73   19.54               0.71                   9.45         36.43
20          18.71   23.13   27.73   26.23               0.68                   12.52        42.85


Figure 6: The change of FPS over different values of N by using our approach. The FPS is averaged over the total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio is increased to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level; thus, the execution time of this component does not change as the value of N increases. The component of LOD Mesh Generation is executed in parallel at a triangle level, and its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small; when N is small, many close-up instances end up at the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For the comparison, in our approach we ensured all instances were inside the view frustum of the camera by fixing the position of the camera and the positions of all instances, so that all instances are processed and rendered by each approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sample points is used to approximate the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of the instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach also becomes better than that of the point-based technique as the number of instances increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.

12 International Journal of Computer Games Technology


Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result. (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated from the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique. This is because the technique uses a limited number of detail levels, as in discrete LOD techniques. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during rendering.

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of the GPU's power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach is integrated seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we used only a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We would also like to explore possibilities of transplanting our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce the experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video illustrates the visual effect of the LOD algorithm. It uses a straight-line moving path that moves the main camera from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances move outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, Part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1-3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maïm, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing – ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computación y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12–14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques,


SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



the shape deviation between the character's rest pose and a deformed shape in animations. Their approach produced more accurate animations for dynamically simplified characters than many other LOD-based approaches, but it incurred a higher computational cost, so it may be challenging to integrate their approach into a real-time application. Willmott [38] presented a rapid algorithm to simplify animated characters. The algorithm was developed based on the idea of vertex clustering. The author mentioned the possibility of implementing the algorithm on the GPU. However, in comparison to algorithms based on progressive meshes, it did not produce well-simplified characters that preserve the fine features of character appearance. Feng et al. [39] employed triangular-chart geometry images to preserve the features of both static and animated characters. Their approach achieved high rendering performance by implementing geometry images with multiple resolutions on the GPU. In their experiment, they demonstrated a real-time rendering rate for a crowd composed of 15.3 million triangles. However, there could be a potential LOD adaptation issue if the geometry images become excessively large. Peng et al. [8] proposed a GPU-based LOD-enabled system to render crowds, along with a novel texture-preserving algorithm for simplified versions of the character. They employed a continuous LOD technique to refine or reduce mesh details progressively at runtime. However, their approach was based on the simulation of single virtual humans; instantiating and rendering multiple types of characters were not possible in their system. Savoy et al. [40] presented a web-based crowd rendering system that employed discrete LOD and instancing techniques.

Visibility culling is another type of acceleration technique for crowd rendering. With visibility culling, a renderer is able to reject a character from the rendering pipeline if it is outside the view frustum or blocked by other characters or objects. Visibility culling techniques do not cause any loss of visual fidelity on visible characters. Tecchia et al. [41] performed efficient occlusion culling for a highly populated scene. They subdivided the virtual environment map into a 2D grid and used it to build a KD-tree of the virtual environment. The large and static objects in the virtual environment, such as buildings, were used as occluders. Then, an occlusion tree was built at each frame and merged with the KD-tree. Barczak et al. [42] integrated GPU-accelerated View-Frustum Culling and occlusion culling techniques into a crowd rendering system. The system used a vertex shader to test whether or not the bounding sphere of a character intersects with the view frustum. A hierarchical Z-buffer image was built dynamically in a vertex shader in order to perform occlusion culling. Hernandez and Rudomin [43] combined View-Frustum Culling and LOD Selection; desired detail levels were assigned only to the characters inside the view frustum.

Instancing techniques have been commonly used for crowd rendering. Their execution is accelerated by GPUs with graphics APIs such as DirectX and OpenGL. Zelsnack [44] presented coding details of the pseudo-instancing technique using the OpenGL shading language (GLSL). The pseudo-instancing technique requires per-instance calls sent to and executed on the GPU. Carucci [45] introduced the geometry-instancing technique, which renders all vertices and triangles of a crowd scene through a geometry shader using one call. Millan and Rudomin [14] used the pseudo-instancing technique for rendering full-detail characters that were closer to the camera. The far-away characters with low details were rendered using impostors (an image-based approach). Ashraf and Zhou [46] used a hardware-accelerated method based on programmable shaders to animate crowds. Klein et al. [47] presented an approach to render configurable instances of 3D characters for the web. They improved XML3D to store 3D content in a more efficient way in order to support an instancing-based rendering mechanism. However, their approach lacked support for multiple character assets.

3. System Overview

Our crowd rendering system first preprocesses source characters and then performs runtime tasks on the GPU. Figure 1 illustrates an overview of our system. Our system integrates View-Frustum Culling and continuous LOD techniques.

At the preprocessing stage, a fine-to-coarse progressive mesh simplification algorithm is applied to every source character. In accordance with the edge-collapsing criteria [8, 35], the simplification algorithm selects edges and collapses them by merging adjacent vertices iteratively, and then removes the triangles containing the collapsed edges. The edge-collapsing operations are stored as data arrays on the GPU. Vertices and triangles can be recovered by splitting the collapsed edges and are restored with respect to the order of applying coarse-to-fine splitting operations. Vertex normal vectors are used in our system to determine a proper shading effect for the crowd. A bounding sphere is computed for each source character; it tightly encloses all vertices in all frames of the character's animation. The bounding sphere will be used during the runtime to test an instance against the view frustum. Note that bounding spheres may be of different sizes because the sizes of source characters may differ. Other data, including textures, UVs, skeletons, skinning weights, and animations, are packed into textures; they can be accessed quickly by shader programs and the general-purpose GPU programming framework during the runtime.
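As a concrete illustration of the per-character bounding sphere described above, the sketch below (Python with NumPy) builds one sphere that encloses every vertex of every animation frame. The paper does not specify how the sphere is constructed, so the box-center heuristic here is an assumption; it is not the minimal enclosing sphere, but it is guaranteed to contain all vertices:

```python
import numpy as np

def animation_bounding_sphere(frames):
    """Compute a sphere enclosing every vertex in every animation frame.

    frames: list of (V, 3) arrays, one array of vertex positions per frame.
    Uses the center of the axis-aligned bounding box over all frames as the
    sphere center, and the farthest vertex distance as the radius.
    """
    pts = np.concatenate(frames, axis=0)
    center = (pts.min(axis=0) + pts.max(axis=0)) / 2.0
    radius = np.linalg.norm(pts - center, axis=1).max()
    return center, radius
```

Since the sphere covers the whole animation, it never needs to be recomputed per frame at runtime; only the instance's global transformation is applied to it.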

The runtime pipeline of our system is executed on the GPU through five parallel processing components. We use an instance ID in shader programs to track the index of each instance, which corresponds to the occurrence of a source character at a global location and orientation in the virtual scene. A unique source character ID is assigned to each source character, which is used by an instance to index back to the source character that it is instantiated from. We assume that the desired number of instances is provided by users as a parameter in the system configuration. The global positions and orientations of the instances, simulated by a crowd simulator, are passed into our system as input; they determine where the instances should occur in the virtual scene. The component of View-Frustum Culling determines the visibility of instances. An instance is considered to be visible if its bounding sphere is inside or intersects with the


Figure 1: The overview of our system. [Diagram: source character preprocessing on the CPU applies fine-to-coarse simplification to vertices, triangles, vertex normals, textures, UVs, skeletons, skinning weights, and animation data, producing coarse-to-fine reordered primitives, bounding spheres, and edge collapses, which are stored as buffer objects and textures in the GPU's global memory. The runtime pipeline on the GPU runs the parallel processing components View-Frustum Culling, LOD Selection, LOD Mesh Generation, Animating Meshes, and Rendering, constrained by the primitive budget and driven by the global positions and orientations of instances.]

view frustum. The component of LOD Selection determines the desired detail level of the instances. It is executed with an instance-level parallelization. A detail level is represented as the numbers of selected vertices and triangles, which are assembled from the vertex and triangle repositories. The component of LOD Mesh Generation produces LOD meshes using the selected vertices and triangles. The size of GPU memory may not be enough to store all the instances at their finest levels. Thus, a configuration parameter called the primitive budget is passed into the runtime pipeline as a global constraint to ensure that the generated LOD meshes fit into the GPU memory. The component of Animating Meshes attaches skeletons and animations to the simplified versions (LOD meshes) of the instances. At the end of the pipeline, the Rendering component rasterizes the LOD meshes along with the appropriate textures and UVs and displays the result on the screen.
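The per-instance visibility test can be sketched as a sphere-versus-frustum check. The sketch below (Python with NumPy) assumes the six frustum planes are given as inward-facing unit normals n with offsets d, so that points inside the frustum satisfy n·p + d ≥ 0; an instance is culled only when its bounding sphere lies entirely behind some plane, so spheres that merely intersect the frustum are kept:

```python
import numpy as np

def sphere_visible(center, radius, planes):
    """Return True if a bounding sphere is inside or intersects the frustum.

    planes: iterable of (n, d) pairs, where n is an inward-facing unit
    normal and d its offset. The sphere is rejected only when its center
    is farther than `radius` behind any single plane.
    """
    for n, d in planes:
        if np.dot(n, center) + d < -radius:
            return False
    return True
```

In the actual system this test runs in parallel on the GPU, one thread per instance; the CPU-side function above only illustrates the predicate.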

4. Fundamentals of LOD Selection and Character Animation

During the step of preprocessing, the mesh of each source character is simplified by collapsing edges. As in existing work, the collapsing criteria in our approach preserve features at high-curvature regions [39] and avoid collapsing the edges on or crossing texture seams [8]. Edges are collapsed one-by-one. We utilized the method presented in [48], which saves collapsing operations into an array structure suitable for the GPU architecture. The index of each array element represents the source vertex, and the value of the element represents the target vertex it merges to. By using the array of edge collapsing, the repositories of vertices and triangles are rearranged in an increasing order, so that at runtime the desired complexity of a mesh can be generated by selecting a successive sequence of vertices and triangles from the repositories. Then, the skeleton-based animations are applied to deform the simplified meshes. Figure 2 shows the different levels of detail of several source characters that are used in our work. In this section, we brief the techniques of LOD Selection and character animation.
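The collapse array described above (element index = source vertex, element value = merge target) can be used at runtime roughly as follows. This sketch assumes, as in [48], that vertices are reordered coarse-to-fine so every collapse target has a smaller index than its source; a triangle survives at a given vertex count v_num only if its three remapped vertices remain distinct:

```python
def remap_vertex(v, collapse_to, v_num):
    # Follow the collapse chain until the vertex index falls within the
    # first v_num entries of the coarse-to-fine vertex ordering.
    # Terminates because collapse_to[v] < v by construction.
    while v >= v_num:
        v = collapse_to[v]
    return v

def lod_triangles(triangles, collapse_to, v_num):
    """Build the triangle list of an LOD mesh that uses only the first
    v_num vertices of the reordered vertex repository."""
    out = []
    for tri in triangles:
        a, b, c = (remap_vertex(v, collapse_to, v_num) for v in tri)
        if a != b and b != c and a != c:  # drop collapsed (degenerate) triangles
            out.append((a, b, c))
    return out
```

On the GPU this remapping runs one thread per triangle, which is the triangle-level parallelization used by the LOD Mesh Generation component.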

4.1. LOD Selection. Let us denote K as the total number of instances in the virtual scene. A desired level of detail for an instance can be represented as the pair (vNum, tNum), where vNum is the desired number of vertices and tNum is the desired number of triangles. Given a value of vNum, the value of tNum can be retrieved from the prerecorded edge-collapsing information [48]. Thus, the goal of LOD Selection is to determine an appropriate value of vNum for each instance, with consideration of the available GPU memory size and the instance's spatial relationship to the camera. If an instance is outside the view frustum, vNum is set to zero. For the instances inside the view frustum, we used the LOD Selection metric in [48] to compute vNum, as shown in

vNum_i = N · w_i^(1/α) / Σ_{i=1}^{K} w_i^(1/α),  where  w_i = (A_i / (D_i · P_i^β))^(1/β)  and  β = α − 1.  (1)

Equation (1) is the same as the first-pass algorithm presented by Peng and Cao [48]. It originates from the model perception method presented by Funkhouser et al. [49] and was improved by Peng and Cao [48, 50] to accelerate the rendering of large CAD models. We found that (1) is also a suitable metric for large crowd rendering. In the equation, N refers to the total number of vertices that can be retained on the GPU, which is a user-specified value computed based on the available size of GPU memory. The value of N can be tuned to balance the rendering performance and visual quality. w_i is the weight computed with the projected area of the bounding sphere of the i-th instance on the screen (A_i) and


Figure 2: Examples showing different levels of detail for seven source characters. From top to bottom, the numbers of triangles are as follows: (a) Alien: 5688, 1316, 601, and 319; (b) Bug: 6090, 1926, 734, and 434; (c) Daemon: 6652, 1286, 612, and 444; (d) Nasty: 6848, 1588, 720, and 375; (e) Ripper Dog: 4974, 1448, 606, and 309; (f) Spider: 5868, 1152, 436, and 257; (g) Titan: 6518, 1581, 681, and 362.

the distance to the camera (D_i). α is the perception parameter introduced by Funkhouser et al. [49]; in our work, the value of α is set to 3.

With vNum and tNum, the successive sequences of vertices and triangles are retrieved from the vertex and triangle repositories of the source character. By applying the parallel triangle reformation algorithm [48, 50], the desired shape of the simplified mesh is generated using the selected vertices and triangles.
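The budget-distribution idea behind the LOD Selection metric can be sketched as follows. Note that the weight used here (w = A/D, projected area over distance) is a simplified stand-in rather than the paper's exact formula; only the normalization by w^(1/α) and the proportional split of the global vertex budget N follow the text:

```python
import numpy as np

def lod_select(N, area, dist, alpha=3.0):
    """Distribute a global vertex budget N across K visible instances.

    area: projected screen areas A_i of the instances' bounding spheres.
    dist: distances D_i to the camera.
    Each instance receives a share of N proportional to w_i^(1/alpha),
    so nearer/larger instances get more vertices, while the total
    allocation always sums to N.
    """
    w = np.asarray(area, dtype=float) / np.asarray(dist, dtype=float)
    share = w ** (1.0 / alpha)
    return N * share / share.sum()
```

Because the shares are normalized, the allocation adapts automatically as instances move: when a close-up instance leaves the frustum, its share is redistributed to the remaining visible instances.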

4.2. Animation. In order to create character animations, each LOD mesh has to be bound to a skeleton, along with skinning weights added to influence the movement of the mesh's vertices. As a result, the mesh will be deformed by rotating the joints of the skeleton. As we mentioned earlier, each vertex may be influenced by a maximum of four joints. We want to note that the vertices forming the LOD mesh are a subset of the original vertices of the source character; no new vertex is introduced during the preprocess of mesh simplification. Because of this, we were able to use the original skinning weights to influence the LOD mesh. When the transformations defined in an animation frame are loaded onto the joints, the final vertex position is computed by summing the weighted transformations of the skinning joints. Let us denote each of the four joints influencing a vertex v as Jnt_i, where i ∈ [0, 3]. The weight of Jnt_i on the vertex v is denoted as W_{Jnt_i}. Thus, the final position of the vertex v, denoted as P'_v, can be computed by using

P'_v = G · Σ_{i=0}^{3} (W_{Jnt_i} · T_{Jnt_i} · B^{-1}_{Jnt_i}) · P_v,  where  Σ_{i=0}^{3} W_{Jnt_i} = 1.  (2)

In (2), P_v is the vertex position at the time when the mesh is bound to the skeleton. When the mesh is first loaded, without the use of animation data, the mesh is placed in the initial binding pose. When using an animation, the inverse of the binding pose needs to be multiplied by an animated pose. This is reflected in the equation, where B^{-1}_{Jnt_i} is the inverse of the binding transformation of the joint Jnt_i, and T_{Jnt_i} represents the transformation of the joint Jnt_i from the current frame of the animation. G is the transformation representing the instance's global position and orientation. Note that the transformations B^{-1}_{Jnt_i}, T_{Jnt_i}, and G are represented in the form of a 4×4 matrix. The weight W_{Jnt_i} is a single float value, and the four weight values must sum to 1.
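Equation (2) amounts to linear blend skinning, which can be sketched directly (Python with NumPy; the matrix names follow the equation):

```python
import numpy as np

def skin_vertex(p, G, joints):
    """Linear blend skinning of one vertex, per equation (2).

    p      : vertex position in the binding pose, shape (3,).
    G      : 4x4 world transform of the instance.
    joints : list of up to four (weight, T, B) tuples, where T is the
             joint's current 4x4 animation transform and B its 4x4
             binding transform; the weights must sum to 1.
    """
    ph = np.append(p, 1.0)                     # homogeneous coordinates
    blended = np.zeros(4)
    for w, T, B in joints:
        # W * T * B^-1 * P_v, accumulated over the influencing joints
        blended += w * (T @ np.linalg.inv(B) @ ph)
    return (G @ blended)[:3]
```

In the actual system this computation runs per vertex in a shader; fixing the influence count at four lets the weights and joint indices be packed into a single texture fetch per vertex.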

5. Source Character and Instance Management

Geometry-instancing and pseudo-instancing techniques are the primary solutions for rendering a large number of instances while allowing the instances to have different global transformations. The pseudo-instancing technique is used in OpenGL and calls instances' drawing functions one by one. The geometry-instancing technique has been included in Direct3D since version 9 and in OpenGL since version 3.3. It advances the pseudo-instancing technique in terms of reducing the number of drawing calls: it supports the use of a single drawing call for all instances of a mesh, and therefore reduces the communication cost of sending call requests from the CPU to the GPU, which subsequently increases performance. As regards data storage on the GPU, buffer objects are used by shader programs to access and update data quickly. A buffer object is a continuous memory block on the GPU that allows the renderer to rasterize data in a retained mode. In particular, a vertex buffer object (VBO) stores vertices, and an index buffer object (IBO) stores the indices of vertices that form triangles or other polygonal types used in our system.

6 International Journal of Computer Games Technology

In particular, the geometry-instancing technique requires a single copy of vertex data maintained in the VBO, a single copy of triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are many instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit, due to the large amount of vertices and triangles sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a single VBO and IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize the memory. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeleton, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse in the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4 × 4 matrix. Each joint's binding-pose transformation matrix takes four RGBA texels for storage: each RGBA texel stores a row of the matrix, and each channel stores a single element of the matrix. Using OpenGL, matrices are stored in the texture in the GL_RGBA32F format, which provides a 32-bit floating-point type for each channel in one texel. Let us denote the total number of joints in a skeleton as K. Then the total number of texels to store the entire skeleton is 4K.

We used the same format to store an animation. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here we want to note that each animation frame starts from a new row in the texture.

The skinning weights of each vertex are four values in the range [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four elements are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store the skinning weights: the first texel stores the joint indices, and the second texel stores the weights.
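The texel layouts described above can be sketched on the CPU as follows. This is a minimal illustration of the packing scheme (the function names, and the choice of a NumPy array standing in for a GL_RGBA32F texture, are ours, not the paper's code):

```python
import numpy as np

def pack_skeleton(inv_bind_mats, width):
    """Pack K inverse-binding-pose 4x4 matrices into an RGBA float texture.

    Each matrix occupies four consecutive RGBA texels: one texel per matrix
    row, one channel per element, so a skeleton of K joints needs 4K texels.
    (An animation would reuse this layout, with each of its Q frames
    starting on a new texture row, for 4KQ texels in total.)
    """
    K = len(inv_bind_mats)
    n_texels = 4 * K
    height = (n_texels + width - 1) // width
    tex = np.zeros((height, width, 4), dtype=np.float32)
    for j, m in enumerate(inv_bind_mats):
        for r in range(4):                 # one matrix row per texel
            t = 4 * j + r                  # linear texel index
            tex[t // width, t % width] = m[r]
    return tex

def pack_skin_weights(joint_ids, weights, width):
    """Two RGBA texels per vertex: texel 0 holds the four joint indices,
    texel 1 holds the four corresponding weights."""
    V = len(joint_ids)
    height = (2 * V + width - 1) // width
    tex = np.zeros((height, width, 4), dtype=np.float32)
    for v in range(V):
        t = 2 * v
        tex[t // width, t % width] = joint_ids[v]
        tex[(t + 1) // width, (t + 1) % width] = weights[v]
    return tex
```

A shader would then reconstruct a joint matrix by fetching four consecutive texels and a vertex's skinning data by fetching two.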

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into flattened 2D patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) → (s_i, t_i). Inner vertices (those not on boundary edges) have a one-to-one texture mapping; in other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point: F(v0) = {u1, u3}, F(v3) = {u10, u12}, F(v4) = {u0, u4, u6}, and F(v7) = {u7, u9, u13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should associate with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process to duplicate boundary vertices and mapped each duplicated one to a unique texture point. By doing this, although the number of vertices increases due to the cuts along boundary edges, the number of triangles remains the same as in the original mesh. In our approach, we initialized two arrays to store UV information: one array stores the texture coordinates; the other stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process that duplicates boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z), and Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage



Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 faces; (b) is the unfolded texture map formed by 14 pairs of texture coordinates with the same face topology. Bold lines are the boundary edges (seams) used to cut the cube, and the vertices in red are boundary vertices that are mapped to multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and the v_i in parentheses is the corresponding vertex of the cube.

RebuildMesh(Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts′, Tris′, Norms′)

(1) checked[texCoordNum] ← false
(2) for each triangle i of oriTriNum do
(3)   for each vertex j in the i-th triangle do
(4)     texPoint_id ← TexInx[3 * i + j]
(5)     if checked[texPoint_id] = false then
(6)       checked[texPoint_id] ← true
(7)       vert_id ← Tris[3 * i + j]
(8)       for each coordinate k of the vertex do
(9)         Verts′[3 * texPoint_id + k] ← Verts[3 * vert_id + k]
(10)        Norms′[3 * texPoint_id + k] ← Norms[3 * vert_id + k]
(11)      end for
(12)    end if
(13)  end for
(14) end for
(15) Tris′ ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts′ is identical to the number of texture points (texCoordNum), and the array of triangles (Tris′) is replaced by the array of indices of the texture points (TexInx).
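A runnable version of Algorithm 1's rebuilding loop might look as follows (a sketch under the paper's flat-array conventions; the Python function name is ours):

```python
def rebuild_mesh(verts, tris, norms, tex_inx, tex_coord_num):
    """UV-guided rebuild (Algorithm 1): give every texture point its own
    vertex, duplicating vertices on seam (boundary) edges.

    All arrays are flat lists: verts/norms hold xyz triples, while
    tris/tex_inx hold three indices per triangle.
    """
    new_verts = [0.0] * (3 * tex_coord_num)
    new_norms = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    tri_num = len(tris) // 3
    for i in range(tri_num):
        for j in range(3):
            tp = tex_inx[3 * i + j]          # texture-point id
            if not checked[tp]:
                checked[tp] = True
                v = tris[3 * i + j]          # original vertex id
                for k in range(3):           # copy position and normal
                    new_verts[3 * tp + k] = verts[3 * v + k]
                    new_norms[3 * tp + k] = norms[3 * v + k]
    # Tris' <- TexInx: triangles now index the rebuilt vertex array
    return new_verts, new_norms, list(tex_inx)
```

For a quad of two triangles cut completely apart in UV space (6 texture points for 4 vertices), the shared vertices are duplicated so that the rebuilt vertex count equals the texture-point count.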

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one by one, which is the same way animated instances are rendered in the geometry-instancing and pseudo-instancing techniques. However, our approach needs



Figure 4: An example illustrating the data structures for storing the vertices and triangles of three instances in the VBO and IBO, respectively. The data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store the data of all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures of the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the amount of vertices and triangles selected under the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, each vNum[i] represents the offset of the vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i + 1] − vNum[i]).
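The prefix-sum step above can be sketched as follows, with itertools.accumulate standing in for Thrust's parallel scan (the exclusive-scan-with-total layout is our illustrative reading of the offsets described in the text):

```python
from itertools import accumulate

def instance_offsets(v_num):
    """Exclusive prefix sum over per-instance vertex counts: offsets[i] is
    where instance i's vertices start in the shared VBO, and
    offsets[i + 1] - offsets[i] is its vertex count."""
    return [0] + list(accumulate(v_num))

offs = instance_offsets([520, 300, 410])
# offs == [0, 520, 820, 1230]: instance 1 starts at 520 and owns 300 vertices
```

The same scan applied to tNum yields the per-instance triangle offsets in the IBO.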

Algorithm 2 describes the vertex transformation process performed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding-pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 × 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), the vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat) computed from a weighted sum of the matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input and returns the values encoded in the texels. The Sample() function is usually provided by a shader programming framework. Different from the rendering of static models, animated characters change geometric shape over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.
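A CPU-side reconstruction of GetLoc() might look as follows. It is a sketch: the use of modulo and floor division to locate an element's column and row is our interpretation of the algorithm, since each element occupies dim consecutive texels and w/dim elements fit per texture row:

```python
def get_loc(w, h, idx, dim):
    """Map an element index to normalized texture coordinates.

    w, h : texture width and height in texels
    idx  : index of the vertex or joint to locate
    dim  : texels per element (2 for skinning weights, 4 for a matrix)

    Returns the texel step sizes (unit_x, unit_y) and the normalized
    (x, y) coordinate of the element's first texel.
    """
    unit_x = 1.0 / w
    unit_y = 1.0 / h
    per_row = w // dim               # elements that fit on one row
    x = dim * (idx % per_row) * unit_x
    y = (idx // per_row) * unit_y
    return unit_x, unit_y, x, y
```

A shader would then fetch the element's dim texels at (x + k * unit_x, y) for k = 0 .. dim − 1, exactly as Algorithm 2 does with Sample().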

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti graphics card with 6 GB of memory. The rendering system is built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at a resolution of 2048 × 2048. We stored those source characters on the GPU. All source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images: the mesh data requires 16.50 MB of memory, which is only 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is only distributed across visible instances.
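The visibility step can be illustrated with a standard bounding-sphere versus frustum-plane test. This is a hedged sketch: the paper does not show its culling code, and the plane representation below (inward-pointing unit normals) is our assumption:

```python
def sphere_in_frustum(center, radius, planes):
    """Conservative bounding-sphere vs. view-frustum test.

    planes : sequence of (a, b, c, d) with inward-pointing unit normals,
             so a point p is inside a plane when a*px + b*py + c*pz + d >= 0.
    The sphere is culled as soon as its center is farther than `radius`
    outside any single plane; spheres that straddle a plane are kept.
    """
    for a, b, c, d in planes:
        if a * center[0] + b * center[1] + c * center[2] + d < -radius:
            return False
    return True
```

Only instances passing this test receive a share of the budget N; invisible instances contribute no vertices or triangles to the VBO and IBO.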

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1,000 frames. The entire


GetLoc(Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1/w
(2) unit_y ← 1/h
(3) x ← dim * (id mod (w/dim)) * unit_x
(4) y ← ⌊id/(w/dim)⌋ * unit_y

TransformVertices(Input: Verts′, invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts″)

(1) for each vertex v_i in Verts′ in parallel do
(2)   Matrix4×4 composedMat ← identity
(3)   (unit_x, unit_y, x, y) ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)   h ← skinWeights[c_id].height
(5)   Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x, y))
(6)   Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x, y))
(7)   for each j in 4 do
(8)     Matrix4×4 invBindMat
(9)     (unit_x, unit_y, x, y) ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)    for each k in 4 do
(11)      invBindMat.row[k] ← Sample(invBindPose[c_id], (k * unit_x + x, y))
(12)    end for
(13)    (unit_x, unit_y, x, y) ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)    Offset ← f_id * ⌈totalJntNum/(anim[c_id].width/4)⌉ * unit_y
(15)    Matrix4×4 animMat
(16)    for each k in 4 do
(17)      animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x, Offset + y))
(18)    end for
(19)    composedMat += weights[j] * animMat * invBindMat
(20)  end for
(21)  Verts″[i] ← modelViewProjectionMat * gMat * composedMat * Verts′[i]
(22) end for

Algorithm 2: Transforming the vertices of an instance in the vertex shader.

Table 1: The data information of the source characters used in our experiment. Note that the numbers describe the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Name of Source Character | # of Vertices | # of Triangles | # of Texture UVs | # of Joints
Alien      | 2846 | 5688 | 3334 | 64
Bug        | 3047 | 6090 | 3746 | 44
Creepy     | 2483 | 4962 | 2851 | 67
Daemon     | 3328 | 6652 | 4087 | 55
Devourer   | 2975 | 5946 | 3494 | 62
Fat        | 2555 | 5106 | 2960 | 50
Mutant     | 2265 | 4526 | 2649 | 46
Nasty      | 3426 | 6848 | 3990 | 45
Pangolin   | 2762 | 5520 | 3257 | 49
Ripper Dog | 2489 | 4974 | 2982 | 48
Rock       | 2594 | 5184 | 3103 | 62
Spider     | 2936 | 5868 | 3374 | 38
Titan      | 3261 | 6518 | 3867 | 45
Troll      | 2481 | 4958 | 2939 | 55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frame from the main camera; the bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures that the total number of selected vertices and triangles stays within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances were rendered at the level of full detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The

maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged numbers of triangles selected during the runtime. As we can see, the higher the value of N, the more triangles are selected to generate simplified instances, and subsequently the better the visual quality obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum number of triangles is significant. For example, when N is equal to 5 million, the ratio of average to maximum is 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including the instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path and a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N (million) | FPS | # of Rendered Triangles (million): Min / Max / Avg | View-Frustum Culling + LOD Selection (ms) | LOD Mesh Generation (ms) | Animating Meshes + Rendering (ms)
1  | 47.91 | 1.88 / 9.70 / 5.20    | 0.68 | 2.87  | 26.06
5  | 43.30 | 6.12 / 10.14 / 7.50   | 0.65 | 3.87  | 26.44
10 | 32.82 | 11.64 / 13.78 / 13.04 | 0.65 | 6.27  | 29.40
15 | 23.00 | 17.88 / 20.73 / 19.54 | 0.71 | 9.45  | 36.43
20 | 18.71 | 23.13 / 27.73 / 26.23 | 0.68 | 12.52 | 42.85


Figure 6: The change of FPS over different values of N using our approach. The FPS is averaged over the total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy a desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to above 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level; thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at a triangle level; its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small: when N is small, many close-up instances drop down to the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger, because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For the sake of comparison, in our approach we ensured that all instances were inside the view frustum of the camera by fixing the position of the camera and the positions of all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sampled points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of the instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.



Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1,000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of the rendering result of our approach with N = 5 million; (b) shows the captured image of the rendering result of the pseudo-instancing technique; (c) shows the captured image of the rendering result of the point-based technique. (d), (e), and (f) are the captured images zoomed in on the area of instances that are far away from the camera.


The image generated by the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because that technique uses a limited number of detail levels obtained from discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during rendering.

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we only used a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We would also like to explore possibilities of transplanting our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that was used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically as the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that lets the main camera move from one corner of the crowd to another corner at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more detail as the main camera moves towards the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489-498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022-1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1-13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1-3, pp. 21-74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445-455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44-53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27-38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32-50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9-14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39-54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497-512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35-44, 2007.

[15] H. Nguyen, "Chapter 2: animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197-202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14-19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119-11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45-57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651-664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160-1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37-46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244-254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1-140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146-157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10-38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1-35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341-2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czurylo, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917-926, 2017, International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41-44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95-102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27-36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209-216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347-353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369-383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47-67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM'07, pp. 226-237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71-79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173-183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247-254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393-402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359-371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35-44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949-961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1-10, 2018.



[Figure 1 diagram: source character preprocessing on the CPU takes vertices/triangles, vertex normals, textures/UVs, skeletons/skinning weights, and animation through a fine-to-coarse simplification (edge collapses), producing coarse-to-fine reordered primitives and bounding spheres; these are stored as buffer objects and textures in the GPU's global memory. The runtime pipeline on the GPU runs the data-parallel processing components View-Frustum Culling, LOD Selection, LOD Mesh Generation, Animating Meshes, and Rendering, constrained by a primitive budget and the global positions and orientations of instances.]

Figure 1: The overview of our system.

view frustum. The component of LOD Selection determines the desired detail level of the instances. It is executed with an instance-level parallelization. A detail level is represented as the numbers of selected vertices and triangles, which are assembled from the vertex and triangle repositories. The component of LOD Mesh Generation produces LOD meshes using the selected vertices and triangles. The size of GPU memory may not be enough to store all the instances at their finest levels. Thus, a configuration parameter called the primitive budget is passed into the runtime pipeline as a global constraint to ensure that the generated LOD meshes fit into the GPU memory. The component of Animating Meshes attaches skeletons and animations to the simplified versions (LOD meshes) of the instances. At the end of the pipeline, the rendering component rasterizes the LOD meshes along with appropriate textures and UVs and displays the result on the screen.
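The stage ordering above can be sketched as a toy, CPU-side walkthrough. This is only an illustration of the data flow (the paper runs these stages as parallel GPU components); the even split of the primitive budget is a simplification we substitute for the weighted metric of Equation (1), and all names are ours.

```python
# Toy walkthrough of the runtime pipeline order: View-Frustum Culling ->
# LOD Selection -> LOD Mesh Generation / Animating / Rendering.
# The even budget split is a placeholder for the paper's weighted LOD metric.

def run_pipeline(instances, primitive_budget):
    # View-Frustum Culling: keep only instances flagged as visible.
    visible = [inst for inst in instances if inst["in_frustum"]]
    # LOD Selection: split the primitive budget among visible instances.
    share = primitive_budget // max(len(visible), 1)
    for inst in visible:
        inst["lod_tris"] = min(inst["full_tris"], share)
    # LOD Mesh Generation, Animating Meshes, and Rendering would then
    # consume inst["lod_tris"] triangles per instance.
    return {inst["name"]: inst["lod_tris"] for inst in visible}

instances = [
    {"name": "alien", "full_tris": 5688, "in_frustum": True},
    {"name": "bug", "full_tris": 6090, "in_frustum": False},
    {"name": "titan", "full_tris": 6518, "in_frustum": True},
]
print(run_pipeline(instances, 8000))
```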

4. Fundamentals of LOD Selection and Character Animation

During the step of preprocessing, the mesh of each source character is simplified by collapsing edges. As in existing work, the collapsing criteria in our approach preserve features at high-curvature regions [39] and avoid collapsing the edges on or crossing texture seams [8]. Edges are collapsed one by one. We utilized the method presented in [48], which saves collapsing operations into an array structure suitable for the GPU architecture. The index of each array element represents the source vertex, and the value of the element represents the target vertex it merges into. By using the array of edge collapses, the repositories of vertices and triangles are rearranged in an increasing order, so that at runtime the desired complexity of a mesh can be generated by selecting a successive sequence of vertices and triangles from the repositories. Then, the skeleton-based animations are applied to deform the simplified meshes. Figure 2 shows the different levels of detail of several source characters that are used in our work. In this section, we brief the techniques of LOD Selection and character animation.
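The edge-collapse array idea can be sketched as follows. This is an illustrative CPU version under our own naming, not the authors' GPU code: vertices are assumed to be reordered coarse-to-fine, so choosing the first `v_num` vertices defines a detail level, and any triangle corner above that index is redirected through the collapse array until it lands on a retained vertex.

```python
# Sketch of prefix-based LOD from a reordered vertex repository plus an
# edge-collapse array: collapse_map[v] is the (lower-index) vertex that v
# merges into. Triangles whose corners become identical are dropped.

def select_lod(collapse_map, triangles, v_num):
    """Re-target triangles so that only the first v_num vertices are used."""
    def resolve(v):
        while v >= v_num:          # redirect until the vertex is retained
            v = collapse_map[v]
        return v

    lod_tris = []
    for (a, b, c) in triangles:
        a, b, c = resolve(a), resolve(b), resolve(c)
        if a != b and b != c and a != c:   # drop degenerate triangles
            lod_tris.append((a, b, c))
    return lod_tris

# Tiny example: 5 vertices; vertices 3 and 4 collapse onto 1 and 2.
collapse_map = [0, 1, 2, 1, 2]
triangles = [(0, 1, 2), (1, 3, 4), (2, 3, 4)]
print(select_lod(collapse_map, triangles, 3))
```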

4.1. LOD Selection. Let us denote K as the total number of instances in the virtual scene. A desired level of detail for an instance can be represented as the pair (vNum, tNum), where vNum is the desired number of vertices and tNum is the desired number of triangles. Given a value of vNum, the value of tNum can be retrieved from the prerecorded edge-collapsing information [48]. Thus, the goal of LOD Selection is to determine an appropriate value of vNum for each instance with consideration of the available GPU memory size and the instance's spatial relationship to the camera. If an instance is outside the view frustum, vNum is set to zero. For the instances inside the view frustum, we used the LOD Selection metric in [48] to compute vNum, as shown in

vNum_i = (N * w_i^(1/alpha)) / (sum_{i=1}^{K} w_i^(1/alpha)),  where w_i = (beta * A_i * D_i) / P_i^beta and beta = alpha - 1.  (1)

Equation (1) is the same as the first-pass algorithm presented by Peng and Cao [48]. It originates from the model perception method presented by Funkhouser et al. [49] and was improved by Peng and Cao [48, 50] to accelerate the rendering of large CAD models. We found that (1) is also a suitable metric for large crowd rendering. In the equation, N refers to the total number of vertices that can be retained on the GPU, which is a user-specified value computed based on the available size of GPU memory. The value of N can be tuned to balance the rendering performance and visual quality. w_i is the weight computed with the projected area of the bounding sphere of the ith instance on the screen (A_i) and



Figure 2: Examples showing different levels of detail for seven source characters. From top to bottom, the numbers of triangles are as follows: (a) Alien: 5688, 1316, 601, and 319; (b) Bug: 6090, 1926, 734, and 434; (c) Daemon: 6652, 1286, 612, and 444; (d) Nasty: 6848, 1588, 720, and 375; (e) Ripper Dog: 4974, 1448, 606, and 309; (f) Spider: 5868, 1152, 436, and 257; (g) Titan: 6518, 1581, 681, and 362.

the distance to the camera (D_i). alpha is the perception parameter introduced by Funkhouser et al. [49]. In our work, the value of alpha is set to 3.

With vNum and tNum, the successive sequences of vertices and triangles are retrieved from the vertex and triangle repositories of the source character. By applying the parallel triangle reformation algorithm [48, 50], the desired shape of the simplified mesh is generated using the selected vertices and triangles.
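The budget-distribution step of the metric can be sketched as below. We take the per-instance weights w_i as given (computing them from A_i and D_i is left out, since that formula involves parameters specific to the paper), and simply show how a vertex budget N is shared in proportion to w_i^(1/alpha); names are ours.

```python
# Illustrative budget distribution for LOD Selection: each visible instance
# receives a share of the N-vertex budget proportional to w_i^(1/alpha).
# A non-positive weight marks an instance outside the view frustum (vNum = 0).

def lod_select(weights, N, alpha=3.0):
    powered = [w ** (1.0 / alpha) if w > 0 else 0.0 for w in weights]
    total = sum(powered)
    if total == 0:
        return [0] * len(weights)
    return [int(N * p / total) for p in powered]

# Four instances: three equal weights share the budget; a culled one gets 0.
print(lod_select([1.0, 1.0, 1.0, 0.0], N=9000))
```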

4.2. Animation. In order to create character animations, each LOD mesh has to be bound to a skeleton, along with skinning weights added to influence the movement of the mesh's vertices. As a result, the mesh will be deformed by rotating joints of the skeleton. As we mentioned earlier, each vertex may be influenced by a maximum of four joints. We want to note that the vertices forming the LOD mesh are a subset of the original vertices of the source character. There is no new vertex introduced during the preprocess of mesh simplification. Because of this, we were able to use the original skinning weights to influence the LOD mesh. When the transformations defined in an animation frame are loaded on the joints, the final vertex position is computed by summing the weighted transformations of the skinning joints. Let us denote each of the four joints influencing a vertex v as Jnt_i, where i is in [0, 3]. The weight of Jnt_i on the vertex v is denoted as W_{Jnt_i}. Thus, the final position of the vertex v, denoted as P'_v, can be computed by using

P'_v = G * sum_{i=0}^{3} W_{Jnt_i} T_{Jnt_i} B^{-1}_{Jnt_i} * P_v,  where sum_{i=0}^{3} W_{Jnt_i} = 1.  (2)

In (2), P_v is the vertex position at the time when the mesh is bound to the skeleton. When the mesh is first loaded, without the use of animation data, the mesh is placed in the initial binding pose. When using an animation, the inverse of the binding pose needs to be multiplied by an animated pose. This is reflected in the equation, where B^{-1}_{Jnt_i} is the inverse of the binding transformation of the joint Jnt_i, and T_{Jnt_i} represents the transformation of the joint Jnt_i from the current frame of the animation. G is the transformation representing the instance's global position and orientation. Note that the transformations B^{-1}_{Jnt_i}, T_{Jnt_i}, and G are represented in the form of 4 x 4 matrices. The weight W_{Jnt_i} is a single float value, and the four weight values must sum to 1.

5. Source Character and Instance Management

Geometry-instancing and pseudo-instancing techniques are the primary solutions for rendering a large number of instances while allowing the instances to have different global transformations. The pseudo-instancing technique is used in OpenGL and calls instances' drawing functions one by one. The geometry-instancing technique is included in Direct3D since version 9 and in OpenGL since version 3.3. It advances the pseudo-instancing technique by reducing the number of drawing calls: it supports the use of a single drawing call for all instances of a mesh, and therefore reduces the communication cost of sending call requests from the CPU to the GPU, which subsequently increases the performance. As regards data storage on the GPU, buffer objects are used for shader programs to access and update data quickly. A buffer object is a continuous memory block on the GPU that allows the renderer to rasterize data in a retained mode. In particular, a vertex buffer object (VBO) stores vertices, and an index buffer object (IBO) stores indices of vertices that form triangles or other polygonal types used in our system.


The geometry-instancing technique requires a single copy of vertex data maintained in the VBO, a single copy of triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are many instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit due to the large amount of vertices and triangles sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a single VBO and IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize it. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.
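The hybrid idea (one shared buffer, per-instance addressability) can be sketched with a simple offset table. This is our own illustration of the bookkeeping, not the paper's implementation: all instances' vertices live in one VBO-like array, and per-instance (offset, count) records let each instance still be drawn or updated individually.

```python
# Hypothetical sketch of static-batching-style storage with per-instance
# access: pack all instances into one shared buffer and record where each
# instance's vertex range starts, so individual instances remain addressable.

def pack_instances(instance_vertex_counts):
    """Return ([(offset, count), ...], total_vertices) for a shared buffer."""
    records, offset = [], 0
    for count in instance_vertex_counts:
        records.append((offset, count))
        offset += count
    return records, offset

records, total = pack_instances([100, 250, 75])
print(records, total)
```

With such records, a draw call for instance i would cover the vertex range `[offset, offset + count)` of the single shared buffer.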

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process, because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeleton, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse into the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4 x 4 matrix. Each joint's binding pose transformation matrix takes four RGBA texels for storage: each RGBA texel stores a row of the matrix, and each channel stores a single element of the matrix. Using OpenGL, matrices are stored in the texture with the format GL_RGBA32F, which provides a 32-bit floating-point type for each channel of a texel. Let us denote the total number of joints in a skeleton as K. Then, the total number of texels to store the entire skeleton is 4K.

We used the same format as for the skeleton to store an animation. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then, the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here, we want to note that each animation frame starts from a new row in the texture.

The skinning weights of each vertex are four values in the range [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store the skinning weights: the first texel stores the joint indices, and the second texel stores the weights.
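The texel arithmetic above can be summarized in a small sketch. The function and variable names are ours; the offset helper assumes frames are packed one after another with 4 texels per joint and ignores any row-alignment padding implied by "each animation frame starts from a new row."

```python
# Texel bookkeeping from the text: one 4x4 joint matrix = 4 RGBA texels,
# so a K-joint skeleton takes 4K texels, a Q-frame animation takes 4KQ
# texels, and each vertex needs 2 texels (joint-indices texel + weights texel).

def skeleton_texels(num_joints):
    return 4 * num_joints

def animation_texels(num_joints, num_frames):
    return 4 * num_joints * num_frames

def skinning_texels(num_vertices):
    return 2 * num_vertices

def joint_texel_offset(frame, joint, num_joints):
    """First texel of a joint's matrix within a densely packed animation
    texture (row-alignment padding not modeled here)."""
    return frame * skeleton_texels(num_joints) + 4 * joint

# Example: a 60-joint skeleton with a 30-frame animation.
print(skeleton_texels(60), animation_texels(60, 30), joint_texel_offset(1, 0, 60))
```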

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into flattened 2D patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) -> (s_i, t_i). Inner vertices (those not on boundary edges) have a one-to-one texture mapping; in other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point: F(v0) = {u1, u3}, F(v3) = {u10, u12}, F(v4) = {u0, u4, u6}, and F(v7) = {u7, u9, u13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should associate with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process that duplicates boundary vertices and maps each duplicated vertex to a unique texture point. By doing this, although the number of vertices is increased due to the cuts along boundary edges, the number of triangles stays the same as in the original mesh. In our approach, we initialized two arrays to store UV information: one array stores texture coordinates; the other stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process of duplicating boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z); Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage

International Journal of Computer Games Technology 7


Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles. (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) to cut the cube, and the vertices in red are boundary vertices that are mapped into multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and v_i in the parenthesis is the corresponding vertex in the cube.

RebuildMesh(Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts', Tris', Norms')

(1) checked[texCoordNum] ← false
(2) for each triangle i of oriTriNum do
(3)   for each vertex j in the i-th triangle do
(4)     texPoint_id ← TexInx[3 * i + j]
(5)     if checked[texPoint_id] = false then
(6)       checked[texPoint_id] ← true
(7)       vert_id ← Tris[3 * i + j]
(8)       for each coordinate k of the vertex do
(9)         Verts'[3 * texPoint_id + k] ← Verts[3 * vert_id + k]
(10)        Norms'[3 * texPoint_id + k] ← Norms[3 * vert_id + k]
(11)      end for
(12)    end if
(13)  end for
(14) end for
(15) Tris' ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors. oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts' will be identical to the number of texture points, texCoordNum, and the array of triangles (Tris') is replaced by the array of indices of the texture points (TexInx).
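For reference, the duplication pass of Algorithm 1 can be expressed as a short serial routine. The Python sketch below uses flat arrays as in the paper; the function and variable names are ours:

```python
# A serial Python sketch of Algorithm 1 (UV-guided mesh rebuilding).
# verts/norms hold (x, y, z) triples flattened; tris and tex_inx hold
# three indices per triangle, in the same triangle order.
def rebuild_mesh(verts, norms, tris, tex_inx, tex_coord_num):
    new_verts = [0.0] * (3 * tex_coord_num)
    new_norms = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    tri_num = len(tris) // 3
    for i in range(tri_num):                # each triangle
        for j in range(3):                  # each corner of the triangle
            tex_point_id = tex_inx[3 * i + j]
            if not checked[tex_point_id]:
                checked[tex_point_id] = True
                vert_id = tris[3 * i + j]
                for k in range(3):          # copy x, y, z (and the normal)
                    new_verts[3 * tex_point_id + k] = verts[3 * vert_id + k]
                    new_norms[3 * tex_point_id + k] = norms[3 * vert_id + k]
    # the rebuilt triangle list is exactly the texture-point index list
    return new_verts, new_norms, list(tex_inx)
```

A seam vertex referenced by two texture points is simply copied twice, so the rebuilt vertex count equals the texture-point count while the triangle count is unchanged.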

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and read-only in shader programs on

the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one-by-one, which is the same as the way of rendering animated instances in geometry-instancing and pseudo-instancing techniques. However, our approach needs



Figure 4: An example illustrating the data structures for storing vertices and triangles of three instances in the VBO and IBO, respectively. Those data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store data for all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures for storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the amount of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, for example, each vNum[i] represents the offset of vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i + 1] − vNum[i]).
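This offset computation is an exclusive prefix sum. A serial Python stand-in for the parallel scan (the instance counts below are made up for illustration):

```python
from itertools import accumulate

def exclusive_scan(counts):
    """Turn per-instance counts into offsets, with the grand total at the end.

    offsets[i] is the number of elements stored before instance i, so
    instance i owns the index range [offsets[i], offsets[i + 1]).
    """
    return [0] + list(accumulate(counts))

# e.g. three instances for which LOD selection chose 520, 310, and 125 vertices
v_num = exclusive_scan([520, 310, 125])
# instance 1 starts at offset v_num[1] and owns v_num[2] - v_num[1] vertices
```

On the GPU the same result would come from a parallel scan such as Thrust's `exclusive_scan`; the serial version above just makes the offset convention explicit.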

Algorithm 2 describes the vertex transformation process executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 × 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat) computed from a weighted sum of matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input. It returns the values encoded in texels. The Sample() function is usually provided in a shader programming framework. Different from the rendering of static models, animated characters change geometric shapes over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation. f_id is updated in the main loop during the execution of the program.
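A plain-Python reading of GetLoc may make the texel addressing concrete. Each element occupies dim consecutive texels of a row-major w × h texture, so a row holds w/dim elements; the mod/div split of the index is our reconstruction of that layout:

```python
def get_loc(w, h, idx, dim):
    """Map an element index to normalized texture coordinates.

    Each element (e.g. a joint matrix with dim = 4 texels, or a vertex's
    skinning data with dim = 2 texels) occupies dim consecutive texels, so
    one texture row holds w // dim elements.
    """
    unit_x = 1.0 / w
    unit_y = 1.0 / h
    x = dim * (idx % (w // dim)) * unit_x   # column of the element's first texel
    y = (idx // (w // dim)) * unit_y        # row holding the element
    return unit_x, unit_y, x, y
```

With these coordinates, the j-th texel of the element sits at (j * unit_x + x, y), which is exactly how the Sample() calls in Algorithm 2 step across an element.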

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti 6 GB graphics card. The rendering system is built with Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton and fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set, along with a texture image at the resolution of 2048 × 2048. We stored those source characters on the GPU. Also, all source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of mesh data is much smaller than the size of texture images. The mesh data requires 16.50 MB of memory for storage, which is only 8.94% of the total memory consumed. At initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is only distributed across visible instances.
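The paper does not spell out the culling test itself; a conventional bounding-sphere versus view-frustum check of the kind described, with the frustum given as six inward-facing planes, might look like this (all names are ours):

```python
# A minimal sketch of bounding-sphere view-frustum culling.
# Each plane is (a, b, c, d) with the normal (a, b, c) normalized and
# pointing inward, so a*x + b*y + c*z + d is the signed distance to the plane.
def sphere_in_frustum(center, radius, planes):
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False  # entirely behind one plane: cull the instance
    return True  # inside or intersecting the frustum

def visible_instances(instances, planes):
    """Only the instances that pass the test receive a share of the budget N."""
    return [inst for inst in instances
            if sphere_in_frustum(inst["center"], inst["radius"], planes)]
```

In the actual system this test runs per instance in parallel on the GPU; the serial version only illustrates the plane-distance criterion.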

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1,000 frames. The entire


GetLoc(Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1 / w
(2) unit_y ← 1 / h
(3) x ← dim * (id mod (w / dim)) * unit_x
(4) y ← (id div (w / dim)) * unit_y

TransformVertices(Input: Verts', invPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts'')

(1) for each vertex v_i in Verts' in parallel do
(2)   Matrix4x4 composedMat ← identity
(3)   unit_x, unit_y, x, y ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)   h ← skinWeights[c_id].height
(5)   Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)   Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)   for each j in 4 do
(8)     Matrix4x4 invBindMat
(9)     unit_x, unit_y, x, y ← GetLoc(invPose[c_id].width, invPose[c_id].height, jntInx[j], 4)
(10)    for each k in 4 do
(11)      invBindMat.row[k] ← Sample(invPose[c_id], (k * unit_x + x), y)
(12)    end for
(13)    unit_x, unit_y, x, y ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)    Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)    Matrix4x4 animMat
(16)    for each k in 4 do
(17)      animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)    end for
(19)    composedMat += weights[j] * animMat * invBindMat
(20)  end for
(21)  Verts''[i] ← modelViewProjectionMat * gMat * composedMat * Verts'[i]
(22) end for

Algorithm 2: Transforming vertices of an instance in the vertex shader.
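The inner loop of Algorithm 2, lines (7)-(19) plus the final transform in line (21), is standard linear blend skinning. A serial NumPy sketch, with the texture fetches replaced by plain arrays and the accumulator zero-initialized (the usual convention; all names are ours):

```python
import numpy as np

def skin_vertex(vertex, jnt_inx, weights, anim_mats, inv_bind_mats, g_mat, mvp):
    """Compose a skinning matrix from up to 4 joints and transform one vertex.

    vertex        : homogeneous position, shape (4,)
    jnt_inx       : indices of the 4 joints influencing the vertex
    weights       : 4 matching skinning weights (summing to 1)
    anim_mats     : per-joint pose matrices of the current animation frame
    inv_bind_mats : per-joint inverse binding-pose matrices
    """
    composed = np.zeros((4, 4))
    for j in range(4):  # weighted sum over the influencing joints
        composed += weights[j] * anim_mats[jnt_inx[j]] @ inv_bind_mats[jnt_inx[j]]
    # mirrors step (21): model-view-projection, instance transform, skinning
    return mvp @ g_mat @ composed @ vertex
```

In the shader the same composition runs once per vertex in parallel, with anim_mats and inv_bind_mats fetched texel-by-texel via GetLoc() and Sample().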

Table 1: The data information of the source characters used in our experiment. Note that the data information represents the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Name of Source Character   # of Vertices   # of Triangles   # of Texture UVs   # of Joints
Alien                           2846            5688             3334              64
Bug                             3047            6090             3746              44
Creepy                          2483            4962             2851              67
Daemon                          3328            6652             4087              55
Devourer                        2975            5946             3494              62
Fat                             2555            5106             2960              50
Mutant                          2265            4526             2649              46
Nasty                           3426            6848             3990              45
Pangolin                        2762            5520             3257              49
Ripper Dog                      2489            4974             2982              48
Rock                            2594            5184             3103              62
Spider                          2936            5868             3374              38
Titan                           3261            6518             3867              45
Troll                           2481            4958             2939              55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures the total number of selected vertices and triangles is within the specified memory budget, while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances are rendered at the level of full detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The

maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged number of triangles selected during the runtime. As we can see, the higher the value of N is, the more triangles are selected to generate simplified instances, and subsequently, the better visual quality is obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value, such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum number of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including instances' distance to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path with total 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N (million)   FPS     # of Rendered Triangles (million)   Component Execution Times (ms)
                      Min      Max      Avg               View-Frustum Culling + LOD Selection   LOD Mesh Generation   Animating Meshes + Rendering
1             47.91   1.88     9.70     5.20              0.68                                   2.87                  26.06
5             43.30   6.12     10.14    7.50              0.65                                   3.87                  26.44
10            32.82   11.64    13.78    13.04             0.65                                   6.27                  29.40
15            23.00   17.88    20.73    19.54             0.71                                   9.45                  36.43
20            18.71   23.13    27.73    26.23             0.68                                   12.52                 42.85


Figure 6: The change of FPS over different values of N by using our approach. The FPS is averaged over the total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy a desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio is increased to about 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level. Thus, the execution time of this component does not change as the value of N increases. The component of LOD Mesh Generation is executed in parallel at a triangle level. Its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small; when N is small, many close-up instances end up at the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach

in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost in rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For the sake of comparison, in our approach we ensured all instances were inside the view frustum of the camera by fixing the position of the camera and the positions of all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method. When an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sample points is used to approximate the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by using the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the number of instances increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.



Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1,000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million. (b) shows the captured image of the pseudo-instancing technique's rendering result. (c) shows the captured image of the point-based technique's rendering result. (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated from the pseudo-instancing technique represents the original quality. Our approach can achieve better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, the popping artifact appears when using the point-based technique. This is because the technique uses a limited number of detail levels from the technique of discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during the rendering.

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach is integrated seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we only used a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We also would like to explore possibilities to transplant our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and the additional computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental result of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances move outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, Part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1-3, pp. 21–74, 2002.

[6] J. Pettré, P. D. H. Ciechomski, J. Maïm, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maïm, B. Yersin, J. Pettré, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouillé, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computación y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popović, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Séquin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.


Figure 2: Examples showing different levels of detail for the seven source characters. From top to bottom, the numbers of triangles are as follows: (a) Alien: 5688, 1316, 601, and 319; (b) Bug: 6090, 1926, 734, and 434; (c) Daemon: 6652, 1286, 612, and 444; (d) Nasty: 6848, 1588, 720, and 375; (e) Ripper Dog: 4974, 1448, 606, and 309; (f) Spider: 5868, 1152, 436, and 257; (g) Titan: 6518, 1581, 681, and 362.

the distance to the camera (D_i). α is the perception parameter introduced by Funkhouser et al. [49]. In our work, the value of α is set to 3.

With vNum and tNum, the successive sequences of vertices and triangles are retrieved from the vertex and triangle repositories of the source character. By applying the parallel triangle reformation algorithm [48, 50], the desired shape of the simplified mesh is generated using the selected vertices and triangles.
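The distribution of the vertex budget N across visible instances can be sketched as follows. This is an illustrative Python sketch, not the exact benefit formula of [49]: it simply weights each visible instance by 1/D_i^α with α = 3 as described above, and the function name is ours.

```python
def distribute_budget(N, distances, alpha=3):
    """Split a global vertex budget N across visible instances,
    weighting each instance by 1 / D_i^alpha so that instances
    closer to the camera receive more vertices (alpha = 3)."""
    weights = [1.0 / (d ** alpha) for d in distances]
    total = sum(weights)
    return [int(N * w / total) for w in weights]

# Closer instances (smaller D_i) receive larger shares of the budget.
shares = distribute_budget(10000, [1.0, 2.0, 4.0])
```

The integer truncation means a few budgeted vertices may go unused; a real implementation could redistribute the remainder.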

4.2. Animation. In order to create character animations, each LOD mesh has to be bound to a skeleton, along with skinning weights added to influence the movement of the mesh's vertices. As a result, the mesh is deformed by rotating the joints of the skeleton. As we mentioned earlier, each vertex may be influenced by a maximum of four joints. We want to note that the vertices forming the LOD mesh are a subset of the original vertices of the source character; no new vertex is introduced during the mesh-simplification preprocess. Because of this, we were able to use the original skinning weights to influence the LOD mesh. When the transformations defined in an animation frame are loaded onto the joints, the final vertex position is computed by summing the weighted transformations of the skinning joints. Let us denote each of the four joints influencing a vertex v as Jnt_i, where i ∈ [0, 3]. The weight of Jnt_i on the vertex v is denoted as W_{Jnt_i}. Thus, the final position of the vertex v, denoted as P'_v, can be computed by using

    P'_v = G * sum_{i=0}^{3} (W_{Jnt_i} * T_{Jnt_i} * B_{Jnt_i}^{-1}) * P_v,   where   sum_{i=0}^{3} W_{Jnt_i} = 1.   (2)

In (2), P_v is the vertex position at the time when the mesh is bound to the skeleton. When the mesh is first loaded, without the use of animation data, the mesh is placed in the initial binding pose. When using an animation, the inverse of the binding pose needs to be multiplied by an animated pose. This is reflected in the equation, where B_{Jnt_i}^{-1} is the inverse of the binding transformation of the joint Jnt_i, and T_{Jnt_i} represents the transformation of the joint Jnt_i from the current frame of the animation. G is the transformation representing the instance's global position and orientation. Note that the transformations B_{Jnt_i}^{-1}, T_{Jnt_i}, and G are represented in the form of 4×4 matrices. The weight W_{Jnt_i} is a single float value, and the four weight values must sum to 1.
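Equation (2) is standard linear blend skinning. A minimal pure-Python sketch (the helper and function names are ours, not from the paper) is:

```python
def mat_mul(A, B):
    # 4x4 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(A, v):
    # 4x4 matrix times a 4-component column vector
    return [sum(A[i][k] * v[k] for k in range(4)) for i in range(4)]

def skin_vertex(P_v, G, joints, weights):
    """Compute P'_v = G * sum_i W_i * T_i * B_i^{-1} * P_v, where the
    weights of the (up to four) influencing joints sum to 1.
    joints is a list of (T_i, B_inv_i) pairs of 4x4 matrices."""
    assert abs(sum(weights) - 1.0) < 1e-6
    blended = [[0.0] * 4 for _ in range(4)]
    for (T, B_inv), w in zip(joints, weights):
        M = mat_mul(T, B_inv)
        for i in range(4):
            for j in range(4):
                blended[i][j] += w * M[i][j]
    return mat_vec(G, mat_vec(blended, P_v))

I4 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# With identity joint transforms, the vertex stays at its binding position.
p = skin_vertex([1.0, 2.0, 3.0, 1.0], I4, [(I4, I4)], [1.0])
```

On the GPU this blend runs per vertex in the vertex shader, as Algorithm 2 shows later.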

5. Source Character and Instance Management

Geometry-instancing and pseudo-instancing techniques are the primary solutions for rendering a large number of instances while allowing the instances to have different global transformations. The pseudo-instancing technique is used in OpenGL and calls instances' drawing functions one by one. The geometry-instancing technique has been included in Direct3D since version 9 and in OpenGL since version 3.3. It advances the pseudo-instancing technique in terms of reducing the number of drawing calls: it supports the use of a single drawing call for all instances of a mesh and therefore reduces the communication cost of sending call requests from the CPU to the GPU, which subsequently increases performance. As regards data storage on the GPU, buffer objects are used for shader programs to access and update data quickly. A buffer object is a continuous memory block on the GPU and allows the renderer to rasterize data in a retained mode. In particular, a vertex buffer object (VBO) stores vertices, and an index buffer object (IBO) stores the indices of vertices that form triangles or other polygonal types used in our system.

6 International Journal of Computer Games Technology

The geometry-instancing technique requires a single copy of the vertex data maintained in the VBO, a single copy of the triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are many instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit, due to the large number of vertices and triangles sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a single VBO and IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize it. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.
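The static-batching idea can be sketched as follows, assuming per-instance meshes are given as (vertices, triangle-indices) pairs; the names and data layout are illustrative, not the paper's implementation:

```python
def build_static_batch(meshes):
    """Concatenate per-instance meshes into one shared vertex list (the
    VBO) and one index list (the IBO), offsetting each instance's
    indices by the number of vertices already batched."""
    vbo, ibo = [], []
    for verts, tris in meshes:
        base = len(vbo)          # index offset for this instance
        vbo.extend(verts)
        ibo.extend(base + i for i in tris)
    return vbo, ibo

# Two copies of the same quad batched into a single VBO/IBO pair.
quad = ([(0, 0), (1, 0), (1, 1), (0, 1)], [0, 1, 2, 0, 2, 3])
vbo, ibo = build_static_batch([quad, quad])
```

The second quad's indices land at an offset of 4, so both instances can be drawn from the same pair of buffers with one call.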

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeleton, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse in the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4×4 matrix. Each joint's binding pose transformation matrix takes four RGBA texels for storage: each RGBA texel stores a row of the matrix, and each channel stores a single element of the matrix. Using OpenGL, matrices are stored in the texture in the GL_RGBA32F format, which is a 32-bit floating-point type for each channel in one texel. Let us denote the total number of joints in a skeleton as K. Then the total number of texels to store the entire skeleton is 4K.

We used the same format that stores the skeleton to store an animation. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here, we want to note that each animation frame starts from a new row in the texture.
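The texel bookkeeping above can be sketched as follows. The helper names are ours, and the ceiling-division row layout is our reading of "each animation frame starts from a new row":

```python
def anim_texel_count(K, Q):
    """A skeleton with K joints needs 4K RGBA32F texels per frame
    (four texels per 4x4 matrix), so an animation with Q frames
    needs 4KQ texels in total."""
    return 4 * K * Q

def frame_row_offset(f_id, K, tex_width):
    """Row index where frame f_id starts, given that each frame begins
    on a new row and occupies ceil(4K / tex_width) rows."""
    rows_per_frame = -(-4 * K // tex_width)  # ceiling division
    return f_id * rows_per_frame
```

For example, with K = 64 joints and a 128-texel-wide texture, each frame occupies two rows, so frame 2 starts at row 4.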

The skinning weights of each vertex are four values in the range [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four elements are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store its skinning weights: the first texel stores the joint indices, and the second texel stores the weights.
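A sketch of this two-texel-per-vertex layout, with texels represented as 4-tuples standing in for RGBA values (the helper is hypothetical, for illustration only):

```python
def pack_skin_weights(joint_ids, weights):
    """Pack one vertex's skinning data into two RGBA texels:
    texel 0 holds the four joint indices (as floats, matching the
    GL_RGBA32F storage), texel 1 holds the four weights."""
    assert len(joint_ids) == 4 and len(weights) == 4
    assert abs(sum(weights) - 1.0) < 1e-6   # weights must sum to 1
    return [tuple(float(j) for j in joint_ids), tuple(weights)]

# A vertex influenced by joints 3 and 7; unused slots carry zero weight.
texels = pack_skin_weights([3, 7, 0, 0], [0.6, 0.4, 0.0, 0.0])
```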

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into 2D flattened patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) → (s_i, t_i). Inner vertices (those not on boundary edges) have a one-to-one texture mapping; in other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point: F(v_0) = {u_1, u_3}, F(v_3) = {u_10, u_12}, F(v_4) = {u_0, u_4, u_6}, and F(v_7) = {u_7, u_9, u_13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should be associated with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process to duplicate boundary vertices and mapped each duplicated one to a unique texture point. By doing this, although the number of vertices is increased due to the cuttings on boundary edges, the number of triangles remains the same as in the original mesh. In our approach, we initialized two arrays to store UV information: one array stores texture coordinates; the other stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process of duplicating boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z), and Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage

International Journal of Computer Games Technology 7

Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles; (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) used to cut the cube, and the vertices in red are boundary vertices that are mapped to multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and the v_i in parentheses is the corresponding vertex in the cube.

RebuildMesh(
  Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum
  Output: Verts', Tris', Norms')

(1)  checked[texCoordNum] ← false
(2)  for each triangle i of oriTriNum do
(3)    for each vertex j in the i-th triangle do
(4)      texPoint_id ← TexInx[3*i + j]
(5)      if checked[texPoint_id] = false then
(6)        checked[texPoint_id] ← true
(7)        vert_id ← Tris[3*i + j]
(8)        for each coordinate k of the vertex do
(9)          Verts'[3*texPoint_id + k] ← Verts[3*vert_id + k]
(10)         Norms'[3*texPoint_id + k] ← Norms[3*vert_id + k]
(11)       end for
(12)     end if
(13)   end for
(14) end for
(15) Tris' ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts' is identical to the number of texture points, texCoordNum, and the array of triangles (Tris') is replaced by the array of indices of the texture points (TexInx).
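Algorithm 1 can be expressed as runnable Python as follows, as a CPU sketch of the preprocess; the flat-array layout matches the algorithm, and the function name is ours:

```python
def rebuild_mesh(verts, norms, tris, tex_inx, tex_coord_num):
    """Duplicate boundary vertices so that each texture point owns
    exactly one vertex.  verts/norms are flat xyz arrays; tris and
    tex_inx give, per triangle corner, the vertex index and the
    texture-point index.  Returns (Verts', Norms', Tris')."""
    verts2 = [0.0] * (3 * tex_coord_num)
    norms2 = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    for corner, tp in enumerate(tex_inx):
        if not checked[tp]:
            checked[tp] = True
            v = tris[corner]                 # source vertex for this texture point
            for k in range(3):
                verts2[3 * tp + k] = verts[3 * v + k]
                norms2[3 * tp + k] = norms[3 * v + k]
    return verts2, norms2, list(tex_inx)    # Tris' <- TexInx

# Two triangles sharing vertex 0, which lies on a seam and therefore
# owns two texture points (0 and 4); tex_coord_num = 5.
verts = [0.0, 0.0, 0.0,  1.0, 0.0, 0.0,  1.0, 1.0, 0.0,  0.0, 1.0, 0.0]
norms = [0.0, 0.0, 1.0] * 4
tris = [0, 1, 2,  0, 2, 3]
tex_inx = [0, 1, 2,  4, 2, 3]
v2, n2, t2 = rebuild_mesh(verts, norms, tris, tex_inx, 5)
```

After the rebuild, texture point 4 carries a duplicate of vertex 0's position, and the triangle array is simply the texture-point index array.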

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one by one, which is the same way animated instances are rendered in the geometry-instancing and pseudo-instancing techniques. However, our approach needs


Figure 4: An example illustrating the data structures for storing the vertices and triangles of three instances in the VBO and IBO, respectively. The data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store data for all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures for storing the VBO and IBO on the GPU. Based on the result of LOD selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the numbers of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, for example, each vNum[i] represents the offset of the vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i+1] − vNum[i]).
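The Thrust scan step can be mimicked on the CPU with an exclusive prefix sum; this is an illustrative sketch, whereas in the actual system the scan runs in parallel on the GPU:

```python
from itertools import accumulate

def instance_offsets(counts):
    """Exclusive prefix sum over per-instance vertex counts:
    offsets[i] is where instance i's vertices start in the shared VBO,
    and offsets[i+1] - offsets[i] recovers its vertex count."""
    return [0] + list(accumulate(counts))

# Three instances with 100, 250, and 80 selected vertices.
offs = instance_offsets([100, 250, 80])
# offs = [0, 100, 350, 430]
```

The same scan applied to tNum yields the per-instance triangle offsets into the IBO.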

Algorithm 2 describes the vertex transformation process, executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4×4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), the vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat) computed from a weighted sum of the matrices of the skinning joints. The function Sample() takes a texture and coordinates in texture space as input and returns the values encoded in the texels; a Sample() function is usually provided by a shader programming framework. Different from the rendering of static models, animated characters change their geometric shapes over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.
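The addressing performed by GetLoc() can be sketched in Python as follows. The mod/div split of the index is our reconstruction of the layout (elements packed dim texels wide, row-major), so treat the exact formula as an assumption:

```python
def get_loc(w, h, idx, dim):
    """Map element idx (a vertex or joint) to normalized texture
    coordinates, where each element spans `dim` horizontally adjacent
    texels in a w x h texture."""
    unit_x, unit_y = 1.0 / w, 1.0 / h
    per_row = w // dim                  # elements per texture row
    x = dim * (idx % per_row) * unit_x  # column of the element's first texel
    y = (idx // per_row) * unit_y       # row of the element
    return unit_x, unit_y, x, y

# In an 8x4 texture with dim = 4, element 3 is the second element of
# the second row: x = 0.5, y = 0.25.
loc = get_loc(8, 4, 3, 4)
```

A shader would then fetch the element's k-th texel at (x + k * unit_x, y).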

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50GHz CPU, 64GB of RAM, and an Nvidia GeForce GTX 980 Ti 6GB graphics card. The rendering system is built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters; Table 1 shows their data configurations. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at a resolution of 2048 × 2048. We stored those source characters on the GPU. All source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images: the mesh data requires 16.50MB of memory, which is only 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is distributed only across visible instances.

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1000 frames. The entire


GetLoc(
  Input: w, h, id, dim
  Output: unit_x, unit_y, x, y)

(1) unit_x ← 1/w
(2) unit_y ← 1/h
(3) x ← dim * (id mod (w/dim)) * unit_x
(4) y ← (id / (w/dim)) * unit_y

TransformVertices(
  Input: Verts', invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id
  Output: Verts'')

(1)  for each vertex v_i in Verts' in parallel do
(2)    Matrix4×4 composedMat ← identity
(3)    unit_x, unit_y, x, y ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)    h ← skinWeights[c_id].height
(5)    Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)    Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)    for each j in 4 do
(8)      Matrix4×4 invBindMat
(9)      unit_x, unit_y, x, y ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)     for each k in 4 do
(11)       invBindMat.row[k] ← Sample(invBindPose[c_id], (k * unit_x + x), y)
(12)     end for
(13)     unit_x, unit_y, x, y ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)     Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)     Matrix4×4 animMat
(16)     for each k in 4 do
(17)       animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)     end for
(19)     composedMat += weights[j] * animMat * invBindMat
(20)   end for
(21)   Verts''[i] ← modelViewProjectionMat * gMat * composedMat * Verts'[i]
(22) end for

Algorithm 2 Transforming vertices of an instance in vertex shader
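For readers who prefer running code over pseudocode, the core of Algorithm 2, blending each vertex by up to four joints via composedMat = Σ_j weights[j] · animMat[j] · invBindMat[j], can be sketched on the CPU as follows. This is a minimal illustrative sketch with plain-Python matrices; in the actual system these matrices are sampled from textures inside the vertex shader.

```python
# CPU-side sketch of the per-vertex skinning in Algorithm 2 (illustrative only).
# Each vertex is transformed by a weighted blend of up to four joints:
#   composedMat = sum_j weights[j] * animMat[j] * invBindMat[j]

def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_scale_add(acc, s, m):
    """Return acc + s * m, elementwise, for 4x4 matrices."""
    return [[acc[r][c] + s * m[r][c] for c in range(4)] for r in range(4)]

def mat_vec(m, v):
    """Apply a 4x4 matrix to a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def skin_vertex(v, joint_indices, weights, anim_mats, inv_bind_mats):
    """Blend the joint transforms (lines (7)-(20) of Algorithm 2) and apply to v."""
    composed = [[0.0] * 4 for _ in range(4)]
    for j in range(4):
        jnt = joint_indices[j]
        composed = mat_scale_add(composed, weights[j],
                                 mat_mul(anim_mats[jnt], inv_bind_mats[jnt]))
    return mat_vec(composed, v)
```

The model-view-projection and per-instance world transforms of line (21) would then be applied to the returned position.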

Table 1: The data information of the source characters used in our experiment. Note that the data information represents the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Name of Source Character   # of Vertices   # of Triangles   # of Texture UVs   # of Joints
Alien                      2846            5688             3334               64
Bug                        3047            6090             3746               44
Creepy                     2483            4962             2851               67
Daemon                     3328            6652             4087               55
Devourer                   2975            5946             3494               62
Fat                        2555            5106             2960               50
Mutant                     2265            4526             2649               46
Nasty                      3426            6848             3990               45
Pangolin                   2762            5520             3257               49
Ripper Dog                 2489            4974             2982               48
Rock                       2594            5184             3103               62
Spider                     2936            5868             3374               38
Titan                      3261            6518             3867               45
Troll                      2481            4958             2939               55

10 International Journal of Computer Games Technology


Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures that the total number of selected vertices and triangles stays within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances are rendered at the full level of detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged numbers of triangles selected during the runtime. As we can see, the higher the value of N is, the more triangles are selected to generate simplified instances and, subsequently, the better the visual quality obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum numbers of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path with a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

              # of Rendered Triangles      Component Execution Times (millisecond)
N (million)   (million)                    View-Frustum Culling   LOD Mesh      Animating Meshes
       FPS    Min     Max     Avg          + LOD Selection        Generation    + Rendering
1    47.91    1.88    9.70    5.20         0.68                   2.87          26.06
5    43.30    6.12    10.14   7.50         0.65                   3.87          26.44
10   32.82    11.64   13.78   13.04        0.65                   6.27          29.40
15   23.00    17.88   20.73   19.54        0.71                   9.45          36.43
20   18.71    23.13   27.73   26.23        0.68                   12.52         42.85


Figure 6: The change of FPS over different values of N by using our approach. The FPS is averaged over the total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy a desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to above 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level. Thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at a triangle level. Its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small. When N is small, many close-up instances drop to the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.
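To make the budget behavior concrete, here is a toy sketch of distributing a global triangle budget N over visible instances. The inverse-distance weighting below is a hypothetical stand-in for the paper's actual LOD Selection metric; only the clamping to full detail and the hard global budget reflect the behavior described above.

```python
def select_lod_counts(full_counts, distances, budget):
    """Distribute a global triangle budget over visible instances,
    weighting nearer instances more heavily (hypothetical scheme;
    the paper's actual LOD Selection metric may differ)."""
    weights = [1.0 / max(d, 1.0) for d in distances]
    total = sum(weights)
    counts = []
    for full, w in zip(full_counts, weights):
        share = int(budget * w / total)   # this instance's cut of the budget
        counts.append(min(full, share))   # never exceed full detail
    return counts
```

With a small budget, close-up instances receive most of the triangles and far-away ones are cut down hard; with a very large budget, every instance saturates at its full triangle count, matching the flattening of the FPS curve beyond N = 17 million.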

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For the sake of comparison, in our approach we ensured that all instances are inside the view frustum of the camera by fixing the position of the camera and the positions of all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method. When an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sampled points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.



Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1,000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million. (b) shows the captured image of the pseudo-instancing technique's rendering result. (c) shows the captured image of the point-based technique's rendering result. (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated by the pseudo-instancing technique represents the original quality. Our approach can achieve better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique. This is because the technique uses a limited number of detail levels from the technique of discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during the rendering.

7 Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we only used a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We would also like to explore possibilities to transplant our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental result of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically as the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video illustrates the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: Optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1-3, pp. 21–74, 2002.

[6] J. Pettré, P. D. H. Ciechomski, J. Maïm, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maïm, B. Yersin, J. Pettré, and D. Thalmann, "YaQ: An architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouillé, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computación y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popović, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: A real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Séquin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



In particular, the geometry-instancing technique requires a single copy of vertex data maintained in the VBO, a single copy of triangle data maintained in the IBO, and a single copy of the distinct world transformations of all instances. However, if the source character has a high geometric complexity and there are many instances, the geometry-instancing technique may make the uniform data type in shaders hit its size limit due to the large amount of vertex and triangle data sent to the GPU. In such a case, the drawing call has to be broken into multiple calls.
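The fallback to multiple draw calls can be sketched as a simple batching loop. The fixed per-call capacity below stands in for whatever uniform-array limit the target GPU imposes; the function name and limit are illustrative, not from the paper:

```python
def batch_draw_calls(num_instances, max_mats_per_call):
    """Split one instanced draw into several calls when the per-instance
    world matrices must fit into a fixed-size uniform array.
    Returns (first_instance, instance_count) pairs, one per draw call."""
    calls = []
    start = 0
    while start < num_instances:
        count = min(max_mats_per_call, num_instances - start)
        calls.append((start, count))
        start += count
    return calls
```

Each returned pair would drive one upload of `count` matrices followed by one instanced draw call.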

There are two types of implementations for instancing techniques: static batching and dynamic batching [45]. The single-call method in the geometry-instancing technique is implemented with static batching, while the multicall methods in both the pseudo-instancing and geometry-instancing techniques are implemented with dynamic batching. In static batching, all vertices and triangles of the instances are saved into a single VBO and IBO. In dynamic batching, the vertices and triangles are maintained in different buffer objects and drawn separately. The implementation with static batching has the potential to fully utilize the GPU memory, while dynamic batching would underutilize it. The major limitation of static batching is the lack of LOD and skinning support. This limitation makes static batching unsuitable for rendering animated instances, though it has been proved to be faster than dynamic batching in terms of the performance of rasterizing meshes.

In our work, the storage of instances is managed similarly to the implementation of static batching, while individual instances can still be accessed similarly to the implementation of dynamic batching. Therefore, our approach can be seamlessly integrated with LOD and skinning techniques while making use of a single VBO and IBO for fast rendering. This section describes the details of our contributions to character and instance management, including texture packing, UV-guided mesh rebuilding, and instance indexing.
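The static-batching-style storage with per-instance access rests on knowing where each source character's data begins inside the single concatenated buffers. The helper below (our own naming, a sketch rather than the paper's code) records those base offsets:

```python
def build_buffer_offsets(vertex_counts, index_counts):
    """Concatenate per-character meshes into one VBO/IBO and record the
    base offsets needed to address each character individually
    (a sketch of static-batching-style storage with per-instance access)."""
    v_offsets, i_offsets = [], []
    v = i = 0
    for vc, ic in zip(vertex_counts, index_counts):
        v_offsets.append(v)   # first vertex of this character in the VBO
        i_offsets.append(i)   # first index of this character in the IBO
        v += vc
        i += ic
    return v_offsets, i_offsets
```

An instance of character c is then drawn by offsetting into the shared buffers with `v_offsets[c]` and `i_offsets[c]`, so no per-instance buffer objects are needed.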

5.1. Packing Skeleton, Animation, and Skinning Weights into Textures. Smooth deformation of a 3D mesh is a computationally expensive process because each vertex of the mesh needs to be repositioned by the joints that influence it. We packed the skeleton, animations, and skinning weights into 2D textures on the GPU so that shader programs can access them quickly. The skeleton is the binding pose of the character. As explained in (2), the inverse of the binding pose's transformation is used during the runtime. In our approach, we stored this inverse in the texture as the skeletal information. For each joint of the skeleton, instead of storing individual translation, rotation, and scale values, we stored their composed transformation in the form of a 4 × 4 matrix. Each joint's binding pose transformation matrix takes four RGBA texels for storage: each RGBA texel stores a row of the matrix, and each channel stores a single element of that row. With OpenGL, the matrices are stored in the texture in the format GL_RGBA32F, which is a 32-bit floating-point type for each channel in one texel. Let us denote the total number of joints in a skeleton as K. Then the total number of texels to store the entire skeleton is 4K.

We used the same format as for storing the skeleton to store an animation. Each animation frame needs 4K texels to store the joints' transformation matrices. Let us denote the total number of frames in an animation as Q. Then the total number of texels for storing the entire animation is 4KQ. For each animation frame, the matrix elements are saved into successive texels in row order. Here we want to note that each animation frame starts from a new row in the texture.
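Under this layout, locating a joint's matrix in the packed texture is simple arithmetic. The sketch below is a hypothetical CPU-side helper mirroring the GetLoc/offset computation of Algorithm 2; it returns the column and row of the first of the four texels holding one joint's 4 × 4 matrix:

```python
import math

def joint_matrix_texel(width, num_joints, frame, joint):
    """Locate the first of the four RGBA texels holding a joint's 4x4
    matrix in the packed animation texture. Each matrix spans four
    successive texels, and each frame starts on a fresh texel row."""
    mats_per_row = width // 4                      # 4-texel matrices per row
    rows_per_frame = math.ceil(num_joints / mats_per_row)
    row = frame * rows_per_frame + joint // mats_per_row
    col = (joint % mats_per_row) * 4
    return col, row
```

For example, with a 16-texel-wide texture and 6 joints, four matrices fit per row, so each frame occupies two rows and frame f begins at row 2f.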

The skinning weights of each vertex are four values in the range [0, 1], where each value represents the influencing percentage of a skinning joint. For each vertex, the skinning weights require eight data elements, where the first four data elements are joint indices and the last four are the corresponding weights. In other words, each vertex requires two RGBA texels to store the skinning weights. The first texel is used to store the joint indices, and the second texel is used to store the weights.
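A CPU-side packing step consistent with this layout might look as follows. Truncating to the four strongest influences and renormalizing the weights are our assumptions for the sketch, not details given in the paper:

```python
def pack_skin_weights(influences):
    """Pack up to four (joint_index, weight) influences of one vertex into
    two RGBA texels: one holding joint indices, one holding weights.
    Vertices with fewer than four influences are zero-padded."""
    influences = sorted(influences, key=lambda p: -p[1])[:4]  # keep strongest 4 (assumption)
    influences += [(0, 0.0)] * (4 - len(influences))
    total = sum(w for _, w in influences) or 1.0
    indices = [float(j) for j, _ in influences]               # texel 1: joint indices
    weights = [w / total for _, w in influences]              # texel 2: normalized weights
    return indices, weights
```

The two returned four-tuples correspond to the two successive RGBA texels described above, matching the jntInx and weights samples in lines (5)-(6) of Algorithm 2.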

5.2. UV-Guided Mesh Rebuilding. A 3D mesh is usually a seamless surface without boundary edges. The mesh has to be cut and unfolded into 2D flattened patches before a texture image can be mapped onto it. To do this, some edges have to be selected properly as boundary edges, from which the mesh can be cut. The relationship between the vertices of a 3D mesh and 2D texture coordinates can be described as a texture mapping function F(x, y, z) → (s_i, t_i). Inner vertices (those not on boundary edges) have a one-to-one texture mapping. In other words, each inner vertex is mapped to a single pair of texture coordinates. For the vertices on boundary edges, since boundary edges are the cutting seams, a boundary vertex needs to be mapped to multiple pairs of texture coordinates. Figure 3 shows an example that unfolds a cube mesh and maps it into a flattened patch in 2D texture space. In the figure, u_i stands for a point in the 2D texture space. Each vertex on the boundary edges is mapped to more than one point: F(v_0) = {u_1, u_3}, F(v_3) = {u_10, u_12}, F(v_4) = {u_0, u_4, u_6}, and F(v_7) = {u_7, u_9, u_13}.

In a hardware-accelerated renderer, texture coordinates are indexed from a buffer object, and each vertex should be associated with a single pair of texture coordinates. Since the texture mapping function produces more than one pair of texture coordinates for boundary vertices, we conducted a mesh rebuilding process that duplicates boundary vertices and maps each duplicate to a unique texture point. By doing this, although the number of vertices is increased due to the cuts along boundary edges, the number of triangles remains the same as in the original mesh. In our approach, we initialized two arrays to store UV information: one array stores texture coordinates; the other stores the indices of texture points with respect to the order of triangle storage. Algorithm 1 shows the algorithmic process that duplicates boundary vertices by looping through all triangles. In the algorithm, Verts is the array of original vertices storing 3D coordinates (x, y, z), and Tris is the array of original triangles storing the sequence of vertex indices. Similar to Tris, TexInx is the array of indices of texture points in 2D texture space and represents the same triangular topology as the mesh. Note that the order of triangle storage

International Journal of Computer Games Technology 7


Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles; (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) used to cut the cube, and the vertices in red are boundary vertices that are mapped into multiple pairs of texture coordinates. In (b), ui stands for a point (si, ti) in 2D texture space, and vi in the parentheses is the corresponding vertex in the cube.

RebuildMesh(Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts′, Tris′, Norms′)

(1) checked[texCoordNum] ← false
(2) for each triangle i of oriTriNum do
(3)     for each vertex j in the i-th triangle do
(4)         texPoint_id ← TexInx[3 * i + j]
(5)         if checked[texPoint_id] = false then
(6)             checked[texPoint_id] ← true
(7)             vert_id ← Tris[3 * i + j]
(8)             for each coordinate k of the j-th vertex do
(9)                 Verts′[3 * texPoint_id + k] = Verts[3 * vert_id + k]
(10)                Norms′[3 * texPoint_id + k] = Norms[3 * vert_id + k]
(11)            end for
(12)        end if
(13)    end for
(14) end for
(15) Tris′ ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts′ is identical to the number of texture points texCoordNum, and the array of triangles (Tris′) is replaced by the array of indices of the texture points (TexInx).
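A minimal CPU sketch of Algorithm 1 in Python (illustrative only; the flat-array layout follows the algorithm, while in the actual system the data reside in GPU buffers):

```python
def rebuild_mesh(verts, tris, norms, tex_inx, ori_tri_num, tex_coord_num):
    """UV-guided mesh rebuilding: duplicate boundary vertices so that
    each 2D texture point owns exactly one vertex.

    verts, norms: flat [x, y, z, x, y, z, ...] arrays.
    tris:         vertex index per triangle corner (3 per triangle).
    tex_inx:      texture-point index per triangle corner, same topology.
    Returns (verts', tris', norms'); tris' is simply tex_inx, so the
    triangle count is unchanged while vertices grow to tex_coord_num.
    """
    verts2 = [0.0] * (3 * tex_coord_num)
    norms2 = [0.0] * (3 * tex_coord_num)
    checked = [False] * tex_coord_num
    for i in range(ori_tri_num):
        for j in range(3):
            tex_point_id = tex_inx[3 * i + j]
            if not checked[tex_point_id]:       # first time we meet this UV point
                checked[tex_point_id] = True
                vert_id = tris[3 * i + j]
                for k in range(3):              # copy position and normal
                    verts2[3 * tex_point_id + k] = verts[3 * vert_id + k]
                    norms2[3 * tex_point_id + k] = norms[3 * vert_id + k]
    return verts2, tex_inx, norms2
```

For a quad of two triangles with one seam vertex mapped to two texture points, the rebuilt mesh gains one duplicated vertex while keeping both triangles.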

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained in single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows shaders to perform the rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one by one, which is the same as the way animated instances are rendered in the geometry-instancing and pseudo-instancing techniques. However, our approach needs



Figure 4: An example illustrating the data structures for storing the vertices and triangles of three instances in the VBO and IBO, respectively. The data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store the data of all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures for storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the numbers of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, each vNum[i] represents the offset of the vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i + 1] − vNum[i]).
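The prefix-sum step can be illustrated with a minimal sequential sketch (the system performs this in parallel with CUDA Thrust's scan; the function name here is our own):

```python
def instance_offsets(counts):
    """Exclusive prefix sum over per-instance vertex (or triangle) counts.

    offsets[i] is where instance i's data start in the shared buffer
    object; offsets[i + 1] - offsets[i] recovers instance i's count, and
    offsets[-1] is the total buffer size.
    """
    offsets = [0]
    for c in counts:
        offsets.append(offsets[-1] + c)
    return offsets
```

With per-instance counts [3, 5, 2], instance 1's vertices start at offset 3 and the shared VBO holds 10 vertices in total.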

Algorithm 2 describes the vertex transformation process executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons form a texture array denoted as invBindPose[charNum]. The skinning weights form a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 × 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), the vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat), computed from a weighted sum of the matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input; it returns the values encoded in the texels. The Sample() function is usually provided by a shader programming framework. Different from the rendering of static models, animated characters change geometric shape over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.
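The core of this process is linear-blend skinning. As a minimal CPU sketch of the blend (illustrative Python; texture sampling is replaced by direct matrix lookups, and the global and camera matrices of the algorithm's final step are omitted):

```python
def mat_mul(a, b):
    """4x4 matrix product over row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def skin_vertex(v, jnt_inx, weights, anim_mats, inv_bind_mats):
    """Blend the four influencing joints' (animation-pose * inverse-bind-pose)
    matrices by the skinning weights, then transform the vertex.

    v: homogeneous vertex [x, y, z, 1]; jnt_inx: four joint indices;
    weights: four floats summing to 1; anim_mats / inv_bind_mats: lists
    of 4x4 matrices indexed by joint.
    """
    composed = [[0.0] * 4 for _ in range(4)]
    for j in range(4):
        m = mat_mul(anim_mats[jnt_inx[j]], inv_bind_mats[jnt_inx[j]])
        for r in range(4):
            for c in range(4):
                composed[r][c] += weights[j] * m[r][c]
    # composedMat * v
    return [sum(composed[r][c] * v[c] for c in range(4)) for r in range(4)]
```

With identity animation and bind matrices, the vertex is returned unchanged, which is a convenient sanity check for the packing and sampling stages.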

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti graphics card with 6 GB of memory. The rendering system is built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters; Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at a resolution of 2048 × 2048. We stored those source characters on the GPU. Also, all source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images; the mesh data requires only 16.50 MB of memory, which is 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is distributed only across visible instances.
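The per-instance visibility test can be sketched as a standard sphere-versus-frustum check (illustrative Python; the plane representation is our assumption, not taken from the paper's implementation):

```python
def sphere_in_frustum(center, radius, planes):
    """Test an instance's bounding sphere against the view frustum.

    planes: six (nx, ny, nz, d) tuples with unit normals pointing into
    the frustum, so a point p is inside a plane when dot(n, p) + d >= 0.
    The sphere is culled only when it lies entirely behind some plane.
    """
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False     # completely outside this plane: cull
    return True              # inside or intersecting: keep for rendering
```

Instances failing the test receive no share of the budget N; only visible instances compete for vertices and triangles.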

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1,000 frames. The entire


GetLoc(Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1 / w
(2) unit_y ← 1 / h
(3) x ← dim * (id mod (w / dim)) * unit_x
(4) y ← ⌊id / (w / dim)⌋ * unit_y

TransformVertices(Input: Verts′, invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts″)

(1) for each vertex vi in Verts′ in parallel do
(2)     Matrix4×4 composedMat ← identity
(3)     unit_x, unit_y, x, y ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)     h ← skinWeights[c_id].height
(5)     Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)     Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)     for each j in 4 do
(8)         Matrix4×4 invBindMat
(9)         unit_x, unit_y, x, y ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)        for each k in 4 do
(11)            invBindMat.row[k] ← Sample(invBindPose[c_id], (k * unit_x + x), y)
(12)        end for
(13)        unit_x, unit_y, x, y ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)        Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)        Matrix4×4 animMat
(16)        for each k in 4 do
(17)            animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)        end for
(19)        composedMat += weights[j] * animMat * invBindMat
(20)    end for
(21)    Verts″[i] = modelViewProjectionMat * gMat * composedMat * Verts′[i]
(22) end for

Algorithm 2: Transforming the vertices of an instance in the vertex shader.
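GetLoc's texel addressing can be expressed compactly as follows (a Python sketch; the mod/floor-divide pair reconstructs the row wrapping implied by steps (3) and (4) of GetLoc):

```python
def get_loc(w, h, idx, dim):
    """Locate the first of `dim` consecutive texels storing item `idx`
    (a vertex's skinning weights, dim = 2, or a joint's matrix, dim = 4)
    in a w x h texture, mirroring Algorithm 2's GetLoc.

    Returns normalized (unit_x, unit_y, x, y); texel t of the item is
    then sampled at (t * unit_x + x, y).
    """
    unit_x, unit_y = 1.0 / w, 1.0 / h
    per_row = w // dim                 # items packed per texture row
    x = dim * (idx % per_row) * unit_x
    y = (idx // per_row) * unit_y
    return unit_x, unit_y, x, y
```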

Table 1: The data information of the source characters used in our experiment. Note that the data represent the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Names of Source Characters | # of Vertices | # of Triangles | # of Texture UVs | # of Joints
Alien      | 2846 | 5688 | 3334 | 64
Bug        | 3047 | 6090 | 3746 | 44
Creepy     | 2483 | 4962 | 2851 | 67
Daemon     | 3328 | 6652 | 4087 | 55
Devourer   | 2975 | 5946 | 3494 | 62
Fat        | 2555 | 5106 | 2960 | 50
Mutant     | 2265 | 4526 | 2649 | 46
Nasty      | 3426 | 6848 | 3990 | 45
Pangolin   | 2762 | 5520 | 3257 | 49
Ripper Dog | 2489 | 4974 | 2982 | 48
Rock       | 2594 | 5184 | 3103 | 62
Spider     | 2936 | 5868 | 3374 | 38
Titan      | 3261 | 6518 | 3867 | 45
Troll      | 2481 | 4958 | 2939 | 55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera; the bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures that the total number of selected vertices and triangles is within the specified memory budget, while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which the far-away instances are rendered using the simplified versions (bottom images).

If all instances were rendered at the full level of detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The

maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and average number of triangles selected during the runtime. As we can see, the higher the value of N, the more triangles are selected to generate the simplified instances, and subsequently the better the visual quality obtained for the crowd. Our approach is memory efficient: even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the average and the maximum number of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including the instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path and a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N (million) | FPS | # of Rendered Triangles (million) | Component Execution Times (millisecond)
            |     | Min / Max / Avg                   | View-Frustum Culling + LOD Selection / LOD Mesh Generation / Animating Meshes + Rendering
1  | 47.91 | 1.88 / 9.70 / 5.20    | 0.68 / 2.87 / 26.06
5  | 43.30 | 6.12 / 10.14 / 7.50   | 0.65 / 3.87 / 26.44
10 | 32.82 | 11.64 / 13.78 / 13.04 | 0.65 / 6.27 / 29.40
15 | 23.00 | 17.88 / 20.73 / 19.54 | 0.71 / 9.45 / 36.43
20 | 18.71 | 23.13 / 27.73 / 26.23 | 0.68 / 12.52 / 42.85


Figure 6: The change of FPS over different values of N using our approach. The FPS is averaged over the total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to above 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level; thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at a triangle level, and its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small: when N is small, many close-up instances end up at the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger, because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach

in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support view-frustum culling. For the sake of comparison, in our approach we ensured that all instances are inside the view frustum of the camera, by fixing the position of the camera and the positions of all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sampled points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of the instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.



Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over the total of 1,000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result. (d), (e), and (f) are the captured images zoomed in on the area of instances that are far away from the camera.


The image generated by the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because it uses a limited number of detail levels, as in discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during the rendering.

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of the source characters based on the flattened patches of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and view-frustum culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we used only a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We also would like to explore possibilities to port our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including the GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce the experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that lets the main camera move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances move outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. DeGyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czurylo, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM'07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



Figure 3: An example showing the process of unfolding a cube mesh and mapping it into a flattened patch in 2D texture space. (a) is the 3D cube mesh formed by 8 vertices and 6 triangles. (b) is the unfolded texture map formed by 14 pairs of texture coordinates and 6 triangles. Bold lines are the boundary edges (seams) used to cut the cube, and the vertices in red are boundary vertices that are mapped into multiple pairs of texture coordinates. In (b), u_i stands for a point (s_i, t_i) in 2D texture space, and v_i in the parentheses is the corresponding vertex in the cube.

RebuildMesh (Input: Verts, Tris, TexCoords, Norms, TexInx, oriTriNum, texCoordNum; Output: Verts', Tris', Norms')

(1) checked[0..texCoordNum-1] ← false
(2) for each triangle i of oriTriNum do
(3)   for each vertex j in the i-th triangle do
(4)     texPoint_id ← TexInx[3*i + j]
(5)     if checked[texPoint_id] = false then
(6)       checked[texPoint_id] ← true
(7)       vert_id ← Tris[3*i + j]
(8)       for each coordinate k of the vertex do
(9)         Verts'[3*texPoint_id + k] ← Verts[3*vert_id + k]
(10)        Norms'[3*texPoint_id + k] ← Norms[3*vert_id + k]
(11)      end for
(12)    end if
(13)  end for
(14) end for
(15) Tris' ← TexInx

Algorithm 1: UV-guided mesh rebuilding algorithm.

for the mesh is the same as the order of triangle storage for the 2D texture patches. Norms is the array of vertex normal vectors, oriTriNum is the total number of original triangles, and texCoordNum is the number of texture points in 2D texture space.

After rebuilding the mesh, the number of vertices in Verts' will be identical to the number of texture points (texCoordNum), and the array of triangles (Tris') is replaced by the array of indices of the texture points (TexInx).
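The rebuilding pass can be sketched as a CPU reference in C++. This is an illustrative reconstruction of Algorithm 1, not the paper's GPU implementation; the flat-array layout (xyz triples in Verts/Norms, three texture-point indices per triangle in TexInx) and the function name rebuildMesh are our assumptions:

```cpp
#include <cassert>
#include <vector>

// CPU sketch of the UV-guided mesh rebuilding step (Algorithm 1): every
// texture point receives a copy of the position and normal of the vertex it
// was unfolded from, so seam vertices are duplicated once per texture point.
struct RebuiltMesh {
    std::vector<float> verts;  // one xyz triple per texture point
    std::vector<float> norms;  // one normal per texture point
    std::vector<int>   tris;   // triangles now index texture points
};

RebuiltMesh rebuildMesh(const std::vector<float>& verts,
                        const std::vector<int>&   tris,
                        const std::vector<float>& norms,
                        const std::vector<int>&   texInx,
                        int oriTriNum, int texCoordNum) {
    RebuiltMesh out;
    out.verts.assign(3 * texCoordNum, 0.0f);
    out.norms.assign(3 * texCoordNum, 0.0f);
    std::vector<bool> checked(texCoordNum, false);
    for (int i = 0; i < oriTriNum; ++i) {
        for (int j = 0; j < 3; ++j) {
            int texPointId = texInx[3 * i + j];
            if (!checked[texPointId]) {
                checked[texPointId] = true;
                int vertId = tris[3 * i + j];
                for (int k = 0; k < 3; ++k) {  // copy position and normal
                    out.verts[3 * texPointId + k] = verts[3 * vertId + k];
                    out.norms[3 * texPointId + k] = norms[3 * vertId + k];
                }
            }
        }
    }
    out.tris = texInx;  // the texture-point index list becomes the triangle list
    return out;
}
```

For a quad whose shared corner sits on a seam (one vertex mapped to two texture points), the rebuilt mesh ends up with one position per texture point, which is exactly the one-to-one vertex/UV correspondence the text describes.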

5.3. Source Character and Instance Indexing. After applying the data packing and mesh rebuilding methods presented in Sections 5.1 and 5.2, the multiple data of a source character are organized into GPU-friendly data structures. The character's skeleton, skinning weights, and animations are packed into textures and are read-only in shader programs on the GPU. The vertices, triangles, texture coordinates, and vertex normal vectors are stored in arrays and retained on the GPU. During the runtime, based on the LOD Selection result (see Section 4.1), a successive subsequence of vertices, triangles, texture coordinates, and vertex normal vectors is selected for each instance and maintained as single buffer objects. As mentioned in Section 4.1, the simplified instances are constructed in a parallel fashion through a general-purpose GPU programming framework. The framework then interoperates with the GPU's shader programs and allows shaders to perform rendering tasks for the instances. Because continuous LOD and animated instancing techniques are used in our approach, instances have to be rendered one by one, which is the same as the way of rendering animated instances in geometry-instancing and pseudo-instancing techniques. However, our approach needs


Figure 4: An example illustrating the data structures for storing vertices and triangles of three instances in the VBO and IBO, respectively. Those data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store data for all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the number of vertices and triangles selected based on the current view setting. We employed CUDA Thrust [51] to process the arrays of vNum and tNum using the prefix sum algorithm in a parallel fashion. As a result, for example, each vNum[i] represents the offset of the vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i+1] − vNum[i]).
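The offset computation can be sketched as follows. On the GPU the paper runs the scan in parallel with CUDA Thrust (thrust::exclusive_scan); here a sequential exclusive prefix sum stands in for it, and the helper name scanCounts plus the extra trailing slot (which holds the grand total so the last instance's count is also recoverable) are our choices:

```cpp
#include <cassert>
#include <vector>

// Sequential stand-in for the parallel prefix sum done with CUDA Thrust.
// Given per-instance vertex (or triangle) counts, returns write offsets into
// the shared VBO/IBO; offsets[i+1] - offsets[i] recovers instance i's count.
std::vector<int> scanCounts(const std::vector<int>& counts) {
    std::vector<int> offsets(counts.size() + 1, 0);
    for (std::size_t i = 0; i < counts.size(); ++i)
        offsets[i + 1] = offsets[i] + counts[i];  // running exclusive sum
    return offsets;
}
```

With Thrust the equivalent device-side call would operate on the same count array in place, which is why the text can treat vNum[i] as both "count before the scan" and "offset after the scan".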

Algorithm 2 describes the vertex transformation process, executed in parallel in the vertex shader. It transforms the instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding pose skeletons are a texture array denoted as invBindPose[charNum]. The skinning weights are a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 × 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat) computed from a weighted sum of matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input and returns the values encoded in texels. The Sample() function is usually provided in a shader programming framework. Different from the rendering of static models, animated characters change geometric shapes over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.
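The skinning step at the heart of TransformVertices is standard linear blend skinning: each vertex is transformed by a weighted sum of (animation pose × inverse bind pose) joint matrices. A minimal CPU sketch, with the texture fetches of Algorithm 2 replaced by plain arrays and all names (Mat4, skinVertex) being our own, not the paper's shader code:

```cpp
#include <array>
#include <cassert>
#include <vector>

using Mat4 = std::array<float, 16>;  // row-major 4x4 matrix

Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[4 * i + j] += a[4 * i + k] * b[4 * k + j];
    return r;
}

// Linear blend skinning: composedMat = sum_j w_j * anim_j * invBind_j,
// accumulated from the zero matrix, then applied to the vertex.
std::array<float, 4> skinVertex(const std::array<float, 4>& v,
                                const std::array<int, 4>& jointIdx,
                                const std::array<float, 4>& weights,
                                const std::vector<Mat4>& animMats,
                                const std::vector<Mat4>& invBindMats) {
    Mat4 composed{};  // zero-initialized accumulator
    for (int j = 0; j < 4; ++j) {
        Mat4 m = mul(animMats[jointIdx[j]], invBindMats[jointIdx[j]]);
        for (int e = 0; e < 16; ++e) composed[e] += weights[j] * m[e];
    }
    std::array<float, 4> out{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k) out[i] += composed[4 * i + k] * v[k];
    return out;
}
```

In the shader, the model-view-projection matrix and the instance's global matrix gMat are applied on top of the composed matrix, as in line (21) of Algorithm 2.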

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti 6 GB graphics card. The rendering system is built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at the resolution of 2048 × 2048. We stored those source characters on the GPU. Also, all source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images; the mesh data requires 16.50 MB, which is only 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is only distributed across visible instances.
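The per-instance visibility test can be sketched as a sphere-against-frustum check. The paper does not give this code; the plane representation below (six planes in ax + by + cz + d form with inward-facing normals, extracted elsewhere from the view-projection matrix) and the function names are illustrative assumptions:

```cpp
#include <array>
#include <cassert>

struct Plane  { float a, b, c, d; };    // inside when ax + by + cz + d >= 0
struct Sphere { float x, y, z, r; };    // instance bounding sphere

// Returns false only when the sphere lies entirely behind one frustum plane;
// spheres intersecting the frustum boundary are conservatively kept visible.
bool sphereInFrustum(const Sphere& s, const std::array<Plane, 6>& planes) {
    for (const Plane& p : planes) {
        float dist = p.a * s.x + p.b * s.y + p.c * s.z + p.d;
        if (dist < -s.r) return false;  // fully outside this plane
    }
    return true;
}
```

Running this test in parallel, one thread per instance, yields the visible set over which the budget N is then distributed.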

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1000 frames. The entire


GetLoc (Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x ← 1/w
(2) unit_y ← 1/h
(3) x ← dim * (id mod (w/dim)) * unit_x
(4) y ← (id div (w/dim)) * unit_y

TransformVertices (Input: Verts', invBindPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts'')

(1) for each vertex v_i in Verts' in parallel do
(2)   Matrix4×4 composedMat ← 0
(3)   unit_x, unit_y, x, y ← GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)   h ← skinWeights[c_id].height
(5)   Vector4 jntInx ← Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)   Vector4 weights ← Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)   for each j in 4 do
(8)     Matrix4×4 invBindMat
(9)     unit_x, unit_y, x, y ← GetLoc(invBindPose[c_id].width, invBindPose[c_id].height, jntInx[j], 4)
(10)    for each k in 4 do
(11)      invBindMat.row[k] ← Sample(invBindPose[c_id], (k * unit_x + x), y)
(12)    end for
(13)    unit_x, unit_y, x, y ← GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)    Offset ← f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)    Matrix4×4 animMat
(16)    for each k in 4 do
(17)      animMat.row[k] ← Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)    end for
(19)    composedMat += weights[j] * animMat * invBindMat
(20)  end for
(21)  Verts''[i] ← modelViewProjectionMat * gMat * composedMat * Verts'[i]
(22) end for

Algorithm 2: Transforming vertices of an instance in the vertex shader.
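The index-to-texel mapping performed by GetLoc can be exercised as a small CPU helper. The modulo/division split below is our reconstruction of the record layout (dim consecutive texels per record, laid out row by row), and the names Loc/getLoc are ours, not the paper's:

```cpp
#include <cassert>

struct Loc { float unitX, unitY, x, y; };

// Maps a vertex/joint index to normalized texture coordinates, assuming each
// record occupies `dim` consecutive texels and records fill rows left to right.
Loc getLoc(int w, int h, int id, int dim) {
    Loc l;
    l.unitX = 1.0f / w;
    l.unitY = 1.0f / h;
    int recordsPerRow = w / dim;
    l.x = dim * (id % recordsPerRow) * l.unitX;  // column of the record
    l.y = (id / recordsPerRow) * l.unitY;        // row of the record
    return l;
}
```

For example, in an 8 × 4 texture holding 4-texel joint matrices (two records per row), record 3 sits in the second column of the second row, i.e. at normalized coordinates (0.5, 0.25).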

Table 1: The data information of the source characters used in our experiment. Note that the data information represents the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Source Character   # of Vertices   # of Triangles   # of Texture UVs   # of Joints
Alien              2846            5688             3334               64
Bug                3047            6090             3746               44
Creepy             2483            4962             2851               67
Daemon             3328            6652             4087               55
Devourer           2975            5946             3494               62
Fat                2555            5106             2960               50
Mutant             2265            4526             2649               46
Nasty              3426            6848             3990               45
Pangolin           2762            5520             3257               49
Ripper Dog         2489            4974             2982               48
Rock               2594            5184             3103               62
Spider             2936            5868             3374               38
Titan              3261            6518             3867               45
Troll              2481            4958             2939               55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, 10 million. The top images are the rendered frame from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures the total number of selected vertices and triangles is within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances were rendered at the full level of detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and average number of triangles selected during the runtime. As we can see, the higher the value of N, the more triangles are selected to generate simplified instances, and subsequently the better visual quality is obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the average and the maximum number of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path with a total of 30,000 instances. The FPS value and the component execution times are averaged over 1000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N          FPS     # of Rendered Triangles (million)   Component Execution Times (ms)
(million)          Min      Max      Avg               View-Frustum Culling   LOD Mesh     Animating Meshes
                                                       + LOD Selection        Generation   + Rendering
1          47.91   1.88     9.70     5.20              0.68                   2.87         26.06
5          43.30   6.12     10.14    7.50              0.65                   3.87         26.44
10         32.82   11.64    13.78    13.04             0.65                   6.27         29.40
15         23.00   17.88    20.73    19.54             0.71                   9.45         36.43
20         18.71   23.13    27.73    26.23             0.68                   12.52        42.85

Figure 6: The change of FPS over different values of N using our approach. The FPS is averaged over the total of 1000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at the instance level; thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at the triangle level, and its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small: when N is small, many close-up instances drop to the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling, so for a fair comparison we ensured that all instances were inside the view frustum of the camera by fixing the position of the camera and the positions of all instances; thus all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sample points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different values of N (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.

12 International Journal of Computer Games Technology

[Figure 7 chart: Frames per second (FPS) on the y-axis (10-60) versus Number of Instances (thousand) on the x-axis (8-30); curves: Ours (N=5 million), Ours (N=10 million), Point-based, Pseudo-instancing.]

Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over a total of 1000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result; (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated from the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because that technique uses a limited number of detail levels produced by discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during rendering.
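The popping behavior can be illustrated with a toy comparison of how many triangles an instance receives as its distance to the camera grows. All numbers below are made up for illustration; neither function is the paper's actual LOD metric.

```python
# Why discrete LOD pops while continuous LOD does not: a toy comparison of
# the triangle count assigned to one instance as its camera distance changes.

FULL = 5000  # triangle count of the full-detail mesh (hypothetical)

def continuous_lod(distance, max_distance=100.0):
    """Triangle count shrinks smoothly with distance (no sudden jumps)."""
    t = min(distance / max_distance, 1.0)
    return int(FULL * (1.0 - t) ** 2) + 100   # +100 keeps a minimal hull

def discrete_lod(distance):
    """Three fixed levels; crossing a threshold swaps the whole mesh at once."""
    if distance < 30:
        return FULL
    if distance < 70:
        return FULL // 4
    return FULL // 16

# Moving from distance 29 to 31 changes the continuous count only slightly,
# while the discrete version drops from 5000 to 1250 triangles at once (a pop).
for d in (29, 31):
    print(d, continuous_lod(d), discrete_lod(d))
```

The small, gradual change in the continuous case is what keeps the transition invisible to the viewer.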

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of the source characters based on the flattened pieces of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach is integrated seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we only used a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We also would like to explore possibilities to port our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and additional computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video submitted together with the manuscript shows our experimental result of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce the experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances move outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, Part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.

[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing – ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. DeGyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, article no. 140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12–14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159–160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.




Figure 4: An example illustrating the data structures for storing vertices and triangles of three instances in the VBO and IBO, respectively. The data are stored on the GPU, and all data operations are executed in parallel on the GPU. The VBO and IBO store data for all instances, selected from the arrays of original vertices and triangles of the source characters. The vNum and tNum arrays are the LOD selection result.

to construct the data within one execution call rather than dealing with per-instance data.

Figure 4 illustrates the data structures for storing the VBO and IBO on the GPU. Based on the result of LOD Selection, each instance is associated with a vNum and a tNum (see Section 4.1) that represent the numbers of vertices and triangles selected under the current view setting. We employed CUDA Thrust [51] to process the vNum and tNum arrays with the prefix sum algorithm in a parallel fashion. As a result, each vNum[i] represents the offset of the vertex count prior to the i-th instance, and the number of vertices for the i-th instance is (vNum[i + 1] - vNum[i]).
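The offset computation reduces to an exclusive prefix sum, sketched here on the CPU with `itertools.accumulate`; the paper performs the same step on the GPU with Thrust's scan primitives, and the per-instance counts below are hypothetical values produced by LOD Selection.

```python
# CPU sketch of the vNum offset computation (GPU version: thrust::exclusive_scan).
from itertools import accumulate

vNum = [120, 45, 300, 80]            # vertices selected per instance (made up)
vOff = [0] + list(accumulate(vNum))  # exclusive prefix sum -> VBO write offsets

# vOff[i] is the vertex offset of instance i in the shared VBO, and
# vOff[i + 1] - vOff[i] recovers its vertex count, as described in the text.
print(vOff)  # [0, 120, 165, 465, 545]
```

The same scan applied to tNum yields the triangle offsets for the shared IBO.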

Algorithm 2 describes the vertex transformation process executed in parallel in the vertex shader. It transforms an instance's vertices to their destination positions while the instance is being animated. In the algorithm, charNum represents the total number of source characters. The inverses of the binding-pose skeletons are stored in a texture array denoted as invBindPose[charNum]. The skinning weights are stored in a texture array denoted as skinWeights[charNum]. We used a walk animation for each source character, and the texture array of the animations is denoted as anim[charNum]. gMat is the global 4 x 4 transformation matrix of the instance in the virtual scene. This algorithm is developed based on the data packing formats described in Section 5.1. Each source character is assigned a unique source character ID, denoted as c_id in the algorithm. The drawing calls are issued per instance, so c_id is passed into the shader as an input parameter. The function GetLoc() computes the coordinates in the texture space to locate which texel to fetch. The input of GetLoc() includes the current vertex or joint index (id) that needs to be mapped, the width (w) and height (h) of the texture, and the number of texels (dim) associated with the vertex or joint. For example, to retrieve a vertex's skinning weights, dim is set to 2; to retrieve a joint's transformation matrix, dim is set to 4. In the function TransformVertices(), the vertices of the instance are transformed in a parallel fashion by the composed matrix (composedMat), computed as a weighted sum of the matrices of the skinning joints. The function Sample() takes a texture and the coordinates located in the texture space as input and returns the values encoded in the texels. The Sample() function is usually provided in a shader programming framework. Different from the rendering of static models, animated characters change geometric shapes over time due to continuous pose changes in the animation. In the algorithm, f_id stands for the current frame index of the instance's animation; f_id is updated in the main loop during the execution of the program.
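The core of Algorithm 2 is a standard linear-blend-skinning sum: composedMat = sum over joints j of weights[j] * animMat[j] * invBindMat[j], followed by the global transform. The plain-Python sketch below mirrors that math on the CPU; it omits the model-view-projection matrix, and the example joints and weights are illustrative, not the paper's shader code.

```python
# CPU sketch of the linear blend skinning performed per vertex in Algorithm 2.
# Matrices are 4x4 row-major nested lists.

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def translate(tx, ty, tz):
    m = identity()
    m[0][3], m[1][3], m[2][3] = tx, ty, tz
    return m

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def mat_scale(m, s):
    return [[s * m[i][j] for j in range(4)] for i in range(4)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def skin_vertex(vertex, joint_ids, weights, anim_mats, inv_bind_mats, g_mat):
    """composedMat = sum_j w_j * anim[j] * invBind[j]; then gMat * composedMat * v."""
    composed = [[0.0] * 4 for _ in range(4)]
    for j, w in zip(joint_ids, weights):
        composed = mat_add(composed,
                           mat_scale(mat_mul(anim_mats[j], inv_bind_mats[j]), w))
    return mat_vec(mat_mul(g_mat, composed), vertex)

# Two joints with identity bind poses, animated by pure translations,
# each weighted 0.5: the vertex lands halfway between the two targets.
v = skin_vertex([0.0, 0.0, 0.0, 1.0], [0, 1], [0.5, 0.5],
                [translate(1, 0, 0), translate(0, 2, 0)],
                [identity(), identity()], identity())
print(v)  # [0.5, 1.0, 0.0, 1.0]
```

In the actual system this runs in the vertex shader, with the matrices fetched from textures via GetLoc() and Sample().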

6. Experiment and Analysis

We implemented our crowd rendering system on a workstation with an Intel i7-5930K 3.50 GHz CPU, 64 GB of RAM, and an Nvidia GeForce GTX 980 Ti 6 GB graphics card. The rendering system is built with the Nvidia CUDA Toolkit 9.2 and OpenGL. We employed 14 unique source characters. Table 1 shows the data configurations of those source characters. Each source character is articulated with a skeleton, fully skinned, and animated with a walk animation. Each one contains an unfolded texture UV set along with a texture image at a resolution of 2048 x 2048. We stored those source characters on the GPU. All source characters have been preprocessed by the mesh simplification and animation algorithms described in Section 4. We stored the edge-collapsing information and the character bounding spheres in arrays on the GPU. In total, the source characters require 184.50 MB of memory for storage. The size of the mesh data is much smaller than the size of the texture images: the mesh data requires only 16.50 MB of memory, which is 8.94% of the total memory consumed. At the initialization of the system, we randomly assigned a source character to each instance.

6.1. Visual Fidelity and Performance Evaluations. As defined in Section 4.1, N is a memory budget parameter that determines the geometric complexity and the visual quality of the entire crowd. For each instance in the crowd, the corresponding bounding sphere is tested against the view frustum to determine its visibility. The value of N is distributed only across visible instances.
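The per-instance visibility test can be sketched as a standard sphere-versus-frustum-planes check. The plane values below describe a hypothetical axis-aligned frustum for illustration; the paper's extraction of the planes from the camera matrices is not shown here.

```python
# Minimal sphere-vs-frustum test of the kind used by View-Frustum Culling:
# each instance's bounding sphere is tested against six planes stored as
# (nx, ny, nz, d) with inward-facing normals.

def sphere_in_frustum(center, radius, planes):
    """Return False as soon as the sphere lies fully outside one plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False
    return True

# A box-shaped "frustum" from (-10,-10,-10) to (10,10,10), for illustration only.
planes = [( 1, 0, 0, 10), (-1, 0, 0, 10),
          ( 0, 1, 0, 10), ( 0, -1, 0, 10),
          ( 0, 0, 1, 10), ( 0, 0, -1, 10)]

print(sphere_in_frustum((0, 0, 0), 1.0, planes))     # fully inside -> True
print(sphere_in_frustum((15, 0, 0), 1.0, planes))    # outside the +x plane -> False
print(sphere_in_frustum((10.5, 0, 0), 1.0, planes))  # straddles the plane -> True
```

Instances that fail the test receive no share of the N budget, which is why the budget concentrates on visible instances.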

We created a walkthrough camera path for the rendering of the crowd. The camera path emulates a gaming navigation behavior and produces a total of 1000 frames. The entire


GetLoc (Input: w, h, id, dim; Output: unit_x, unit_y, x, y)

(1) unit_x <- 1/w
(2) unit_y <- 1/h
(3) x <- dim * (id mod (w/dim)) * unit_x
(4) y <- (id / (w/dim)) * unit_y
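GetLoc can be written out directly in Python; the texture sizes below are illustrative, and the integer mod/floor-division split reflects the row-major texel packing described in the text (each element occupies `dim` consecutive texels in a row).

```python
# GetLoc: map an element index to normalized texel coordinates in a texture
# where each element occupies `dim` consecutive texels (dim = 2 for a vertex's
# skinning data, dim = 4 for a joint's 4x4 matrix, per the paper).

def get_loc(w, h, idx, dim):
    unit_x, unit_y = 1.0 / w, 1.0 / h
    per_row = w // dim                  # elements packed per texture row
    x = dim * (idx % per_row) * unit_x  # leftmost texel of the element
    y = (idx // per_row) * unit_y       # row containing the element
    return unit_x, unit_y, x, y

# An 8x4 texture storing 4-texel elements holds 2 elements per row, so
# element 3 sits in row 1, column 1:
unit_x, unit_y, x, y = get_loc(8, 4, 3, 4)
print(x, y)  # 0.5 0.25
```

The caller then samples `dim` texels starting at (x, y), stepping by unit_x, exactly as lines (5)-(6) and (11) of Algorithm 2 do.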

TransformVertices (Input: Verts', invPose[charNum], skinWeights[charNum], anim[charNum], gMat, c_id, f_id; Output: Verts'')

(1) for each vertex v_i in Verts' in parallel do
(2)   Matrix4x4 composedMat <- zero
(3)   (unit_x, unit_y, x, y) <- GetLoc(skinWeights[c_id].width, skinWeights[c_id].height, i, 2)
(4)   h <- skinWeights[c_id].height
(5)   Vector4 jntInx <- Sample(skinWeights[c_id], (0 * unit_x + x), y)
(6)   Vector4 weights <- Sample(skinWeights[c_id], (1 * unit_x + x), y)
(7)   for each j in 4 do
(8)     Matrix4x4 invBindMat
(9)     (unit_x, unit_y, x, y) <- GetLoc(invPose[c_id].width, invPose[c_id].height, jntInx[j], 4)
(10)    for each k in 4 do
(11)      invBindMat.row[k] <- Sample(invPose[c_id], (k * unit_x + x), y)
(12)    end for
(13)    (unit_x, unit_y, x, y) <- GetLoc(anim[c_id].width, anim[c_id].height, jntInx[j], 4)
(14)    Offset <- f_id * (totalJntNum / (anim[c_id].width / 4)) * unit_y
(15)    Matrix4x4 animMat
(16)    for each k in 4 do
(17)      animMat.row[k] <- Sample(anim[c_id], (k * unit_x + x), Offset + y)
(18)    end for
(19)    composedMat += weights[j] * animMat * invBindMat
(20)  end for
(21)  Verts''[i] <- modelViewProjectionMat * gMat * composedMat * Verts'[i]
(22) end for

Algorithm 2: Transforming the vertices of an instance in the vertex shader.

Table 1: The data information of the source characters used in our experiment. Note that the data represent the characters prior to the process of UV-guided mesh rebuilding (see Section 5.2).

Name of Source Character | # of Vertices | # of Triangles | # of Texture UVs | # of Joints
Alien       | 2846 | 5688 | 3334 | 64
Bug         | 3047 | 6090 | 3746 | 44
Creepy      | 2483 | 4962 | 2851 | 67
Daemon      | 3328 | 6652 | 4087 | 55
Devourer    | 2975 | 5946 | 3494 | 62
Fat         | 2555 | 5106 | 2960 | 50
Mutant      | 2265 | 4526 | 2649 | 46
Nasty       | 3426 | 6848 | 3990 | 45
Pangolin    | 2762 | 5520 | 3257 | 49
Ripper Dog  | 2489 | 4974 | 2982 | 48
Rock        | 2594 | 5184 | 3103 | 62
Spider      | 2936 | 5868 | 3374 | 38
Titan       | 3261 | 6518 | 3867 | 45
Troll       | 2481 | 4958 | 2939 | 55



Figure 5: An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million. The top images are the rendered frames from the main camera. The bottom images are rendered based on the setting of the reference camera, which aims at the instances that are far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras. The yellow dots in (b) represent the instances outside the view frustum of the main camera.

crowd contains 30,000 instances spread out in the virtual scene with predefined movements and moving trajectories. Figure 5 shows a rendered frame of the crowd with the value of N set to 1, 6, and 10 million, respectively. The main camera moves on the walkthrough path. The reference camera is aimed at the instances far away from the main camera and shows a close-up view of those far-away instances. Our LOD-based instancing method ensures that the total number of selected vertices and triangles stays within the specified memory budget while preserving the fine details of instances that are closer to the main camera. Although the far-away instances are simplified significantly because of their long distances to the main camera, their appearance in the main camera does not cause a loss of visual fidelity. Figure 5(a) shows the visual appearance of the crowd rendered from the viewpoint of the main camera (top images), in which far-away instances are rendered using the simplified versions (bottom images).

If all instances are rendered at the full level of detail, the total number of triangles would be 169.07 million. Through the simulation of the walkthrough camera path, we had an average of 15,771 instances inside the view frustum. The maximum and minimum numbers of instances inside the view frustum are 29,967 and 5,038, respectively. We specified different values for N. Table 2 shows the performance breakdowns with regard to the runtime processing components in our system. In the table, the "# of Rendered Triangles" column includes the minimum, maximum, and averaged numbers of triangles selected during the runtime. As we can see, the higher the value of N is, the more triangles are selected to generate simplified instances, and subsequently the better the visual quality obtained for the crowd. Our approach is memory efficient. Even when N is set to a large value such as 20 million, the crowd needs only 26.23 million triangles on average, which is only 15.51% of the original number of triangles. When the value of N is small, the difference between the averaged and the maximum numbers of triangles is significant. For example, when N is equal to 5 million, the difference is at a ratio (average/maximum) of 73.96%. This indicates that the number of triangles in the crowd varies significantly according to the change of instance-camera relationships (including the instances' distances to the camera and their visibility). This is because a small


Table 2: Performance breakdowns for the system with a precreated camera path with a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N (million)   FPS     # of Rendered Triangles (million)   Component Execution Times (millisecond)
                      Min      Max      Avg               View-Frustum Culling +   LOD Mesh     Animating Meshes +
                                                          LOD Selection            Generation   Rendering
1             47.91   1.88     9.70     5.20              0.68                     2.87         26.06
5             43.30   6.12     10.14    7.50              0.65                     3.87         26.44
10            32.82   11.64    13.78    13.04             0.65                     6.27         29.40
15            23.00   17.88    20.73    19.54             0.71                     9.45         36.43
20            18.71   23.13    27.73    26.23             0.68                     12.52        42.85
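The memory-efficiency ratios discussed in this section follow directly from the Table 2 values; a quick arithmetic check (a standalone sketch, with the values copied from the table):

```python
# Values from Table 2: rendered triangles in millions.
total_full = 169.07           # all instances at full detail
avg_n20 = 26.23               # averaged rendered triangles at N = 20 million
avg_n5, max_n5 = 7.50, 10.14  # averaged and maximum at N = 5 million

share_of_full = avg_n20 / total_full  # fraction of the original triangle count
avg_to_max = avg_n5 / max_n5          # average/maximum ratio at N = 5 million

print(f"{share_of_full:.2%}")  # 15.51%
print(f"{avg_to_max:.2%}")     # 73.96%
```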

[Figure 6: line chart; x-axis "N (million)" from 0 to 20, y-axis "Frames per second (FPS)" from 20 to 45]

Figure 6: The change of FPS over different values of N using our approach. The FPS is averaged over a total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at an instance level. Thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at a triangle level; its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small. This is because the change in the number of triangles over the frames of the camera path is small: when N is small, many close-up instances drop down to the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger. This is because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.
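The per-instance View-Frustum Culling step analyzed above can be illustrated with a standard bounding-sphere-versus-plane test. This sketch is a generic CPU formulation for clarity, not the paper's GPU kernel; it assumes each instance is bounded by a sphere and the frustum is given as inward-facing planes (a, b, c, d) with unit normals:

```python
def instance_visible(center, radius, planes):
    """Return True if the bounding sphere intersects the frustum.

    Each plane is (a, b, c, d) with the normal (a, b, c) pointing
    into the frustum; the sphere is culled if it lies entirely on
    the negative side of any plane.
    """
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False
    return True

# A cube-shaped "frustum" spanning [-10, 10] on each axis, standing in
# for a real perspective frustum.
planes = [(1, 0, 0, 10), (-1, 0, 0, 10),
          (0, 1, 0, 10), (0, -1, 0, 10),
          (0, 0, 1, 10), (0, 0, -1, 10)]
visible = [instance_visible(c, 1.0, planes)
           for c in [(0, 0, 0), (15, 0, 0), (10.5, 0, 0)]]
```

Because each instance is tested independently, the test maps naturally onto one GPU thread per instance, which is why the component's execution time is insensitive to the value of N.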

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support View-Frustum Culling. For a fair comparison, in our approach we ensured that all instances were inside the view frustum of the camera by setting a fixed position for the camera and fixed positions for all instances, so that all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sampled points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different values of N (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.
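The data sharing at the heart of the pseudo-instancing technique, and the reason it precludes per-instance LOD, can be sketched as follows. This is a schematic illustration of the memory layout, not the actual GLSL technique of [44]:

```python
class SharedMesh:
    """One copy of vertex/triangle data shared by every instance."""
    def __init__(self, vertices, triangles):
        self.vertices = vertices
        self.triangles = triangles

class Instance:
    """Per-instance data is only a transform (plus animation state)."""
    def __init__(self, mesh, transform):
        self.mesh = mesh          # a reference, not a copy
        self.transform = transform

# Counts borrowed from the "Alien" character in Table 1.
mesh = SharedMesh(vertices=[(0.0, 0.0, 0.0)] * 2846,
                  triangles=[(0, 1, 2)] * 5688)
crowd = [Instance(mesh, (i * 2.0, 0.0, 0.0)) for i in range(1000)]

# Every instance references the same geometry: no duplication in memory,
# but simplifying that geometry would simplify all instances at once,
# so per-instance LOD is not possible.
shared = all(inst.mesh is mesh for inst in crowd)
```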


[Figure 7: line chart; x-axis "Number of Instances (thousand)" from 8 to 30, y-axis "Frames per second (FPS)" from 10 to 60; four series: Ours (N = 5 million), Ours (N = 10 million), Point-based, Pseudo-instancing]

Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over a total of 1,000 frames.

Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result. (d), (e), and (f) are the captured images zoomed in on the area of instances which are far away from the camera.


The image generated from the pseudo-instancing technique represents the original quality. Our approach can achieve better visual quality than the point-based technique. As we can see in the top images of the figure, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique. This is because the technique uses a limited number of detail levels, inherited from the technique of discrete LOD. Our approach avoids the popping artifact since continuous LOD representations of the instances are applied during the rendering.
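The difference between discrete and continuous LOD that causes the popping can be seen in a small sketch. The thresholds and the decay function below are illustrative assumptions, not the paper's parameters; the point is that a discrete scheme jumps between a few fixed triangle counts while a continuous scheme changes them smoothly with distance:

```python
def discrete_lod(distance, full_count):
    """A few fixed levels: a small distance change can jump a level."""
    if distance < 20:
        return full_count
    if distance < 100:
        return full_count // 4
    return full_count // 16

def continuous_lod(distance, full_count, min_count=64):
    """Triangle count decays smoothly with distance: no popping."""
    scale = 20.0 / max(distance, 20.0)  # 1.0 up close, shrinking with distance
    return max(int(full_count * scale), min_count)

full = 5688  # triangle count of the "Alien" character from Table 1

# Crossing the 100-unit threshold: the discrete scheme jumps by hundreds
# of triangles in one frame, while the continuous scheme barely moves.
jump = discrete_lod(99.9, full) - discrete_lod(100.1, full)
drift = continuous_lod(99.9, full) - continuous_lod(100.1, full)
```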

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and View-Frustum Culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we used only a single walk animation in the crowd rendering system. In a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We would also like to explore possibilities of transplanting our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and additional computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce the experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The value of N is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video explains the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, Part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: Optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1-3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maïm, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: An architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. DeGyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: A real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



10 International Journal of Computer Games Technology

Main Camera

Reference Camera

N = 1 million N = 6 million N = 10 million(a)

(b)

Figure 5 An example of the rendering result produced by our systemusing different119873 values (a) shows the captured imageswith119873 = 1 6 10million The top images are the rendered frame from the main cameraThe bottom images are rendered based on the setting of the referencecamera which aims at the instances that are far away from the main camera (b) shows the entire crowd including the view frustums of thetwo cameras The yellow dots in (b) represent the instances outside the view frustum of the main camera

crowd contains 30000 instances spread out in the virtualscene with predefined movements and moving trajectoriesFigure 5 shows a rendered frame of the crowd with the valueof119873 set to 1 6 and 10 million respectively Themain cameramoves on the walkthrough path The reference camera isaimed at the instances far away from the main camera andshows a close-up view of those far-away instances OurLOD-based instancing method ensures the total number ofselected vertices and triangles is within the specified memorybudget while preserving the fine details of instances that arecloser to the main camera Although the far-away instancesare simplified significantly because the long distances tothe main camera their appearance in the main camera donot cause a loss of visual fidelity Figure 5(a) shows visualappearance of the crowd rendered from the viewpoint of themain camera (top images) in which far-away instances arerendered using the simplified versions (bottom images)

If all instances are rendered at the level of full detail thetotal number of triangles would be 16907 million Throughthe simulation of the walkthrough camera path we had anaverage of 15 771 instances inside the view frustum The

maximum and minimum numbers of instances inside theview frustum are 29967 and 5038 respectively We specifieddifferent values for 119873 Table 2 shows the performance break-downs with regard to the runtime processing componentsin our system In the table the ldquo of Rendered Trianglesrdquocolumn includes the minimum maximum and averagednumber of triangles selected during the runtime As we cansee the higher the value of 119873 is the more the triangles areselected to generate simplified instances and subsequentlythe better visual quality is obtained for the crowd Ourapproach is memory efficient Even when 119873 is set to a largevalue such as 20 million the crowd needs only 2623 milliontriangles in average which is only 1551 of the originalnumber of triangles When the value of 119873 is small thedifference between the averaged and the maximum numberof triangles is significant For example when 119873 is equal to5 million the difference is at a ratio (119886V119890119903119886119892119890119898119886119909119894119898119906119898)of 7396 This indicates that the number of triangles inthe crowd varies significantly according to the change ofinstance-camera relationships (including instancesrsquo distanceto the camera and their visibility) This is because a small

International Journal of Computer Games Technology 11

Table 2. Performance breakdowns for the system with a precreated camera path and a total of 30,000 instances. The FPS value and the component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N          FPS     # of Rendered Triangles (million)   Component Execution Times (ms)
(million)           Min      Max      Avg              View-Frustum Culling   LOD Mesh     Animating Meshes
                                                       + LOD Selection        Generation   + Rendering
1          47.91    1.88     9.70     5.20             0.68                   2.87         26.06
5          43.30    6.12     10.14    7.50             0.65                   3.87         26.44
10         32.82    11.64    13.78    13.04            0.65                   6.27         29.40
15         23.00    17.88    20.73    19.54            0.71                   9.45         36.43
20         18.71    23.13    27.73    26.23            0.68                   12.52        42.85
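The percentages quoted in the surrounding discussion can be recomputed directly from the triangle counts in Table 2:

```python
# Values copied from Table 2 (millions of triangles).
avg_n5, max_n5 = 7.50, 10.14        # N = 5 million row
avg_n20, original = 26.23, 169.07   # N = 20 million row; full-detail total

ratio = avg_n5 / max_n5     # average/maximum ratio cited as 73.96%
share = avg_n20 / original  # fraction of the original count cited as 15.51%
print(f"{ratio:.2%}, {share:.2%}")  # → 73.96%, 15.51%
```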

[Figure 6 plot: frames per second (y-axis, 20–45) versus N in millions (x-axis, 0–20).]

Figure 6. The change of FPS over different values of N using our approach. The FPS is averaged over a total of 1,000 frames.

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As shown in the table, when the value of N becomes larger than 10 million, the ratio increases to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at the instance level; thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at the triangle level, so its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution time increases as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As the figure shows, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of FPS is small, because the change in the number of triangles over the frames of the camera path is small: with a small N, many close-up instances drop down to the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of FPS becomes larger, because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.
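The two parallel granularities described here can be mimicked sequentially in a sketch; this is a hypothetical illustration with invented names, where on the GPU each loop body would instead run as one thread:

```python
def cull_and_select(instances, budget_per_instance):
    """Instance-level pass: one unit of work per instance decides its
    visibility and its LOD triangle count."""
    selected = []
    for inst in instances:                       # one GPU thread per instance
        if inst["visible"]:
            count = min(inst["full_triangles"], budget_per_instance)
            selected.append((inst["id"], count))
    return selected

def generate_lod_meshes(selected, triangles):
    """Triangle-level pass: one unit of work per selected triangle
    copies it into the simplified mesh buffer."""
    out = []
    for inst_id, count in selected:
        for tri in triangles[inst_id][:count]:   # one GPU thread per triangle
            out.append(tri)
    return out
```

This split explains the timings in Table 2: the instance-level pass does a fixed amount of work per instance regardless of N, while the triangle-level pass scales with the number of selected triangles.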

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry by using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support view-frustum culling, so for a fair comparison we ensured that all instances were inside the view frustum of the camera by fixing the position of the camera and the positions of all instances; all instances are therefore processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sampled points is used to represent the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach achieves better performance than the pseudo-instancing technique, because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach also becomes better than that of the point-based technique as the value of N increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.
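The distance-based switch applied by the point-based technique in this comparison can be sketched as follows; the threshold, point budget, and function names are assumptions made for illustration, not parameters reported by the cited works.

```python
import random

def choose_representation(vertices, distance,
                          near_threshold=30.0, point_budget=200):
    """Use the full mesh for near instances; approximate far instances
    with a sparse subset of their vertices rendered as points."""
    if distance <= near_threshold:
        return "mesh", vertices
    k = min(point_budget, len(vertices))
    return "points", random.sample(vertices, k)

verts = [(i, i, i) for i in range(1000)]
kind_near, _ = choose_representation(verts, 5.0)     # "mesh"
kind_far, pts = choose_representation(verts, 120.0)  # "points", 200 samples
```

The sparse sampling is also what produces the "holes" discussed below: once the retained points no longer cover the surface densely, gaps between them become visible.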


[Figure 7 plot: frames per second (y-axis, 10–60) versus the number of instances in thousands (x-axis, 8–30), with four series: Ours (N = 5 million), Ours (N = 10 million), Point-based, and Pseudo-instancing.]

Figure 7. The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over a total of 1,000 frames.

[Figure 8: six captured images, panels (a)–(f).]

Figure 8. The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of our approach's rendering result with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result; (d), (e), and (f) are the captured images zoomed in on the area of instances far away from the camera.


The image generated by the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As the top images of the figure show, the instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because it uses a limited number of detail levels derived from discrete LOD. Our approach avoids the popping artifact, since continuous LOD representations of the instances are applied during rendering.
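The popping contrast can be made concrete with a toy comparison of discrete and continuous LOD; the level boundaries and triangle counts below are invented for illustration only.

```python
def discrete_lod(distance):
    """A few fixed detail levels: the triangle count jumps abruptly
    when an instance crosses a level boundary (the popping artifact)."""
    if distance < 10.0:
        return 10000
    if distance < 30.0:
        return 3000
    return 500

def continuous_lod(distance, full=10000, floor=500, far=60.0):
    """Triangle count varies smoothly with distance, so neighboring
    frames select nearly identical detail and no pop is visible."""
    t = min(distance / far, 1.0)
    return int(full - (full - floor) * t)

jump = discrete_lod(29.9) - discrete_lod(30.1)       # 2500 triangles at once
drift = continuous_lod(29.9) - continuous_lod(30.1)  # only ~31 triangles
```

Crossing the same distance boundary changes the discrete mesh by thousands of triangles in a single frame, while the continuous version changes by a few dozen, which is why it appears smooth.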

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets, and we organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and view-frustum culling techniques, and our system maintains visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles during the runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we used only a single walk animation in the crowd rendering system; in a real game application, more animation types should be added, and a motion graph should be created in order to make animations transition smoothly from one to another. We would also like to explore possibilities for porting our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and additional computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted; the runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video illustrates the visual effect of the LOD algorithm. It uses a straight-line moving path that allows the main camera to move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more details as the main camera moves towards the location of the reference camera. At the same time, more instances move outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, Part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing - ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, HyCelTec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. CzuryGo, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12-14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159-160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling - Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6-8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.



[Figure 5 panels: main camera and reference camera views for N = 1, 6, and 10 million.]

Figure 5. An example of the rendering result produced by our system using different N values. (a) shows the captured images with N = 1, 6, and 10 million; the top images are the rendered frame from the main camera, and the bottom images are rendered based on the setting of the reference camera, which aims at the instances far away from the main camera. (b) shows the entire crowd, including the view frustums of the two cameras; the yellow dots in (b) represent the instances outside the view frustum of the main camera.


References

[1] S Lemercier A Jelic R Kulpa et al ldquoRealistic followingbehaviors for crowd simulationrdquoComputerGraphics Forum vol31 no 2 Part 2 pp 489ndash498 2012

[2] A Golas R Narain S Curtis and M C Lin ldquoHybrid long-range collision avoidancefor crowd simulationrdquo IEEE Transac-tions on Visualization and Computer Graphics vol 20 no 7 pp1022ndash1034 2014

[3] G Payet ldquoDeveloping a massive real-time crowd simulationframework on the gpurdquo 2016

14 International Journal of Computer Games Technology

[4] I Karamouzas N Sohre R Narain and S J Guy ldquoImplicitcrowds Optimization integrator for robust crowd simulationrdquoACM Transactions on Graphics vol 36 no 4 pp 1ndash13 2017

[5] J R Shewchuk ldquoDelaunay refinement algorithms for triangularmesh generationrdquo Computational Geometry Theory and Appli-cations vol 22 no 1-3 pp 21ndash74 2002

[6] J Pettre P D H Ciechomski J Maim B Yersin J-P LaumondandDThalmann ldquoReal-time navigating crowds scalable simu-lation and renderingrdquoComputer Animation andVirtualWorldsvol 17 no 3-4 pp 445ndash455 2006

[7] J Maım B Yersin J Pettre and D Thalmann ldquoYaQ Anarchitecture for real-time navigation and rendering of variedcrowdsrdquo IEEE Computer Graphics and Applications vol 29 no4 pp 44ndash53 2009

[8] C Peng S I Park Y Cao and J Tian ldquoA real-time systemfor crowd rendering parallel LOD and texture-preservingapproach on GPUrdquo in Proceedings of the International Confer-ence onMotion inGames vol 7060 ofLectureNotes in ComputerScience pp 27ndash38 Springer 2011

[9] A Beacco N Pelechano and C Andujar ldquoA survey of real-timecrowd renderingrdquo Computer Graphics Forum vol 35 no 8 pp32ndash50 2016

[10] R McDonnell M Larkin S Dobbyn S Collins and COrsquoSullivan ldquoClone attack Perception of crowd varietyrdquo ACMTransactions on Graphics vol 27 no 3 p 1 2008

[11] Y P Du Sel N Chaverou and M Rouille ldquoMotion retargetingfor crowd simulationrdquo in Proceedings of the 2015 Symposium onDigital Production DigiPro rsquo15 pp 9ndash14 ACM New York NYUSA August 2015

[12] F Multon L France M-P Cani-Gascuel and G DebunneldquoComputer animation of human walking a surveyrdquo Journal ofVisualization and Computer Animation vol 10 no 1 pp 39ndash541999

[13] S Guo R Southern J Chang D Greer and J J ZhangldquoAdaptive motion synthesis for virtual characters a surveyrdquoTheVisual Computer vol 31 no 5 pp 497ndash512 2015

[14] E Millan and I Rudomn ldquoImpostors pseudo-instancing andimagemaps for gpu crowd renderingrdquoThe International Journalof Virtual Reality vol 6 no 1 pp 35ndash44 2007

[15] H Nguyen ldquoChapter 2 Animated crowd renderingrdquo in GpuGems 3 Addison-Wesley Professional 2007

[16] H Park and J Han ldquoFast rendering of large crowds using GPUrdquoin Entertainment Computing - ICEC 2008 S M Stevens andS J Saldamarco Eds vol 5309 of Lecture Notes in ComputerScience pp 197ndash202 Springer Berlin Germany 2009

[17] K-L Ma ldquoIn situ visualization at extreme scale challenges andopportunitiesrdquo IEEE Computer Graphics and Applications vol29 no 6 pp 14ndash19 2009

[18] T Sasabe S Tsushima and S Hirai ldquoIn-situ visualization ofliquid water in an operating PEMFCby soft X-ray radiographyrdquoInternational Journal of Hydrogen Energy vol 35 no 20 pp11119ndash11128 2010 Hyceltec 2009 Conference

[19] H Yu C Wang R W Grout J H Chen and K-L Ma ldquoInsitu visualization for large-scale combustion simulationsrdquo IEEEComputer Graphics and Applications vol 30 no 3 pp 45ndash572010

[20] BHernandezH Perez I Rudomin S Ruiz ODeGyves andLToledo ldquoSimulating and visualizing real-time crowds on GPUclustersrdquo Computacion y Sistemas vol 18 no 4 pp 651ndash6642015

[21] H Perez I Rudomin E A Ayguade B A Hernandez JA Espinosa-Oviedo and G Vargas-Solar ldquoCrowd simulationand visualizationrdquo in Proceedings of the 4th BSC Severo OchoaDoctoral Symposium Poster May 2017

[22] A Treuille S Cooper and Z Popovic ldquoContinuum crowdsrdquoACM Transactions on Graphics vol 25 no 3 pp 1160ndash11682006

[23] R Narain A Golas S Curtis and M C Lin ldquoAggregatedynamics for dense crowd simulationrdquo in ACM SIGGRAPHAsia 2009 Papers SIGGRAPH Asia rsquo09 p 1 Yokohama JapanDecember 2009

[24] X Jin J Xu C C L Wang S Huang and J Zhang ldquoInteractivecontrol of large-crowd navigation in virtual environments usingvector fieldsrdquo IEEEComputer Graphics andApplications vol 28no 6 pp 37ndash46 2008

[25] S Patil J Van Den Berg S Curtis M C Lin and D ManochaldquoDirecting crowd simulations using navigation fieldsrdquo IEEETransactions on Visualization and Computer Graphics vol 17no 2 pp 244ndash254 2011

[26] E Ju M G Choi M Park J Lee K H Lee and S TakahashildquoMorphable crowdsrdquoACMTransactions onGraphics vol 29 no6 pp 1ndash140 2010

[27] S I Park C Peng F Quek and Y Cao ldquoA crowd model-ing framework for socially plausible animation behaviorsrdquo inMotion in Games M Kallmann and K Bekris Eds vol 7660 ofLecture Notes in Computer Science pp 146ndash157 Springer BerlinGermany 2012

[28] F DMcKenzie M D Petty P A Kruszewski et al ldquoIntegratingcrowd-behavior modeling into military simulation using gametechnologyrdquo Simulation Gaming vol 39 no 1 Article ID1046878107308092 pp 10ndash38 2008

[29] S Zhou D Chen W Cai et al ldquoCrowd modeling andsimulation technologiesrdquo ACM Transactions on Modeling andComputer Simulation (TOMACS) vol 20 no 4 pp 1ndash35 2010

[30] Y Zhang B Yin D Kong and Y Kuang ldquoInteractive crowdsimulation on GPUrdquo Journal of Information and ComputationalScience vol 5 no 5 pp 2341ndash2348 2008

[31] A Malinowski P Czarnul K CzuryGo M Maciejewski and PSkowron ldquoMulti-agent large-scale parallel crowd simulationrdquoProcedia Computer Science vol 108 pp 917ndash926 2017 Interna-tional Conference on Computational Science ICCS 2017 12-14June 2017 Zurich Switzerland

[32] I Cleju andD Saupe ldquoEvaluation of supra-threshold perceptualmetrics for 3D modelsrdquo in Proceedings of the 3rd Symposium onApplied Perception in Graphics and Visualization APGV rsquo06 pp41ndash44 ACM New York NY USA July 2006

[33] S Dobbyn J Hamill K OrsquoConor and C OrsquoSullivan ldquoGeopos-tors A real-time geometryimpostor crowd rendering systemrdquoinProceedings of the 2005 Symposium on Interactive 3DGraphicsand Games I3D rsquo05 pp 95ndash102 ACM New York NY USAApril 2005

[34] B Ulicny P d Ciechomski and D Thalmann ldquoCrowdbrushinteractive authoring of real-time crowd scenesrdquo in Proceedingsof the Symposium on Computer Animation R Boulic and DK Pai Eds p 243 The Eurographics Association GrenobleFrance August 2004

[35] H Hoppe ldquoEfficient implementation of progressive meshesrdquoComputers Graphics vol 22 no 1 pp 27ndash36 1998

[36] M Garland and P S Heckbert ldquoSurface simplification usingquadric error metricsrdquo in Proceedings of the 24th AnnualConference on Computer Graphics and Interactive Techniques

International Journal of Computer Games Technology 15

SIGGRAPH rsquo97 pp 209ndash216 ACMPressAddison-Wesley Pub-lishing Co New York NY USA 1997

[37] E Landreneau and S Schaefer ldquoSimplification of articulatedmeshesrdquo Computer Graphics Forum vol 28 no 2 pp 347ndash3532009

[38] A Willmott ldquoRapid simplification of multi-attribute meshesrdquoin Proceedings of the ACM SIGGRAPH Symposium on HighPerformance Graphics HPG rsquo11 p 151 ACM New York NYUSA August 2011

[39] W-W Feng B-U Kim Y Yu L Peng and J Hart ldquoFeature-preserving triangular geometry images for level-of-detail repre-sentation of static and skinned meshesrdquo ACM Transactions onGraphics (TOG) vol 29 no 2 article no 11 2010

[40] D P Savoy M C Cabral and M K Zuffo ldquoCrowd simulationrendering for webrdquo in Proceedings of the 20th InternationalConference on 3D Web Technology Web3D rsquo15 pp 159-160ACM New York NY USA June 2015

[41] F Tecchia C Loscos and Y Chrysanthou ldquoReal time ren-dering of populated urban environmentsrdquo Tech Rep ACMSIGGRAPH Technical Sketch 2001

[42] J Barczak N Tatarchuk and C Oat ldquoGPU-based scenemanagement for rendering large crowdsrdquo ACM SIGGRAPHAsia Sketches vol 2 no 2 2008

[43] B Hernandez and I Rudomin ldquoA rendering pipeline for real-time crowdsrdquo GPU Pro vol 2 pp 369ndash383 2016

[44] J Zelsnack ldquoGLSL pseudo-instancingrdquo Tech Rep 2004[45] F Carucci ldquoInside geometry instancingrdquo GPU Gems vol 2 pp

47ndash67 2005[46] G Ashraf and J Zhou ldquoHardware accelerated skin deformation

for animated crowdsrdquo in Proceedings of the 13th InternationalConference on Multimedia Modeling - Volume Part II MMM07pp 226ndash237 Springer-Verlag Berlin Germany 2006

[47] F Klein T Spieldenner K Sons and P Slusallek ldquoConfigurableinstances of 3D models for declarative 3D in the webrdquo inProceedings of the Nineteenth International ACMConference pp71ndash79 ACM New York NY USA August 2014

[48] C Peng and Y Cao ldquoParallel LOD for CAD model renderingwith effectiveGPUmemory usagerdquoComputer-AidedDesign andApplications vol 13 no 2 pp 173ndash183 2016

[49] T A Funkhouser and C H Sequin ldquoAdaptive display algo-rithm for interactive frame rates during visualization of com-plex virtual environmentsrdquo in Proceedings of the 20th AnnualConference on Computer Graphics and Interactive TechniquesSIGGRAPH rsquo93 pp 247ndash254 ACM New York NY USA 1993

[50] C Peng and Y Cao ldquoA GPU-based approach for massive modelrendering with frame-to-frame coherencerdquo Computer GraphicsForum vol 31 no 2 pp 393ndash402 2012

[51] N Bell and JHoberock ldquoThrust a productivity-oriented libraryfor CUDArdquo inGPUComputing Gems Jade Edition Applicationsof GPU Computing Series pp 359ndash371 Morgan KaufmannBoston Mass USA 2012

[52] E Millan and I Rudomn ldquoImpostors pseudo-instancing andimage maps for gpu crowd renderingrdquo Iranian Journal ofVeterinary Research vol 6 no 1 pp 35ndash44 2007

[53] L Toledo O De Gyves and I Rudomın ldquoHierarchical level ofdetail for varied animated crowdsrdquo The Visual Computer vol30 no 6-8 pp 949ndash961 2014

[54] L Lyu J Zhang and M Fan ldquoAn efficiency control methodbased on SFSM for massive crowd renderingrdquo Advances inMultimedia vol 2018 pp 1ndash10 2018

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

International Journal of Computer Games Technology 11

Table 2: Performance breakdowns for the system with a precreated camera path, with 30,000 instances in total. The FPS values and component execution times are averaged over 1,000 frames. The total number of triangles (before the use of LOD) is 169.07 million.

N (million) | FPS   | Rendered Triangles (million) Min / Max / Avg | View-Frustum Culling + LOD Selection (ms) | LOD Mesh Generation (ms) | Animating Meshes + Rendering (ms)
1           | 47.91 | 1.88 / 9.70 / 5.20                           | 0.68                                      | 2.87                     | 26.06
5           | 43.30 | 6.12 / 10.14 / 7.50                          | 0.65                                      | 3.87                     | 26.44
10          | 32.82 | 11.64 / 13.78 / 13.04                        | 0.65                                      | 6.27                     | 29.40
15          | 23.00 | 17.88 / 20.73 / 19.54                        | 0.71                                      | 9.45                     | 36.43
20          | 18.71 | 23.13 / 27.73 / 26.23                        | 0.68                                      | 12.52                    | 42.85

Figure 6: The change of FPS over different values of N using our approach. The FPS is averaged over 1,000 frames. (Plot: x-axis, N in millions from 0 to 20; y-axis, frames per second from 20 to 45.)

value of N limits the level of detail that an instance can reach. Even if an instance is close to the camera, it may not obtain a sufficient cut from N to satisfy the desired detail level. As we can see in the table, when the value of N becomes larger than 10 million, the ratio increases to 94%. The View-Frustum Culling and LOD Selection components are implemented together, and both are executed in parallel at the instance level; thus, the execution time of this component does not change as the value of N increases. The LOD Mesh Generation component is executed in parallel at the triangle level, so its execution time increases as the value of N increases. The Animating Meshes and Rendering components are executed with the acceleration of OpenGL's buffer objects and shader programs. They are time-consuming, and their execution times increase as more triangles need to be rendered. Figure 6 shows the change of FPS over different values of N. As we can see in the figure, the FPS decreases as the value of N increases. When N is smaller than 4 million, the decreasing slope of the FPS is small, because the change in the number of triangles over the frames of the camera path is small; when N is small, many close-up instances end up at the lowest level of detail due to the insufficient memory budget from N. When N increases from 4 to 17 million, the decreasing slope of the FPS becomes larger, because the number of triangles over the frames of the camera path varies considerably with different values of N. As N increases beyond 17 million, the decreasing slope becomes smaller again, as many instances, including far-away ones, reach the full level of detail.
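The budget-driven behavior described above can be sketched as follows. This is our own illustrative Python, not the paper's LOD Selection code: the distance weighting, the helper name, and the numbers are all assumptions made for illustration. It shows how a small global budget N starves even close-up instances of triangles.

```python
# Illustrative sketch (not the authors' implementation): split a global
# triangle budget n_budget among instances, weighting closer instances
# more heavily, and clamp each cut to the instance's full triangle count.
def allocate_budget(distances, full_counts, n_budget):
    """distances, full_counts: per-instance lists; n_budget: global budget."""
    weights = [1.0 / max(d, 1.0) for d in distances]  # closer -> larger weight
    total = sum(weights)
    cuts = []
    for w, full in zip(weights, full_counts):
        share = round(n_budget * w / total)   # this instance's cut from N
        cuts.append(min(share, full))         # never exceed full detail
    return cuts

# Three instances at increasing distance, each with 5,634 full-detail
# triangles (hypothetical numbers), under a budget of 6,000 triangles.
cuts = allocate_budget([10.0, 50.0, 200.0], [5634] * 3, 6000)
```

Even the nearest instance receives only 4,800 of its 5,634 triangles here, mirroring the observation that a small N prevents close-up instances from reaching their desired detail level.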

6.2. Comparison and Discussion. We analyzed two rendering techniques and compared them against our approach in terms of performance and visual quality. The pseudo-instancing technique minimizes the amount of data duplication by sharing vertices and triangles among all instances, but it does not support LOD on a per-instance level [44, 52]. The point-based technique renders a complex geometry using a dense set of sampled points in order to reduce the computational cost of rendering [53, 54]. The pseudo-instancing technique does not support view-frustum culling, so for a fair comparison we ensured that all instances were inside the view frustum of the camera by fixing the camera position and the positions of all instances; in this way, all instances are processed and rendered by our approach. The complexity of each instance rendered by the point-based technique is selected based on its distance to the camera, which is similar to our LOD method: when an instance is near the camera, the original mesh is used for rendering; when the instance is located far away from the camera, a set of sample points approximates the instance. In this comparison, the pseudo-instancing technique always renders the original meshes of the instances. We chose two different N values (N = 5 million and N = 10 million) for rendering with our approach. As shown in Figure 7, our approach results in better performance than the pseudo-instancing technique. This is because the number of triangles rendered by the pseudo-instancing technique is much larger than the number of triangles determined by the LOD Selection component of our approach. The performance of our approach also becomes better than that of the point-based technique as the number of instances increases. Figure 8 shows the comparison of visual quality among our approach, the pseudo-instancing technique, and the point-based technique.
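Instance-level view-frustum culling of the kind our system applies can be sketched as follows. This is a minimal CPU-side illustration with hypothetical plane and sphere data, not the system's GPU implementation: each instance's bounding sphere is tested against the six frustum planes, and the instance is culled when the sphere lies entirely outside any plane.

```python
# Sketch (assumption, not the paper's code): cull an instance when its
# bounding sphere is fully outside any frustum plane. Planes are given
# as (a, b, c, d) with the normal pointing into the frustum.
def sphere_in_frustum(center, radius, planes):
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False  # entirely outside this plane -> culled
    return True

# Toy frustum: the axis-aligned box [-1, 1]^3 expressed as six planes.
planes = [(1, 0, 0, 1), (-1, 0, 0, 1), (0, 1, 0, 1),
          (0, -1, 0, 1), (0, 0, 1, 1), (0, 0, -1, 1)]
visible = sphere_in_frustum((0.0, 0.0, 0.5), 0.2, planes)        # True
culled = not sphere_in_frustum((5.0, 0.0, 0.0), 0.2, planes)     # True
```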


Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over 1,000 frames. (Plot: x-axis, number of instances in thousands from 8 to 30; y-axis, frames per second from 10 to 60; series: ours with N = 5 million, ours with N = 10 million, point-based, and pseudo-instancing.)

Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) The rendering result of our approach with N = 5 million; (b) the rendering result of the pseudo-instancing technique; (c) the rendering result of the point-based technique; (d), (e), and (f) the corresponding images zoomed in on instances far away from the camera.


The image generated by the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As we can see in the top images of the figure, instances far away from the camera rendered by the point-based technique appear to have "holes" due to the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because it uses a limited number of detail levels drawn from discrete LOD. Our approach avoids the popping artifact, since continuous LOD representations of the instances are applied during rendering.
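The difference that causes the popping artifact can be illustrated with a small sketch. The thresholds, level counts, and distance model below are invented for illustration only: discrete LOD snaps the triangle count between a few fixed levels as distance changes, while a continuous cut varies it smoothly.

```python
# Illustration (hypothetical numbers, not the compared systems' code):
# discrete LOD jumps by a large triangle count at a level boundary,
# while continuous LOD changes gradually with distance.
def discrete_lod(distance, full=5634):
    levels = [full, full // 4, full // 16]   # three fixed detail levels
    if distance < 20:
        return levels[0]
    if distance < 100:
        return levels[1]
    return levels[2]

def continuous_lod(distance, full=5634):
    return max(1, int(full * min(1.0, 20.0 / distance)))

# Crossing the distance = 100 boundary pops the discrete count at once,
# while the continuous count barely moves over the same step.
jump = discrete_lod(99) - discrete_lod(101)        # 1408 - 352 = 1056
smooth = continuous_lod(99) - continuous_lod(101)  # a few dozen at most
```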

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets. We organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and view-frustum culling techniques. Our system maintains visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach makes better use of GPU memory and reaches a higher rendering frame rate.
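The "one buffer object per data type" organization summarized above can be sketched as follows. This is our own illustration of the packing idea, not the system's GPU code; the function name and data are hypothetical. Each character's data for one type (e.g., vertices) is appended to a single flat array, and per-character offsets let every instance index into its source character's slice.

```python
# Sketch (our illustration): pack one data type for all source characters
# into a single flat array with per-character start offsets, mirroring a
# single-buffer-object layout on the GPU.
def pack_buffers(char_vertices):
    """char_vertices: list of per-character vertex arrays (flat floats)."""
    packed, offsets = [], []
    for verts in char_vertices:
        offsets.append(len(packed))  # where this character's data starts
        packed.extend(verts)
    return packed, offsets

# Three hypothetical characters with 2, 1, and 3 vertex values each.
packed, offsets = pack_buffers([[0.0, 1.0], [2.0], [3.0, 4.0, 5.0]])
# offsets -> [0, 2, 3]; an instance reads its character's slice of packed
```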

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles at runtime and improve the visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we used only a single walk animation in the crowd rendering system; in a real game application, more animation types should be added, and a motion graph should be created so that animations transition smoothly from one to another. We would also like to explore possibilities for porting our approach onto a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and greater computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that was used in this work to run our approach and produce the experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The value of N is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically according to the user's view changes, since the continuous LOD algorithm is used in our approach. The second part of the video illustrates the visual effect of the LOD algorithm. It uses a straight-line moving path that moves the main camera from one corner of the crowd to the other at a constant speed. The video is played four times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which is moving along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more detail as the main camera moves towards the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, represented as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.

[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettre, P. D. H. Ciechomski, J. Maim, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maim, B. Yersin, J. Pettre, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouille, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing – ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomin, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computacion y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomin, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popovic, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. Van Den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science (ICCS 2017), 12–14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159–160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomin, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM '07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Sequin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millan and I. Rudomin, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomin, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.

[43] B Hernandez and I Rudomin ldquoA rendering pipeline for real-time crowdsrdquo GPU Pro vol 2 pp 369ndash383 2016

[44] J Zelsnack ldquoGLSL pseudo-instancingrdquo Tech Rep 2004[45] F Carucci ldquoInside geometry instancingrdquo GPU Gems vol 2 pp

47ndash67 2005[46] G Ashraf and J Zhou ldquoHardware accelerated skin deformation

for animated crowdsrdquo in Proceedings of the 13th InternationalConference on Multimedia Modeling - Volume Part II MMM07pp 226ndash237 Springer-Verlag Berlin Germany 2006

[47] F Klein T Spieldenner K Sons and P Slusallek ldquoConfigurableinstances of 3D models for declarative 3D in the webrdquo inProceedings of the Nineteenth International ACMConference pp71ndash79 ACM New York NY USA August 2014

[48] C Peng and Y Cao ldquoParallel LOD for CAD model renderingwith effectiveGPUmemory usagerdquoComputer-AidedDesign andApplications vol 13 no 2 pp 173ndash183 2016

[49] T A Funkhouser and C H Sequin ldquoAdaptive display algo-rithm for interactive frame rates during visualization of com-plex virtual environmentsrdquo in Proceedings of the 20th AnnualConference on Computer Graphics and Interactive TechniquesSIGGRAPH rsquo93 pp 247ndash254 ACM New York NY USA 1993

[50] C Peng and Y Cao ldquoA GPU-based approach for massive modelrendering with frame-to-frame coherencerdquo Computer GraphicsForum vol 31 no 2 pp 393ndash402 2012

[51] N Bell and JHoberock ldquoThrust a productivity-oriented libraryfor CUDArdquo inGPUComputing Gems Jade Edition Applicationsof GPU Computing Series pp 359ndash371 Morgan KaufmannBoston Mass USA 2012

[52] E Millan and I Rudomn ldquoImpostors pseudo-instancing andimage maps for gpu crowd renderingrdquo Iranian Journal ofVeterinary Research vol 6 no 1 pp 35ndash44 2007

[53] L Toledo O De Gyves and I Rudomın ldquoHierarchical level ofdetail for varied animated crowdsrdquo The Visual Computer vol30 no 6-8 pp 949ndash961 2014

[54] L Lyu J Zhang and M Fan ldquoAn efficiency control methodbased on SFSM for massive crowd renderingrdquo Advances inMultimedia vol 2018 pp 1ndash10 2018


12 International Journal of Computer Games Technology

[Figure 7 chart: y-axis, frames per second (FPS), 10–60; x-axis, number of instances (thousands), 8–30; series: Ours (N = 5 million), Ours (N = 10 million), Point-based, Pseudo-instancing.]

Figure 7: The performance comparison of our approach, the pseudo-instancing technique, and the point-based technique over different numbers of instances. Two values of N are chosen for our approach (N = 5 million and N = 10 million). The FPS is averaged over a total of 1,000 frames.


Figure 8: The visual quality comparison of our approach, the pseudo-instancing technique, and the point-based technique. The number of rendered instances on the screen is 20,000. (a) shows the captured image of the rendering result of our approach with N = 5 million; (b) shows the captured image of the pseudo-instancing technique's rendering result; (c) shows the captured image of the point-based technique's rendering result; (d), (e), and (f) are the captured images zoomed in on the area of instances that are far away from the camera.


The image generated with the pseudo-instancing technique represents the original quality. Our approach achieves better visual quality than the point-based technique. As the top images of the figure show, the instances far away from the camera rendered by the point-based technique appear to have "holes" caused by the disconnection between vertices. In addition, a popping artifact appears when using the point-based technique, because it relies on the limited number of detail levels produced by discrete LOD. Our approach avoids the popping artifact, since continuous LOD representations of the instances are applied during rendering.
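The contrast between discrete and continuous LOD can be sketched in a few lines. The sketch below is illustrative only (the function names, thresholds, and detail ramp are our assumptions, not the paper's implementation): discrete LOD snaps the detail budget to a few fixed levels, so a small camera move across a distance threshold swaps the mesh abruptly, which is visible as popping, while continuous LOD varies the budget smoothly with distance.

```python
# Illustrative sketch (assumed names and constants): why discrete LOD "pops"
# while continuous LOD does not. Detail is expressed as the fraction of a
# mesh's triangles kept at a given camera distance.

def continuous_lod(distance, near=10.0, far=200.0):
    """Detail fraction falls smoothly from 1.0 at `near` to 0.05 at `far`."""
    t = min(max((distance - near) / (far - near), 0.0), 1.0)
    return 1.0 - 0.95 * t  # a smooth ramp; no sudden jumps anywhere

def discrete_lod(distance, thresholds=(50.0, 100.0, 150.0),
                 levels=(1.0, 0.5, 0.25, 0.05)):
    """Detail snaps to one of a few fixed levels; crossing a threshold
    changes the mesh abruptly, which the viewer perceives as popping."""
    for threshold, level in zip(thresholds, levels):
        if distance < threshold:
            return level
    return levels[-1]

# A small camera move across the 50-unit threshold:
print(discrete_lod(49.9), discrete_lod(50.1))   # detail halves in one step
print(continuous_lod(49.9), continuous_lod(50.1))  # values stay nearly equal
```

The same move of 0.2 units changes the discrete budget by 50% but the continuous budget by roughly 0.1%, which is why continuous LOD hides the transition.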

7. Conclusion

In this work, we presented a crowd rendering system that takes advantage of GPU power for both general-purpose and graphics computations. We rebuilt the meshes of source characters based on the flattened pieces of their texture UV sets, and organized the source characters and instances into buffer objects and textures on the GPU. Our approach integrates seamlessly with continuous LOD and view-frustum culling techniques. Our system maintains the visual appearance by assigning each instance an appropriate level of detail. We evaluated our approach with a crowd composed of 30,000 instances and achieved real-time performance. In comparison with existing crowd rendering techniques, our approach better utilizes GPU memory and reaches a higher rendering frame rate.
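The view-frustum culling step mentioned above can be illustrated with a minimal CPU sketch (assumed names and data layout; the paper's system performs this on the GPU). A frustum is represented as six planes (nx, ny, nz, d) with inward-pointing normals, and an instance's bounding sphere is rejected only when it lies fully outside some plane.

```python
# A minimal sphere-vs-frustum culling sketch (illustrative only).
# A point p is inside a plane (n, d) when dot(n, p) + d >= 0.

def sphere_in_frustum(center, radius, planes):
    """Return False only when the sphere is entirely outside some plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False  # fully behind this plane: cull the instance
    return True

# An axis-aligned box "frustum" around the origin, for demonstration:
planes = [
    ( 1, 0, 0, 100), (-1, 0, 0, 100),   # x in [-100, 100]
    ( 0, 1, 0, 100), ( 0, -1, 0, 100),  # y in [-100, 100]
    ( 0, 0, 1, 100), ( 0, 0, -1, 100),  # z in [-100, 100]
]

# (center, bounding-sphere radius) per instance:
instances = [((0, 0, 0), 1.0), ((150, 0, 0), 1.0), ((101, 0, 0), 2.0)]
visible = [i for i, (c, r) in enumerate(instances)
           if sphere_in_frustum(c, r, planes)]
print(visible)  # the second instance is culled; the third still overlaps
```

Only instances that survive this test need to receive a level-of-detail assignment and be submitted for rendering, which is what lets culling and LOD compose cleanly.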

In the future, we would like to integrate our approach with occlusion culling techniques to further reduce the number of vertices and triangles at runtime and to improve visual quality. We also plan to integrate our crowd rendering system with a simulation in a real game application. Currently, we use only a single walk animation in the crowd rendering system; in a real game application, more animation types should be added, and a motion graph should be created so that animations transition smoothly from one to another. We would also like to explore possibilities for porting our approach to a multi-GPU platform, where a more complex crowd could be rendered in real time with the support of the higher memory capacity and additional computational power provided by multiple GPUs.

Data Availability

The source character assets used in our experiment were purchased from cgtrader.com. The source code, including GPU and shader programs, was developed in our research lab by the authors of this paper. The source code has been archived in our lab and is available for distribution upon request. The supplementary video (available here) submitted together with the manuscript shows our experimental results of real-time crowd rendering.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Science Foundation Grant CNS-1464323. We thank Nvidia for donating the GPU device that has been used in this work to run our approach and produce experimental results. The source characters were purchased from cgtrader.com with the buyer's license that allows us to disseminate the rendered and moving images of those characters through the software prototypes developed in this work.

Supplementary Materials

We submitted a demo video of our crowd rendering system as a supplementary material. The video consists of two parts. The first part is a recorded video clip showing the real-time rendering result as the camera moves along the predefined walkthrough path in the virtual scene. The bottom view corresponds to the user's view, and the top view is a reference view showing the changes of the view frustum of the user's view (red lines). In the reference view, the yellow dots represent the character instances outside the view frustum of the user's view. At the top-left corner of the screen, the runtime performance and data information are plotted. The runtime performance is measured in frames per second (FPS). The N value is set to 10 million, the total number of instances is set to 30,000, and the total number of original triangles of all instances is 169.07 million. The number of rendered triangles changes dynamically with the user's view, since the continuous LOD algorithm is used in our approach. The second part of the video illustrates the visual effect of the LOD algorithm. It uses a straight-line moving path that lets the main camera move from one corner of the crowd to another at a constant speed. The video is played 4 times faster than the original FPS of the video clip. The top-right view shows the entire scene of the crowd. The red lines represent the view frustum of the main camera, which moves along the straight-line path. The blue lines represent the view frustum of the reference camera, which is fixed and aimed at the instances far away from the main camera. As we can see in the view of the reference camera (top-left corner), the simplified far-away instances gain more detail as the main camera moves toward the location of the reference camera. At the same time, more instances fall outside the view frustum of the main camera, shown as yellow dots. (Supplementary Materials)

References

[1] S. Lemercier, A. Jelic, R. Kulpa et al., "Realistic following behaviors for crowd simulation," Computer Graphics Forum, vol. 31, no. 2, part 2, pp. 489–498, 2012.

[2] A. Golas, R. Narain, S. Curtis, and M. C. Lin, "Hybrid long-range collision avoidance for crowd simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 7, pp. 1022–1034, 2014.

[3] G. Payet, "Developing a massive real-time crowd simulation framework on the GPU," 2016.


[4] I. Karamouzas, N. Sohre, R. Narain, and S. J. Guy, "Implicit crowds: optimization integrator for robust crowd simulation," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–13, 2017.

[5] J. R. Shewchuk, "Delaunay refinement algorithms for triangular mesh generation," Computational Geometry: Theory and Applications, vol. 22, no. 1–3, pp. 21–74, 2002.

[6] J. Pettré, P. D. H. Ciechomski, J. Maïm, B. Yersin, J.-P. Laumond, and D. Thalmann, "Real-time navigating crowds: scalable simulation and rendering," Computer Animation and Virtual Worlds, vol. 17, no. 3-4, pp. 445–455, 2006.

[7] J. Maïm, B. Yersin, J. Pettré, and D. Thalmann, "YaQ: an architecture for real-time navigation and rendering of varied crowds," IEEE Computer Graphics and Applications, vol. 29, no. 4, pp. 44–53, 2009.

[8] C. Peng, S. I. Park, Y. Cao, and J. Tian, "A real-time system for crowd rendering: parallel LOD and texture-preserving approach on GPU," in Proceedings of the International Conference on Motion in Games, vol. 7060 of Lecture Notes in Computer Science, pp. 27–38, Springer, 2011.

[9] A. Beacco, N. Pelechano, and C. Andujar, "A survey of real-time crowd rendering," Computer Graphics Forum, vol. 35, no. 8, pp. 32–50, 2016.

[10] R. McDonnell, M. Larkin, S. Dobbyn, S. Collins, and C. O'Sullivan, "Clone attack! Perception of crowd variety," ACM Transactions on Graphics, vol. 27, no. 3, p. 1, 2008.

[11] Y. P. Du Sel, N. Chaverou, and M. Rouillé, "Motion retargeting for crowd simulation," in Proceedings of the 2015 Symposium on Digital Production, DigiPro '15, pp. 9–14, ACM, New York, NY, USA, August 2015.

[12] F. Multon, L. France, M.-P. Cani-Gascuel, and G. Debunne, "Computer animation of human walking: a survey," Journal of Visualization and Computer Animation, vol. 10, no. 1, pp. 39–54, 1999.

[13] S. Guo, R. Southern, J. Chang, D. Greer, and J. J. Zhang, "Adaptive motion synthesis for virtual characters: a survey," The Visual Computer, vol. 31, no. 5, pp. 497–512, 2015.

[14] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[15] H. Nguyen, "Chapter 2: Animated crowd rendering," in GPU Gems 3, Addison-Wesley Professional, 2007.

[16] H. Park and J. Han, "Fast rendering of large crowds using GPU," in Entertainment Computing – ICEC 2008, S. M. Stevens and S. J. Saldamarco, Eds., vol. 5309 of Lecture Notes in Computer Science, pp. 197–202, Springer, Berlin, Germany, 2009.

[17] K.-L. Ma, "In situ visualization at extreme scale: challenges and opportunities," IEEE Computer Graphics and Applications, vol. 29, no. 6, pp. 14–19, 2009.

[18] T. Sasabe, S. Tsushima, and S. Hirai, "In-situ visualization of liquid water in an operating PEMFC by soft X-ray radiography," International Journal of Hydrogen Energy, vol. 35, no. 20, pp. 11119–11128, 2010, Hyceltec 2009 Conference.

[19] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K.-L. Ma, "In situ visualization for large-scale combustion simulations," IEEE Computer Graphics and Applications, vol. 30, no. 3, pp. 45–57, 2010.

[20] B. Hernandez, H. Perez, I. Rudomín, S. Ruiz, O. De Gyves, and L. Toledo, "Simulating and visualizing real-time crowds on GPU clusters," Computación y Sistemas, vol. 18, no. 4, pp. 651–664, 2015.

[21] H. Perez, I. Rudomín, E. A. Ayguade, B. A. Hernandez, J. A. Espinosa-Oviedo, and G. Vargas-Solar, "Crowd simulation and visualization," in Proceedings of the 4th BSC Severo Ochoa Doctoral Symposium, Poster, May 2017.

[22] A. Treuille, S. Cooper, and Z. Popović, "Continuum crowds," ACM Transactions on Graphics, vol. 25, no. 3, pp. 1160–1168, 2006.

[23] R. Narain, A. Golas, S. Curtis, and M. C. Lin, "Aggregate dynamics for dense crowd simulation," in ACM SIGGRAPH Asia 2009 Papers, SIGGRAPH Asia '09, p. 1, Yokohama, Japan, December 2009.

[24] X. Jin, J. Xu, C. C. L. Wang, S. Huang, and J. Zhang, "Interactive control of large-crowd navigation in virtual environments using vector fields," IEEE Computer Graphics and Applications, vol. 28, no. 6, pp. 37–46, 2008.

[25] S. Patil, J. van den Berg, S. Curtis, M. C. Lin, and D. Manocha, "Directing crowd simulations using navigation fields," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 2, pp. 244–254, 2011.

[26] E. Ju, M. G. Choi, M. Park, J. Lee, K. H. Lee, and S. Takahashi, "Morphable crowds," ACM Transactions on Graphics, vol. 29, no. 6, pp. 1–140, 2010.

[27] S. I. Park, C. Peng, F. Quek, and Y. Cao, "A crowd modeling framework for socially plausible animation behaviors," in Motion in Games, M. Kallmann and K. Bekris, Eds., vol. 7660 of Lecture Notes in Computer Science, pp. 146–157, Springer, Berlin, Germany, 2012.

[28] F. D. McKenzie, M. D. Petty, P. A. Kruszewski et al., "Integrating crowd-behavior modeling into military simulation using game technology," Simulation & Gaming, vol. 39, no. 1, Article ID 1046878107308092, pp. 10–38, 2008.

[29] S. Zhou, D. Chen, W. Cai et al., "Crowd modeling and simulation technologies," ACM Transactions on Modeling and Computer Simulation (TOMACS), vol. 20, no. 4, pp. 1–35, 2010.

[30] Y. Zhang, B. Yin, D. Kong, and Y. Kuang, "Interactive crowd simulation on GPU," Journal of Information and Computational Science, vol. 5, no. 5, pp. 2341–2348, 2008.

[31] A. Malinowski, P. Czarnul, K. Czuryło, M. Maciejewski, and P. Skowron, "Multi-agent large-scale parallel crowd simulation," Procedia Computer Science, vol. 108, pp. 917–926, 2017, International Conference on Computational Science, ICCS 2017, 12–14 June 2017, Zurich, Switzerland.

[32] I. Cleju and D. Saupe, "Evaluation of supra-threshold perceptual metrics for 3D models," in Proceedings of the 3rd Symposium on Applied Perception in Graphics and Visualization, APGV '06, pp. 41–44, ACM, New York, NY, USA, July 2006.

[33] S. Dobbyn, J. Hamill, K. O'Conor, and C. O'Sullivan, "Geopostors: a real-time geometry/impostor crowd rendering system," in Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games, I3D '05, pp. 95–102, ACM, New York, NY, USA, April 2005.

[34] B. Ulicny, P. d. Ciechomski, and D. Thalmann, "Crowdbrush: interactive authoring of real-time crowd scenes," in Proceedings of the Symposium on Computer Animation, R. Boulic and D. K. Pai, Eds., p. 243, The Eurographics Association, Grenoble, France, August 2004.

[35] H. Hoppe, "Efficient implementation of progressive meshes," Computers & Graphics, vol. 22, no. 1, pp. 27–36, 1998.

[36] M. Garland and P. S. Heckbert, "Surface simplification using quadric error metrics," in Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159–160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomín, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Séquin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.




[32] I Cleju andD Saupe ldquoEvaluation of supra-threshold perceptualmetrics for 3D modelsrdquo in Proceedings of the 3rd Symposium onApplied Perception in Graphics and Visualization APGV rsquo06 pp41ndash44 ACM New York NY USA July 2006

[33] S Dobbyn J Hamill K OrsquoConor and C OrsquoSullivan ldquoGeopos-tors A real-time geometryimpostor crowd rendering systemrdquoinProceedings of the 2005 Symposium on Interactive 3DGraphicsand Games I3D rsquo05 pp 95ndash102 ACM New York NY USAApril 2005

[34] B Ulicny P d Ciechomski and D Thalmann ldquoCrowdbrushinteractive authoring of real-time crowd scenesrdquo in Proceedingsof the Symposium on Computer Animation R Boulic and DK Pai Eds p 243 The Eurographics Association GrenobleFrance August 2004

[35] H Hoppe ldquoEfficient implementation of progressive meshesrdquoComputers Graphics vol 22 no 1 pp 27ndash36 1998

[36] M Garland and P S Heckbert ldquoSurface simplification usingquadric error metricsrdquo in Proceedings of the 24th AnnualConference on Computer Graphics and Interactive Techniques

International Journal of Computer Games Technology 15

SIGGRAPH '97, pp. 209–216, ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 1997.

[37] E. Landreneau and S. Schaefer, "Simplification of articulated meshes," Computer Graphics Forum, vol. 28, no. 2, pp. 347–353, 2009.

[38] A. Willmott, "Rapid simplification of multi-attribute meshes," in Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics, HPG '11, p. 151, ACM, New York, NY, USA, August 2011.

[39] W.-W. Feng, B.-U. Kim, Y. Yu, L. Peng, and J. Hart, "Feature-preserving triangular geometry images for level-of-detail representation of static and skinned meshes," ACM Transactions on Graphics (TOG), vol. 29, no. 2, article no. 11, 2010.

[40] D. P. Savoy, M. C. Cabral, and M. K. Zuffo, "Crowd simulation rendering for web," in Proceedings of the 20th International Conference on 3D Web Technology, Web3D '15, pp. 159–160, ACM, New York, NY, USA, June 2015.

[41] F. Tecchia, C. Loscos, and Y. Chrysanthou, "Real time rendering of populated urban environments," Tech. Rep., ACM SIGGRAPH Technical Sketch, 2001.

[42] J. Barczak, N. Tatarchuk, and C. Oat, "GPU-based scene management for rendering large crowds," ACM SIGGRAPH Asia Sketches, vol. 2, no. 2, 2008.

[43] B. Hernandez and I. Rudomín, "A rendering pipeline for real-time crowds," GPU Pro, vol. 2, pp. 369–383, 2016.

[44] J. Zelsnack, "GLSL pseudo-instancing," Tech. Rep., 2004.

[45] F. Carucci, "Inside geometry instancing," GPU Gems, vol. 2, pp. 47–67, 2005.

[46] G. Ashraf and J. Zhou, "Hardware accelerated skin deformation for animated crowds," in Proceedings of the 13th International Conference on Multimedia Modeling – Volume Part II, MMM'07, pp. 226–237, Springer-Verlag, Berlin, Germany, 2006.

[47] F. Klein, T. Spieldenner, K. Sons, and P. Slusallek, "Configurable instances of 3D models for declarative 3D in the web," in Proceedings of the Nineteenth International ACM Conference, pp. 71–79, ACM, New York, NY, USA, August 2014.

[48] C. Peng and Y. Cao, "Parallel LOD for CAD model rendering with effective GPU memory usage," Computer-Aided Design and Applications, vol. 13, no. 2, pp. 173–183, 2016.

[49] T. A. Funkhouser and C. H. Séquin, "Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '93, pp. 247–254, ACM, New York, NY, USA, 1993.

[50] C. Peng and Y. Cao, "A GPU-based approach for massive model rendering with frame-to-frame coherence," Computer Graphics Forum, vol. 31, no. 2, pp. 393–402, 2012.

[51] N. Bell and J. Hoberock, "Thrust: a productivity-oriented library for CUDA," in GPU Computing Gems Jade Edition, Applications of GPU Computing Series, pp. 359–371, Morgan Kaufmann, Boston, Mass, USA, 2012.

[52] E. Millán and I. Rudomín, "Impostors, pseudo-instancing and image maps for GPU crowd rendering," The International Journal of Virtual Reality, vol. 6, no. 1, pp. 35–44, 2007.

[53] L. Toledo, O. De Gyves, and I. Rudomín, "Hierarchical level of detail for varied animated crowds," The Visual Computer, vol. 30, no. 6–8, pp. 949–961, 2014.

[54] L. Lyu, J. Zhang, and M. Fan, "An efficiency control method based on SFSM for massive crowd rendering," Advances in Multimedia, vol. 2018, pp. 1–10, 2018.


