generating collaborative systems for digital libraries ... · generating collaborative systems for...

16
GENERATING COLLABORATIVE SYSTEMS FOR DIGITAL LIBRARIES | MALIZIA, BOTTONI, AND LEVIALDI 171 from previous experience and from research in software engineering. Wasted effort and poor interoperability can therefore ensue, raising the costs of DLs and jeopardizing the fluidity of information assets in the future. In addition, there is a need for modeling services and data structures as highlighted in the “Digital Library Reference Model” proposed by the DELOS EU network of excellence (also called the “DELOS Manifesto”); 2 in fact, the distribution of DL services over digital networks, typically accessed through Web browsers or dedicated clients, makes the whole theme of interaction between users important, for both individual usage and remote collaboration. Designing and modeling such interactions call for considerations pertaining to the fields of human– computer interaction (HCI) and computer-supported cooperative work (CSCW). As an example, scenario- based or activity-based approaches developed in the HCI area can be exploited in DL design. To meet these needs we developed CRADLE (Cooperative-Relational Approach to Digital Library Environments), 3 a metamodel-based Digital Library Management System (DLMS) supporting collaboration in the design, development, and use of DLs, exploiting patterns emerging from previous projects. The entities of the CRADLE metamodel allow the specification of col- lections, structures, services, and communities of users (called “societies” in CRADLE) and partially reflect the DELOS Manifesto. The metamodel entities are based on existing DL taxonomies, such as those proposed by Fox and Marchionini, 4 Gonçalves et al., 5 or in the DELOS Manifesto, so as to leverage available tools and knowl- edge. Designers of DLs can exploit the domain-specific visual language (DVSL) available in the CRADLE envi- ronment—where familiar entities extracted from the referred taxonomies are represented graphically—to model data structures, interfaces and services offered to the final users. The visual model is then processed and transformed, exploiting suitable templates, toward a set of specific languages for describing interfaces and services. The results are finally transformed into platform- independent (Java) code for specific DL applications. CRADLE supports the basic functionalities of a DL through interfaces and service templates for managing, browsing, searching, and updating. These can be further specialized to deploy advanced functionalities as defined by designers through the entities of the proposed visual The design and development of a digital library involves different stakeholders, such as: information architects, librarians, and domain experts, who need to agree on a common language to describe, discuss, and negoti- ate the services the library has to offer. To this end, high-level, language-neutral models have to be devised. Metamodeling techniques favor the definition of domain- specific visual languages through which stakeholders can share their views and directly manipulate representations of the domain entities. This paper describes CRADLE (Cooperative-Relational Approach to Digital Library Environments), a metamodel-based framework and visual language for the definition of notions and services related to the development of digital libraries. A collection of tools allows the automatic generation of several services, defined with the CRADLE visual language, and of the graphical user interfaces providing access to them for the final user. The effectiveness of the approach is illustrated by presenting digital libraries generated with CRADLE, while the CRADLE environment has been evaluated by using the cognitive dimensions framework. D igital libraries (DLs) are rapidly becoming a pre- ferred source for information and documentation. Both at research and industry levels, DLs are the most referenced sources, as testified by the popularity of Google Books, Google Video, IEEE Explore, and the ACM Portal. Nevertheless, no general model is uni- formly accepted for such systems. Only few examples of modeling languages for developing DLs are available, 1 and there is a general lack of systems for designing and developing DLs. This is even more unfortunate because different stakeholders are interested in the design and development of a DL, such as information architects, to librarians, to software engineers, to experts of the spe- cific domain served by the DL. These categories may have contrasting objectives and views when deploying a DL: librarians are able to deal with faceted categories of documents, taxonomies, and document classification; software engineers usually concentrate on services and code development; information architects favor effective- ness of retrieval; and domain experts are interested in directly referring to the content of interest without going through technical jargon. Designers of DLs are most often library technical staff with little to no formal training in software engineering, or computer scientists with little background in the research findings of hypertext infor- mation retrieval. Thus DL systems are usually built from scratch using specialized architectures that do not benefit Alessio Malizia ([email protected]) is Associate Profes- sor, Universidad Carlos III, Department of Informatics, Madrid, Spain; Paolo Bottoni ([email protected]) is Associate Pro- fessor and S. Levialdi ([email protected]) is Professor, “Sa- pienza” University of Rome, Department of Computer Science, Rome, Italy. Alessio Malizia, Paolo Bottoni, and S. Levialdi Generating Collaborative Systems for Digital Libraries: a Model-Driven Approach

Upload: phamcong

Post on 16-Feb-2019

222 views

Category:

Documents


0 download

TRANSCRIPT

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 171

frompreviousexperienceandfromresearch insoftwareengineering.Wastedeffortandpoorinteroperabilitycanthereforeensue,raisingthecostsofDLsandjeopardizingthefluidityofinformationassetsinthefuture.

Inaddition,thereisaneedformodelingservicesanddata structures as highlighted in the “Digital LibraryReference Model” proposed by the DELOS EU networkof excellence (also called the “DELOS Manifesto”);2 infact,thedistributionofDLservicesoverdigitalnetworks,typically accessed through Web browsers or dedicatedclients, makes the whole theme of interaction betweenusers important, for both individual usage and remotecollaboration.Designingandmodelingsuchinteractionscallforconsiderationspertainingtothefieldsofhuman–computer interaction (HCI) and computer-supportedcooperative work (CSCW). As an example, scenario-basedoractivity-basedapproachesdevelopedintheHCIareacanbeexploitedinDLdesign.

To meet these needs we developed CRADLE(Cooperative-Relational Approach to Digital LibraryEnvironments),3 a metamodel-based Digital LibraryManagement System (DLMS) supporting collaborationin the design, development, and use of DLs, exploitingpatternsemergingfrompreviousprojects.Theentitiesofthe CRADLE metamodel allow the specification of col-lections, structures, services, and communities of users(called “societies” in CRADLE) and partially reflect theDELOSManifesto.Themetamodelentitiesarebasedonexisting DL taxonomies, such as those proposed by Foxand Marchionini,4 Gonçalves et al.,5 or in the DELOSManifesto, so as to leverage available tools and knowl-edge. Designers of DLs can exploit the domain-specificvisual language (DVSL) available in the CRADLE envi-ronment—where familiar entities extracted from thereferred taxonomies are represented graphically—tomodel data structures, interfaces and services offeredto the final users. The visual model is then processedand transformed, exploiting suitable templates, towarda set of specific languages for describing interfaces andservices.Theresultsarefinallytransformedintoplatform-independent(Java)codeforspecificDLapplications.

CRADLE supports the basic functionalities of a DLthrough interfaces and service templates for managing,browsing,searching,andupdating.Thesecanbefurtherspecializedtodeployadvancedfunctionalitiesasdefinedbydesignersthroughtheentitiesof theproposedvisual

The design and development of a digital library involves different stakeholders, such as: information architects, librarians, and domain experts, who need to agree on a common language to describe, discuss, and negoti-ate the services the library has to offer. To this end, high-level, language-neutral models have to be devised. Metamodeling techniques favor the definition of domain-specific visual languages through which stakeholders can share their views and directly manipulate representations of the domain entities. This paper describes CRADLE (Cooperative-Relational Approach to Digital Library Environments), a metamodel-based framework and visual language for the definition of notions and services related to the development of digital libraries. A collection of tools allows the automatic generation of several services, defined with the CRADLE visual language, and of the graphical user interfaces providing access to them for the final user. The effectiveness of the approach is illustrated by presenting digital libraries generated with CRADLE, while the CRADLE environment has been evaluated by using the cognitive dimensions framework.

D igital libraries (DLs) are rapidly becoming a pre-ferredsourceforinformationanddocumentation.Both at research and industry levels, DLs are the

most referenced sources, as testified by the popularityof Google Books, Google Video, IEEE Explore, and theACM Portal. Nevertheless, no general model is uni-formlyacceptedforsuchsystems.Onlyfewexamplesofmodeling languages for developing DLs are available,1andthereisageneral lackofsystemsfordesigninganddevelopingDLs.This is evenmoreunfortunatebecausedifferent stakeholders are interested in the design anddevelopmentof aDL, suchas informationarchitects, tolibrarians, to software engineers, to experts of the spe-cific domain served by the DL. These categories mayhave contrasting objectives and views when deployinga DL: librarians are able to deal with faceted categoriesof documents, taxonomies, and document classification;software engineers usually concentrate on services andcodedevelopment;informationarchitectsfavoreffective-ness of retrieval; and domain experts are interested indirectlyreferringtothecontentofinterestwithoutgoingthroughtechnicaljargon.DesignersofDLsaremostoftenlibrary technicalstaffwith little tono formal training insoftware engineering, or computer scientists with littlebackground in the research findings of hypertext infor-mationretrieval.ThusDLsystemsareusuallybuiltfromscratchusingspecializedarchitecturesthatdonotbenefit

Alessio Malizia ([email protected]) is associate Profes-sor, universidad carlos iii, Department of informatics, Madrid, Spain; Paolo Bottoni ([email protected]) is associate Pro-fessor and s. levialdi ([email protected]) is Professor, “Sa-pienza” university of rome, Department of computer Science, rome, italy.

Alessio Malizia, Paolo Bottoni, and S. Levialdi

Generating Collaborative Systems for Digital Libraries: a Model-Driven Approach

172 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

aformalfoundationfordigitallibraries,called5S,basedon the concepts of streams, (data) structures, (resource)spaces, scenarios, and societies. While being evidence of agoodmodelingendeavor,theapproachdoesnotspecifyformally how to derive a system implementation fromthemodel.

ThenewgenerationofDLsystemswillbehighlydis-tributed,providingadaptiveandinteroperablebehaviourbyadjustingtheirstructuredynamically,inordertoactindynamicenvironments(e.g.,interfacingwiththephysicalworld).13Tomanage such largeandcomplex systems,asystematic engineering approach is required, typicallyonethatincludesmodelingasanessentialdesignactivitywheretheavailabilityofsuchdomain-specificconceptsasfirst-class elements in DL models will make applicationspecificationeasier.14

While most of the disciplines related to DLs—e.g.,databases,15 information retrieval,16 and hypertext andmultimedia17—haveunderlyingformalmodelsthathaveproperlysteeredthem,littleisavailabletoformalizeDLsperse.WangdescribedthestructureofaDLsystemasadomain-specific database together with a user interfaceforqueryingtherecordsstoredinthedatabase.18Castelliet al. present an approach involving multidimensionalquerylanguagesforsearchinginformationinDLsystemsthat is based on first-order logic.19 These works modelmetadataspecificationsand thusare themainexamplesof system formalization in DL environments. Cognitivemodelsforinformationretrieval,asusedforexamplebyOddyetal.,20focusonusers’information-seekingbehav-ior (i.e., formation, nature, and properties of a users’informationneed)andonhowinformationretrievalsys-temsareusedinoperationalenvironments.

Otherapproachesbasedonmodelsandlanguagesfordescribing the entities involved in a DL are the DigitalLibraryDefinitionLanguage,21 theDSpacedatamodel22(withthedefinitionsofcommunitiesandworkflowmod-els), the Metis Workflow framework,23 and the Fedorastructoid approach.24 E/R approaches are frequentlyusedformodelingdatabasemanagementsystem(DBMS)applications,25butasE/Rdiagramsonlymodelthestaticstructure of a DBMS, they generally do not deal deeplywithdynamicaspects.Temporalextensionsadddynamicaspects to the E/R approach, but most of them are notobject-oriented.26 The advent of object-oriented technol-ogycallsforapproachesandtoolstoinformationsystemdesignresultinginobject-orientedsystems.Theseconsid-erationsdrove research towardmodelingapproachesassupportedbyUML.27

However,sincetheUMLmetamodelisnotyetwide-spread in the DL community, we adopted the E/Rformalismandcomplementeditwiththespecificationofthedynamicsmadeavailablethroughtheuserinterface,as described by Malizia et al.28 Using the metamodel,we have defined a DSVL, including basic entities and

language. CRADLE is based on the entity-relationship(E/R)formalism,whichispowerfulandgeneralenoughtodescribeDLmodelsandissupportedbymanytoolsasa metamodeling language. Moreover, we observed thatusers and designers involved in the DL environment,butnotcomingfromasoftwareengineeringbackground,maynotbefamiliarwithadvancedformalismlikeunifiedmodeling language (UML), but they usually have basicnotionsondatabasemanagementsystems,whereE/Rislargelyemployed.

■■ Literature Review

DLsarecomplexinformationsystemsinvolvingtechnolo-giesandfeaturesfromdifferentareas,suchaslibraryandinformationsystems,informationretrieval,andHCI.Thisinterdisciplinary nature is well reflected in the variousdefinitionsofDLspresentintheliterature.Asfarbackas1965, Licklider envisaged collections of digital versionsofscanneddocumentsaccessibleviainterconnectedcom-puters.6Morerecently,LevyandMarshalldescribedDLsassetsofcollectionsofdocuments,togetherwithdigitalresources,accessiblebyusersinadistributedcontext.7Tomanagetheamountofinformationstoredinsuchsystems,theyproposedsomesortofuser-assistingsoftwareagent.Other definitions include not only printed documents,but multimedia resources in general.8 However differ-entthedefinitionsmaybe,theyall includethepresenceof collections of resources, their organization in struc-tured repositories, and theiravailability to remoteusersthrough networks (as discussed by Morgan).9 Recenteffortstowardstandardizationhavebeentakenbypublicand private organizations. For example, a Delphi studyidentifiedfourmainingredients:anorganizedcollectionof resources,mechanisms forbrowsingandsearching,adistributed networked environment, and a set of objec-tifiedservices.10ThePresident’sInformationTechnologyAdvisoryCommittee (PITAC)PanelonDigitalLibrariesseesDLsasthenetworkedcollectionsofdigitaltext,doc-uments,images,sounds,scientificdata,andsoftwarethatmakeupthecoreof today’s Internetandof tomorrow’suniversally accessible digital repositories of all humanknowledge.11

When considering DLs in the context of distributedDLenvironments,onlyfewpapershavebeenproduced,contrasting with the huge bibliography on DLs in gen-eral. The DL Group at the Universidad de lasAméricasPueblainMexicointroducedtheconceptofpersonalandgroup spaces, relevant to the CSCW domain, in the DLsystem context.12 Users can share information stored intheirpersonalspacesorshareagents,thusallowingotheruserstoperformthesamesearchonthedocumentcollec-tions in theDL.Thecited textbyGonçalvesetal.gives

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 173

education as discussed by Wattenberg or Zia.33 In theNSDL program, a new generation of services has beendevelopedthat includessupport for teachingand learn-ing; this means also considering users’ activities orscenarios and not only information access. Services forimplementingpersonal contentdeliveryandsharing,ormanaging digital resources and modeling collaboration,areexamplesoftoolsintroducedduringthisprogram.

The virtual reference desk (VRD) is emerging as aninteractive service based on DLs. With VRD, users cantakeadvantageofdomainexperts’knowledgeandlibrar-ians’ experience to locate information. For example, theU.S.LibraryofCongressAskaLibrarianserviceactsasaVRDforuserswhowanthelpinsearchinginformationcategories or to interact with expert librarians to searchforaspecifictopic.34

Theinteractiveandcollaborativeaspectsofactivitiestaking place within DLs facilitate the development ofusercommunities.Socialnetworking,workpractices,andcontentsharingareallfeaturesthatinfluencethetechnol-ogy and its use. Following Borgmann,35 Lynch sees thefutureofDLsnotinbroadservicesbutinsupportingandfacilitating “customization by community,” i.e., servicestailoredfordomain-specificworkpractices.36

We also examined the research agenda on system-oriented issues in DLs and the DELOS manifesto.37 Theagendaabstracts theDL lifecycle, identifying fivemainareas,andproposeskeyresearchproblems.InparticularwetackleactivitiessuchasformalmodelingofDLsandtheircommunitiesanddevelopingframeworkscoherentwithsuchmodels.

At the architectural level, one point of interest is tosupport heterogeneous and distributed systems, in par-ticularnetworkedDLsandservices.38Forinteroperability,oneoftheissuesishowtosupportandinteroperatewithdifferentmetadatamodelsandstandardstoallowdistrib-uted cataloguing and indexing, as in the OpenArchiveInitiative(OAI).39

Finally, we are interested in the service level of theresearch agenda and more precisely in Web servicesand workflow management as crucial features whenincluding communities and designing DLs for use overnetworksandforsharingcontent.

Asaresultof thisanalysis, theCRADLEframeworkfeaturesthefollowing:

■■ avisuallanguagetohelpusersanddesignerswhenvisualmodelingtheirspecificDL(withoutknowinganytechnicaldetailapartfromlearninghowtouseavisual environment providing diagrams representa-tionsofdomainspecificelements)

■■ anenvironmentintegratingvisualmodelingandcodegenerationinsteadofsimplyprovidinganintegratedarchitecturethatdoesnothidetechnicaldetails

■■ interfacegeneration fordealingwithdifferentusers

relationships for modeling DL-related scenarios andactivities. The need for the integration of multiple lan-guages has also been indicated as a key aspect of theDSVLapproach.29Infact,complexdomainslikeDLstypi-callyconsistofmultiplesubdomains,eachofwhichmayrequireitsownparticularlanguage.

Inthecurrentimplementation,thedefinitionofDSVLsexploits themetamodelingfacilitiesofAToM3,basedongraph-grammars.30 AToM3 has been typically used forsimulation and model transformation, but we adopt ithereasatoolforsystemgeneration.

■■ Requirements for Modeling Digital Libraries

We follow the DELOS Manifesto by considering a DLasanorganization (possiblyvirtual anddistributed) formanaging collections of digital documents (digital con-tentsingeneral)andpreservingtheirimagesonstorage.ADLofferscontextualservicestocommunitiesofusers,acertainqualityofservice,andtheabilitytoapplyspecificpolicies.InCRADLEweleavethedefinitionofqualityofservicetotheservice-orientedarchitecturestandardsweemployandpartiallymodeltheapplicablepolicy,butwefocushereoncrucialinteractivityaspectsneededtomakeDLsusablebydifferentcommunitiesofusers.

In particular, we model interactive activities andservices based on librarians’ experiences in face-to-facecommunication with users, or designing exchange andintegrationproceduresforcommunicatingbetweeninsti-tutionsandmanagingsharedresources.

While librarians are usually interested in modelingmetadata across DLs, software engineers aim at provid-ing multiple tools for implementing services,31 such asindexing, querying, semantics,32 etc. Therefore we pro-videavisualmodelusefulforlibrariansandinformationarchitects to mimic the design phases they usually per-form. Moreover, by supporting component services, wehelp software engineers to specify and add services ondemandtoDLenvironments.Tothisend,weuseaservicecomponentmodel.Bysharingacommonlanguage,usersfromdifferentcategoriescancommunicatetodesignaDLsystem while concentrating on their own tasks (servicesdevelopment and design for software engineers and DLdesignforlibrariansandinformationarchitects).UsersaremodeledaccordingtotheDelosManifestoasDLEnd-users(subdividedintocontentcreators,contentconsumers,andlibrarians),DLDesigners(librariansandinformationarchi-tects),DLSystemAdministrators(typicallylibrarians),andDLApplicationDevelopers(softwareengineers).

Several activities have been started on modelingdomain specific DLs. As an example, the U.S. NationalScience Digital Library (NSDL) program promotes edu-cationalDLsandservicesforbasicandadvancedscience

174 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

■■ how that information is structured and organized(StructuralModel)

■■ thebehavioroftheDL(ServiceModel)andthediffer-entsocietiesofactors

■■ groupsofservicesactingtogethertocarryouttheDLbehavior(SocietalModel)

Figure 1 depicts the design approach supported byCRADLE architecture, namely, modeling the society ofactors and services interacting in the domain-specificscenarios and describing the documents and metadatastructure included with the library by defining a visualmodel for all these entities. The DL is built using a col-lectionof stockparts andconfigurable components thatprovidetheinfrastructureforthenewDL.Thisinfrastruc-tureincludestheclassesofobjectsandrelationshipsthatmakeuptheDL,andprocessingtoolstocreateandloadtheactuallibrarycollectionfromrawdocuments,aswellasservicesforsearching,browsing,andcollectionmain-tenance. Finally, the code generation module generatestailoredDLservicescodestubsbycomposingandspecial-izingcomponentsfromthecomponentpool.

Initially,aDLdesignerisresponsibleforformalizing(starting from an analysis of the DL requirements andcharacteristics)aconceptualdescriptionof theDLusingmetamodel concepts. Model specifications are then fedinto a DL generator (written in Python for AToM3), toproduceaDLtailoredsuitableforspecificplatformsandrequirements.Afterthesedesignphases,CRADLEgener-atesthecodefortheuserinterfaceandthepartsofcodecorresponding to services and actors interacting in thedescribedsociety.Asetoftemplatesforcodegeneration

anddesigners■■ flexiblemetadatadefinitions■■ asetofinteractiveintegratedtoolsforuseractivitieswiththegeneratedDLsystem

Tosumup,CRADLEisaDLMSaimedat supporting all the users involved inthe development of a DL system andproviding interfaces, data modeling, andservicesforuser-drivengenerationofspe-cific DLs. Although CRADLE does notyet satisfy all requirements for a genericDL system, it addresses issues focusedon developing interactive DL systems,stressing interfaces and communicationbetweenusers.Nevertheless,weemployedstandards when possible to leave it openfor further specification or enhancementsfrom the DL user community. ExtensiveuseofXML-based languagesallowsus tochange document information dependingonimplementedrecognitionalgorithmssothatexpertuserscaneasilymodeltheirDLbyselectingthebestrecognitionandindexingalgorithms.

CRADLEevolvesfromtheJDAN(Java-basedenviron-mentforDocumentApplicationsonNetworks)platform,whichmanagedbothdocumentimagesandformsonthebasis of a component architecture.40 JDAN was based onXMLtechnologies,anditsmodularityalloweditsintegra-tioninservice-basedandgrid-basedscenarios.Itsupportedtemplatecodegenerationandmodeling,butitrequiredthedesignertowriteXMLspecificationsandeditXMLschemafilesinordertomodeltheDLdocumenttypesandservices,thusrequiringtechnicalknowledgethatshouldbeavoidedtoletusersconcentrateontheirspecificdomains.

■■ Modeling Digital Library Systems

The CRADLE framework shows a unique combinationoffeatures:itisbasedonaformalmodel,exploitsasetofdomain-specific languages, and provides automatic codegeneration.Moreover,fundamentalrolesareplayedbytheconceptsofsocietyandcollaboration.41CRADLEgeneratescodefromtoolsbuiltaftermodelingaDL(accordingtotherulesdefinedby theproposedmetamodel)andperformsautomatic transformation and mapping from model tocodetogeneratesoftwaretoolsforagivenDLmodel.

The specification of a DL in CRADLE encompassesfourcomplementarydimensions:

■■ multimedia information supported by the DL(CollectionModel)

Figure 1. CRADLE architecture

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 175

socioeconomic, and environment dimen-sions.Wenowshowindetailtheentitiesandrelationsinthederivedmetamodel,showninfigure2.

Actor entities

ActorsaretheusersofDLs.ActorsinteractwiththeDLthroughservices(interfaces)that are (or can be) affected by theactorspreferencesandmessages (raisedevents). In the CRADLE metamodel,an actor is an entity with a behaviorthat may concurrently generate events.Communications with other actors mayoccursynchronouslyorasynchronously.Actors can relate through services toshapeadigitalcommunity,i.e.,thebasisof a DL society. In fact, communities ofstudents, readers, or librarians interactwithandthroughDLs,generallyfollow-ingpredefinedscenarios.Asanexample,societies can behave as query generatorservices (from the point of view of the

library) and as teaching, learning, and working services(from the point of view of other humans and organiza-tions). Communication between actors within the sameordifferentsocietiesoccurthroughmessageexchange.Tooperate,societiesneedshareddatastructuresandmessageprotocols, enacted by sending structured sequences ofqueriesandretrievingcollectionsofresults.

Theactorentityincludesthreeattributes:

1. Role identifies which role is played by the actorwithin the DL society. Examples of specific humanroles include authors, publishers, editors, maintain-ers, developers, and the library staff. Examples ofnonhuman actors include computers, printers, tele-communicationdevices,softwareagents,anddigitalresourcesingeneral.

2. Status isanenumerationofpossiblestatusesfor theactor:

I. None(defaultvalue)II. Active(presentinthemodelandactivelygenerat-

ingevents)III. Inactive(presentinthemodelbutnotgenerating

events)IV. Sleeping(presentinthemodelandawaitingfora

responsetoaraisedevent)3. Eventsdescribesalistofeventsthatcanberaisedby

the actor or received as a response message from aservice.Examplesofeventsareborrow,reserve,return,etc. Events triggered from digital resources includestore,trash,andtransfer.Examplesofresponseeventsarefound,not found,updated,etc.

havebeenbuiltfortypicalservicesofaDLenvironment.To improve acceptability and interoperability,

CRADLEadoptsstandardspecificationsublanguagesforrepresenting DL concepts. Most of the CRADLE modelprimitivesaredefinedasXMLelements,possiblyenclos-ing other sublanguages to help define DL concepts. Inmoredetail,MIME types constitute thebasis forencod-ing elements of a collection. The XML User InterfaceLanguage (XUL)42 is used to represent appearance andvisualinterfaces,andXDocletisusedintheLibGencodegenerationmodule,asshowninfigure1.43

■■ The Cradle Metamodel

In the CRADLE formalism, the specification of a DLincludes a Collection Model describing the maintainedmultimedia documents, a Structural Model of informa-tion organization, a Service Model for the DL behavior,and a Societal Model describing the societies of actorsandgroupsofservicesactingtogethertocarryouttheDLbehavior.

AsocietyisaninstanceoftheCRADLEmodeldefinedaccordingtoaspecificcollaborationframeworkintheDLdomain.Asocietyisthehighest-levelcomponentofaDLandexiststoservetheinformationneedsofitsactorsandtodescribeitscontextofusage.HenceaDLcollects,preserves,andsharesinformationartefactsforsocietymembers.

The basic entities in CRADLE are derived fromthe categorization along the actors, activities, components,

Figure 2. The CRADLE metamodel with the E/R formalism

176 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

a text document, including scientific articles and books,becomesasequenceofstrings.

the struct entity

A Struct is a structural element specifying a part of awhole. In DLs, structures represent hypertexts, taxono-mies, relationships between elements, or containment.Forexample,bookscanbestructuredlogicallyintochap-ters,sections,subsections,andparagraphs,orphysicallyinto cover, pages, line groups (paragraphs), and lines.Structuresarerepresentedasgraphs,andthestructentity(avertex)containsfourattributes:

1. Document is a pointer to the document entity thestructurerefersto.

2. Idisauniqueidentifierforastructureelement.3. Typetakesthreepossiblevalues:

I. Metadatadenotesacontentdescriptor,forinstancetitle,author,etc.

II. Layoutdenotestheassociatedlayout,e.g.,leftframe,columns,etc.

III. Itemindicatesagenericstructureelementusedforextendingthemodel.

4. Values isa listofvaluesdescribingtheelementcon-tent,e.g.,title,author,etc.

Actors interact with services in an event-driven way.Servicesare connectedviamessages (send and reply) andcanbesequential,concurrent,ortask-related(whenaser-viceactsasasubtaskofamacroservice).Servicesperformoperations(e.g.,get,add,anddel)oncollections,producingcollections of documents as results. Struct elements areconnectedtoeachotherasnodesofagraphrepresentingmetadatastructuresassociatedwithdocuments.

The metamodel has been translated to a DSVL, asso-ciatingsymbolsandiconswithentitiesandrelations(see“CRADLE Language and Tools” below). With respect tothe six core concepts of the DELOS Manifesto (content,user, functionality, quality, policy, and architecture), con-tentcanbemodeledinCRADLEascollectionsandstructs,userasactor,andfunctionalityasservice.Thequalitycon-ceptisnotdirectlymodeledinCRADLE,butforqualityofservicewesupport standardservicearchitecture.Policiescanbepartiallymodeledbyservicesmanaginginteractionbetweenactorsandcollections,makingitpossibletoapplystandard access policies. From the architectural point ofview,wefollowthereferencearchitectureoffigure1.

■■ CRADLE Language and Tools

InthissectionwedescribetheselectionoflanguagesandtoolsoftheCRADLEplatform.Toimproveinteroperability

service entities

Services describe scenarios, activities, operations, andtasks that ultimately specify the functionalities of a DL,such as collecting, creating, disseminating, evaluating,organizing, personalizing, preserving, requesting, andselecting documents and providing services to humansconcerned with fact-finding, learning, gathering, andexploringthecontentofaDL.Alltheseactivitiescanbedescribedand implementedusingscenariosandappearintheDLsettingasaresultofactorsusingservices(thussocieties).Furthermore,theseactivitiesrealizeandshaperelationshipswithinandbetweensocieties,services,andstructures.IntheCRADLEmetamodel,theserviceentitymodels what the system is required to do, in terms ofactionsandprocesses, toachievea task.Adetailed taskanalysis helps understand the current system and theinformationflowwithinitinordertodesignandallocatetasksappropriately.Theserviceentityhasfourattributes:

1. Nameisastringrepresentingatextualdescriptionoftheservice.

2. Sync states whether communication is synchronousorasynchronous,modeledbyvalueswaitandnowait,respectively.

3. Events is a list of messages that can trigger actionsamongservices(tasks);forexample,validornotValidincaseofaparsingservice.

4. Responsescontainalistofresponsemessagesthatcanreplytoraisedevents;theyareusedasacommunica-tionmechanismbyactorsandservices.

the collection entity

Collectionsaresetsofdocumentsofarbitrarytype(e.g.,bits,characters, images, etc.) used to model static or dynamiccontent. In the static interpretation, a collection definesinformationcontentinterpretedasasetofbasicelements,often of the same type, such as plain text. Examples ofdynamiccontentincludevideodeliveredtoaviewer,ani-matedpresentations,andsoon.Theattributesofcollectionarenameanddocuments.Nameisastring,whiledocumentsisalistofpairs(DocumentName,DocumentLabel),thelatterbeingapointertothedocumententity.

the Document entity

DocumentsarethebasicelementsinaDLandaremodeledwithattributeslabelandstructure.

Labeldefinesatextualstringusedbyacollectionentitytorefertothedocument.Wecanconsideritasadocumentidentifier,specifyingaclassoratypeofdocument.

Structure defines the semantics and area of appli-cation of the document. For example, any textualrepresentationcanbeseenasastringofcharacters,sothat

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 177

graphs. Model manipulation can then be expressedviagraphgrammarsalsospecifiedinAToM3.

The general process of automatic creation of coop-erative DL environments for an application is shownin figure 3. Initially, a designer formalizes a conceptualdescription of the DL using the CRADLE metamodelconcepts. This phase is usually preceded by an analysisofrequirementsandinteractionscenarios,asseenprevi-ously. Model specifications are then provided to a DLcodegenerator(writteninPythonwithinAToM3)topro-duceDLstailoredtospecificplatformsandrequirements.ThesearebuiltonacollectionoftemplatesofservicesandconfigurablecomponentsprovidinginfrastructureforthenewDL.

The sketched infrastructure includes classes forobjects (tasks), relationships making up the DL, andprocessing tools to upload the actual library collectionfrom raw documents, as well as services for searchingandbrowsingandfordocumentcollectionsmaintenance.TheCRADLEgeneratorautomaticallygeneratesdifferentkindsofoutputfortheCRADLEmodelofthecooperativeDLenvironment,suchasserviceandcollectionmanagers.

Collection managers define the logical schemata oftheDL,whichinCRADLEcorrespondtoasetofMIMEtypes, XUL and XDoclet specifications, representingdigitalobjects, theircomponentparts,andlinkinginfor-mation.Collectionmanagersalsostoreinstancesoftheir

and collaboration, CRADLE makesextensiveuseofexistingstandardspec-ification languages. Most CRADLEoutputs are defined with XML-basedformats, able to enclose other specificlanguages. The basic languages andcorresponding toolsused inCRADLEarethefollowing:

■■ MIME type.MultipurposeInternetMailExtensions(MIME)constitutethebasis forencodingdocumentsin CRADLE, supporting severalfile formats and types of charac-ter encoding. MIME was chosenbecause of wide availability ofMIME types, and standardisationof the approach. This makes it anatural choice for DLs where dif-ferent types of documents needtobemanaged(PDF,HTML,Doc,etc.). Moreover, MIME standardsfor character encoding descrip-tions help keeping the CRADLEframework open and compliantwithstandards.

■■ XUL. The XML User InterfaceLanguage(XUL)isanXML-basedmarkuplanguageused to represent appearance and visual interfaces.XUL is not a public standard yet, but it uses manyexistingstandardsand technologies, includingDTDand RDF,44 which makes it easily readable for peo-ple with a background in Web programming anddesign.ThemainbenefitofXUListhatitprovidesasimpledefinitionofcommonuserinterfaceelements(widgets).Thisdrasticallyreducesthesoftwaredevel-opmenteffortrequiredforvisualinterfaces.

■■ XDoclet. XDoclet is used for generating servicesfrom tagged-code fragments. It is an open-sourcecode generation library which enables attribute-ori-entedprogramming for Javavia insertionofspecialtags.45Itincludesalibraryofpredefinedtags,whichsimplify coding for various technologies, e.g., Webservices. The motivation for using XDoclet in theCRADLE framework is related to its approach fortemplate code generation. Designers can describetemplatesforeachservice(browse,query,andindex)andtheXDocletgeneratedcodecanbeautomaticallytransformed into the Java code for managing thespecifiedservice.

■■ AToM3.AToM3 is ametamodeling system tomodelgraphical formalisms. Starting from a metaspecifi-cation (in E/R), AToM3 generates a tool to processmodels described in the chosen formalism. Modelsare internally represented using abstract syntax

Figure 3. Cooperative DL generation process with CRADLE framework

178 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

and(3)themetadataoperationsbox.The right column manages visualization and mul-

timedia information obtained from documents. Thebasic featuresprovidedwith theUI templatesaredocu-ment loading, visualization, metadata organization, andmanagement.

Thelayouttemplate,inthecollectionbox,managesthevisualizationofthedocumentscontainedinacollection,while thevisualization templateworksaccording to thedata (MIME) type specified by the document. Actually,by selecting a document included in the collection, thecorresponding data file is automatically uploaded andvisualizedintheUI.

Themetadatavisualizationinthecodetemplatereflectsthe metadata structure (a tree) represented by a struct,specifying the relationship between parent and childnodes.ThustheXULtemplateincludesanarea(themeta-databox)formanagingtreestructuresasdescribedinthevisualmodelof theDL.Although the tree-likevisualiza-tionhaspotentialdrawbacks if therearemanymetadataitems,thereshouldbenorealconcernwithmediumloads.

TheUItemplatealsoincludesaboxtoperformopera-tionsonmetadata,suchasinsert,delete,andedit.Userscan select a value in the metadata box and manipulatethepresentedvalues.Figure4showsanexampleofaUIgeneratedfromabasictemplate.

service templates

Toachieveautomatedcodegeneration,weuseXDoclettospecifyparametersandservicecodegenerationaccordingtosuchparameters.CRADLEcanautomaticallyannotateJava files with name–value pairs, and XDoclet providesa syntax forparameter specification.Codegeneration is

classes and function as search engines for the system.Servicesclassesalsoaregeneratedandarerepresentedasattribute-orientedclassesinvolvingpartsandfeaturesofentities.

■■ CRADLE platform

The CRADLE platform is based on a model-drivenapproachforthedesignandautomaticgenerationofcodefor DLs. In particular, the DSVL for CRADLE has fourdiagramtypes(collection,structure,service,andactor)todescribethedifferentaspectsofaDL.

In this section we describe the user interface (UI)and service templates used for generating the DL tools.In particular, the UI layout is mainly generated fromthe structured information provided by the document,struct,andcollectionentities.TheUIeventsaremanagedby invoking the appropriate services according to theimportedXULtemplates.Attheserviceandcommunica-tion levels, theXDocletcode isgeneratedby theserviceandactorentities,exploitingtheirrelationships.Wealsoshow how code generation works and the advancedplatformfeatures,suchasautomaticservicediscovery.Attheendofthesectionarunningexampleisshown,rep-resenting all the phases involved in using the CRADLEframework for generating the DL tools for a typicallibraryscenario.

user interface templates

The generation of the UI is driven by the visual modeldesigned by the CRADLE user. Specifically, the modelentitiesinvolvedinthisprocessaredocument,structandcollection(seefigure2)forthebasiccomponentsandlay-outoftheinterfaces,whilelinkedservicesaredescribedintheappropriatetemplates.

The code generation process takes place throughtransformations implemented as actions in the AToM3metamodel specification, where graph-grammar rulesmayhaveaconditionthatmustbesatisfiedfor theruleto be applied (preconditions), as well as actions to beperformedwhentheruleisexecuted(postconditions).Atransformation is described during the visual modelingphase in terms of conditions and corresponding actions(inserting XUL language statements for the interface intheappropriatecode templateplaceholders).Thegener-ateduserinterfaceisbuiltonasetofXULtemplatefilesthat are automatically specialized on the basis of theattributesandrelationshipsdesignedinthevisualmod-elingphase.

The layout template for theuser interface isdividedintotwocolumns(seefigure4).Theleftcolumnismadeofthreeboxes:(1)thecollectionbox(2)themetadatabox,

Figure 4. An example of an automatically generated user inter-face. (A) document area; (B) collection box; (C) metadata box; (D) metadata operations box.

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 179

"msg arguments.argname">{ "<XDtField : fieldName/>" ,"<XDtField : fieldTagValue tagName=

"msg arguments.argname"paramName="name"/>""<XDtField : fieldTagValue tagName=

"msg arguments.argname"paramName=" desc "/>"} ,</XDtField : ifHasFieldTag></XDtField : forAllFields> };

ThefirsttwolinesdeclareaclasswithanameclassnameImpl that extends the class name. The XDoclettemplate tag XDtClass:className denotes the name oftheclassintheannotatedJavafile.AllstandardXDoclettemplatetagshaveanamespacestartingwith“XDt.”

The rest of the template uses XDtField : forAllField to iterate through the fields. For each field with a tagnamedmsg arguments.argname(checkedusingXDtField : ifHasFieldTag),itcreatesasubarrayofstringsusingthevaluesobtainedfromthe field tagparameters.XDtField : fieldName gives the name of the field, while XDtField : fieldTagValue retrieves the value of a given field tagparameter.CharactersthatarenotpartofsomeXDoclettemplate tags are directly copied into the generatedcode. The following code segment was generated byXDocletusing theannotated fieldsand theabove tem-platesegment:

public class MSGArgumentsImpl extends MSGArguments {

public static String[ ][ ] argumentNames = new String[ ][ ]{ {

"eventMsg" ," event " ," eventstring "} ,{" responseMsg " ," response " ," responsestring "} ,};}Similarly, we generate the getter and setter

methods for each field:<XDtField : forAllFields > <XDtField : ifHasFieldTagtagName="msg arguments.argname">public <XDtField : fieldType/> get <XDtField :

fieldName />() {return <XDtField : fieldName />;}public void set <XDtField : fieldName />

( String value ) {

based on code templates. Hence service templates areXDoclet templates for transforming XDoclet code frag-mentsobtainedfromthemodeledserviceentities.

The basic XDoclet template manages messagesbetween services, according to the event and responseattributes described in “CRADLE Language and Tools”above. In fact, CRADLE generates a Java application(a service) that needs to receive messages (event) andreply to them (response) as parameters for the serviceapplication.InXDoclet,thesecanbeattachedtothecor-responding field by means of annotation tags, as in thefollowingcodesegments:

public class MSGArguments {. . . . . ./** @msg arguments.argname name="event "

desc="event_string "*/ protected String eventMsg = null;/** @msg arguments.argname name="response"* desc="response_string "*/ protected String responseMsg = null;}

Eachmsg arguments.argname relatedtoafieldiscalledafieldtag.Eachfieldtagcanhavemultipleparameters,listedafterthefieldtag.Inthetagnamemsg arguments.argname,theprefixservesasthenamespaceofalltagsforthisparticularXDocletapplication,thusavoidingnamingconflictswithotherstandardorcustomizedXDoclettags.Notonly fieldscanbeannotated,butalsootherentitiessuchasclassandfunctionscanhavetagstoo.

XDoclet enables powerful code generation requir-ing littleornocustomization (dependingonhowmuchis provided by the template). The type of code to begenerated using the parameters is defined by the corre-spondingXDoclettemplate.

We have created template files composed of Javacodes and special XDoclet instructions in the form ofXMLtags.TheseXDocletinstructionsallowconditionals(if) and loops (for), thus providing us with expressivepowerclosetoaprogramminglanguage.Inthefollowingexample, we first create an array containing labels andotherinformationforeachargument:

public class <XDtClass : classOf><XDtClass : className/>Impl</XDtClass :

classOf> extends<XDtClass : classOf><XDtClass : className/>

</XDtClass : classOf> {public static String[ ][ ] argumentNames = new

String[ ][ ] {<XDtField : forAllFields><XDtField : ifHasFieldTag tagName=

180 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

becausedifferentdesignchoicesinthetemplatecanleadtovastlydifferentcode.WehaveincludedanincrementalmechanismbywhichuserscanmodifythevisualmodelofaDLandregenerate(XULinterface)codeonlyforthemodifications. By employing this solution, librariansandDLdesigners canworkas theywouldonpaperbydesigning the visual scheme and collaboratively updat-ing and changing it. They can generate the code, verifytheimplementation,and,ifsomethinghastobechanged,go back to the visual model, apply modification, andgeneratecodeinanewiterationoftheprocess.Oncethevisual model has been modified, the system incremen-tally updates the code by examining only those modelpartsaffectedbytheeditandmodifyingthecorrespond-ingpartsofthegeneratedcode.

The same approach could be used for services butwithadifferent technique. In fact,predefined templatesexistforbasicservices,e.g.,indexing,uploading,andque-rying.Toallowserviceproviderstoaddnewcodetotherestoftheservicecomponentlist,wehaveimplementedaregistrylistingtheavailableservicetemplates.Whentheuserrunsthecodegenerationprocess,aroutineverifiesif theservice templates included in themodelareavail-ableintheregistryandloadsitintomemoryforthecodegenerationprocess.

We are planning to support a standard mechanismbased on the Universal Description, Discovery, andIntegration registry.47 Moreover, we have developed anadvancedinterfacetemplatethatembedsvalidationcodeinto theXUL templates for the interfaces to lookup thelistofservicesmadeavailablebytheinterfaceatrun-time.If there are services embedded in the interface but notavailable, the interface is modified to prevent access tothem.Forinstance,supposethataninterfaceisspecifiedwithbuttonstoaccess to thedocumentuploadandeditservices. If,atrun-time,thecheckdoesnotfindtheeditserviceavailable, theinterfacewillpresentonlythebut-tonfortheuploadservice.

■■ Generating a Digital Library Environment

As a first step in designing the digital library environ-ment in the CRADLE framework, designers model thesociety involved in the specific scenario. We define arunning example, called Library, to show the process,startingfromthebasicentitiesofthemodel.WeconsidermodelingasimpleDLenvironment.Theinvolvedactorsarestudentsandlibrarians.TheDLCollectionconsistsofDigital Paper Documents with publication, author, andtitlemetadatainformation(structentities).Infigure5,theCRADLEenvironment(asociety)isshowntogetherwiththedefinedentities.Circlesrepresentactorsinthemodel,rectangles render services, multiple rectangles represent

setValue ( "<XDtField : fieldName/>" , value ) ;}</XDtField : ifHasFieldTag></XDtField : forAllFields >This translates into the following generated code:public java.lang.String get eventMsg ( ) {return eventMsg ;}public void set eventMsg ( String value ) {setValue ( "eventMsg" , value ) ;}public java.lang.String getresponseMsg ( ) {return getresponseMsg ;}public void setresponseMsg ( String value ) {setValue ( " responseMsg " , value ) ;}

Thesametemplateisusedformanagingthenameandsyncattributesofserviceentities.

code Generation, service Discovery, and Advanced Features

A service or interface template only describes the solu-tion to a particular design problem—it is not code.Consequently,userswillfinditdifficulttomaketheleapfromthe templatedescriptiontoaparticular implemen-tation even though the template might include samplecode. Others, like software engineers, might have notrouble translating the template into code, but they stillmay find it a chore, especially when they have to do itrepeatedly. The CRADLE visual design environment(based onAToM3) helps alleviate these problems. Fromjustafewpiecesofinformation(thevisualmodel),typi-cally application-specific names for actors and servicesinaDLsocietyalongwithchoices for thedesign trade-offs,thetoolcancreateclassdeclarationsanddefinitionsimplementing the template. The ultimate goal of themodeling effort remains, however, the production ofreliable and efficiently executable code. Hence a codegenerationtransformationproducesinterface(XUL)andservice(JavacodefromXDoclettemplates)codefromtheDLmodel.

We have manually coded XUL templates specifyingthestaticsetupoftheGUI,thevariouswidgetsandtheirlayout. This must be complemented with code gener-ated from a DL model of the systems dynamics codedinto services. While other approaches are possible,46 weemployed the solution implemented within the AToM3environment according to its graph grammar modelingapproachtocodegeneration.

CRADLE supports a flexible iterative process forvisual design and code generation. In fact, a designchange might require substantial reimplementation

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 181

selecting one, the UI activates the metadata operationsbox—figure 6(D). The selected metadata node will thenbe presented in the lower (metadata operations) box,labeled “set MetaData Values,” replacing the default“None” value as shown in figure 6.After the metadataitemispresented,theusercanedititsvalueandsaveitbyclickingonthe“setvalue”button.Theassociatedactionsavesthemetadatainformationandcausesitsdisplayinthe intermediate box (tree-like structure), changing thevisualizationaccordingtothenewvalues.

The code generation process for the Do_Search andFront Desk services is based on XDoclet templates. Inparticular,amessagelistenertemplateisusedtogeneratetheJavacodefortheFrontDeskservice.Infact,theFrontDeskservice isasynchronousandmanagescommunica-tions between actors. The actors classes are generatedalso by using the services templates since they haveattributes, events, and messages, just like the services.TheDo_Searchservicecodeisbasedontheproducerandconsumer templates, since it is synchronous by defini-tion in themodeledscenario.Agetmethodretrievingacollection of documents is implemented from the gettertemplate.

Theroutineinvokedbythetransformationactionforstructentitiesperformsabreadth-firstexplorationofthemetadata tree in the visual model and attaches the cor-responding XUL code for displaying the struct node inthecorrectpositionwithinthegraphstructureoftheUI.

collections, while a single rectangleconnectedtoacollectionrepresentsadocumententity;thecircleslinkedto the document entity are thestruct (metadata) entities.Metadataentitiesare linked to thenode rela-tionships (organized as a tree) andlinked to the document entity by ametadataLinkTyperelationship.

The search service is synchro-nous (sync attribute set to “wait”).It queries the document collec-tion (get operation) looking for therequested document (using meta-data information provided by theborrow request), and waits for theresult of get (a collection of docu-ments). Based on this result, theservice returns a Boolean message“Is_Available,”whichisthenpropa-gatedasa response to the librarianand eventually to the student, asshowninfigure5.

When the library designer hasbuilt the model, the transformationprocess can be run, executing thecode generation actions associatedwith the entities and services represented in the model.The code generation process is based on template codesnippets generated from theAToM3 environment graphtransformationengine, following thegenerative rulesofthemetamodel.Wealsousepre–andpostconditionsonapplicationoftransformationrulestohavecodegenera-tiondependonverificationofsomeproperty.

ThegeneratedUIispresentedinfigure6.Ontherightside,thedocumentareaispresentedaccordingtotheXULtemplate. Documents are managed according to theirMIMEtype:thePDFfileoftheexampleisloadedwiththeappropriateAdobeAcrobatReaderplug-in.

OntheleftcolumnoftheUIarethreeboxes,accordingto the XUL template. The collection box—figure 6(B)—presentsthelistofdocumentscontainedinthecollectionspecifiedbythedocumentsattributeofthelibrarycollec-tionentity,andallowsuserstointeractwithdocuments.After selecting a document by clicking on the list, it ispresented in the document area—figure 6(A)—where itcanbemanaged(edit,print,save,etc.).

In the metadata box—figure 6(C)—the tree structureof themetadata isdepictedaccording to the categoriza-tionmodeledbythedesigner.TheXULtemplatecontainsall the basic layout and action features for managing atree structure. The generated box contains the parentand child nodes according to the attributes specified inthecorrespondingstructelements.Theusercanclickonthe root for compactingorexploding the treenodes;by

Figure 5. The Library model, alias the model of the Library society

182 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

workflow system. The Release collection maintains theimagefilesinapermanentstorage,whiledataiswrittento the target database or content management software,togetherwithXMLmetadatasnippets (e.g., tobestoredinXMLnativeDBMS).

A typical configuration would have the Recognitionservice running on a server cluster, with many Data-Entryservicesrunningondifferentclients(WebbrowsersdirectlysupportXULinterfaces).Whereascurrentdocu-ment capture environments are proprietary and closed,thedefinitionofanXML-basedinterchangeformatallowsthesuitableassemblyofdifferentcomponent-basedtech-nologiesinordertodefineacomplexframework.

The realization of the JDAN DL system within theCRADLEframeworkcanbeconsideredasapreliminarystepinthedirectionofastandardmultimediadocumentmanaging platform with region segmentation and clas-sification,thusaimingatautomaticrecognitionofimagedatabase and batch acquisition of multiple multimediadocumentstypesandformats.

Personal and collaborative spaces

Apersonalspace isavirtualarea(withintheDLsociety)that is modeled as being owned and maintained by auser including resources (document collections, services,etc.), or references to resources, which are relevant to atask,orsetoftasks,theuserneedstocarryoutintheDL.Personal spaces may thus contain digital documents inmultiple media, personal schedules, visualization tools,anduseragents(shapedasservices)entitledwithvarioustasks. Resources within personal spaces can be allocated

■■ Designing and Generating Advanced Collaborative DL Systems

InthissectionweshowtheuseofCRADLEasananalyti-caltoolhelpfulincomprehendingspecificDLphenomena,to present the complex interplays that occur betweenCRADLEcomponentsandDLconceptsinarealDLappli-cation, and to illustrate the possibility of using CRADLEas a tool to design and generate advanced tools for DLdevelopment.

Modeling Document images collections

WithCRADLE,thedesignercanprovidethevisualmodeloftheDLSocietyinvolvedindocumentmanagementandthe remaining phases are automatically carried out byCRADLEmodulesandtemplates.Wehaveprovidedtheuser with basic code templates for the recognition andindexing services, the data-entry plug-in, and archiverelease. The designer can thus simply translate the par-ticular DL society into the corresponding visual modelwithintheCRADLEvisualmodelingeditor.

Asaproofofconcept,figure7modelstheJDANarchi-tecture,introducedin“RequirementsforModelingDigitalLibraries,” exploiting the CRADLE visual language. TheRecognitionServiceperformstheautomaticdocumentrec-ognitionandstores the correspondingdocument images,together with the extracted metadata in theArchive col-lection. It interactswith theScanneractor, representingamachineorahumanoperatorthatscanspaperdocuments.Designers can choose their own segmentation methodor algorithm; what is required to be compliant with theframework is to produce an XDoclet template. It storesthedocumentimagesintotheArchivecollection,withitsdifferentregionslayoutinformationaccordingtotheXMLmetadata schema provided by the designer. If there is atleast one region marked as “not interpreted,” the Data-Entryserviceisinvokedonthe“notinterpreted”regions.

TheData-Entry serviceallowsOperators toevaluatethe automatic classification performed by the systemand edit the segmentation for indexing. Operators canalso edit the recognized regions with the classificationengine (included in the Recognition service) and adjusttheirvaluesandsizes.TheoutputofthisphaseisanXMLdescriptionthatwillbeimportedintheIndexingserviceforindexing(andeventuallyquerying).

TheArchivecollectionstoresallofthebasicinforma-tionkeptinJDAN,suchastextlabels,whiletheIndexingservice, based on a multitier architecture, exploitingJBoss3.0,hasaccess to them.Thisservice is responsiblefor turning the data fragments in theArchive collectionintousefulformstobepresentedtothefinalusers,e.g.,areportoraqueryresult.

The final stage in the recognition process could beto release each document to a content management or

Figure 6. The UI generated by CRADLE transforming the Library model in XUL and XDocLet code

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 183

and metadata, but also can shareinformation with the various com-mitteescollaboratingforcertaintasks.

■■ Evaluation

In this section we evaluate the pre-sented approach from three differentperspectives:usabilityoftheCRADLEnotation, its expressiveness, andusabilityofthegeneratedDLs.

usability of crADle Notation

We have tested it by using thewell known Cognitive Dimensionsframework for notations and visuallanguage design.48 The dimensionsare usually employed to evaluatethe usability of a visual languageor notation, or as heuristics to drivethe design of innovative visual lan-guages.The significant results areasfollows.

Abstraction GradientAnabstractionisagroupingofelementstobetreatedasoneentity.Inthissense,CRADLEisabstraction-tolerant.It provides entities for high-level abstractions of com-munication processes and services. These abstractionsare intuitive as they are visualized as the process theyrepresent (serviceswitheventsandresponses)andeasyto learn as their configuration implies few simple attri-butes.AlthoughCRADLEdoesnotallowusers tobuildnewabstractions,theE/Rformalismispowerfulenoughtoprovidebasicabstractionlevels.

closeness of MappingCRADLEelementshavebeenassignediconstoresembletheir real-world counterparts (e.g., a collection is repre-sentedasasetofpapersheets).Theelementsthatdonothaveacorrespondencewithaphysicalobject intherealworld have icons borrowed from well-known notations(e.g.,structsrepresentedasgraphnodes).

consistencyA notation is consistent if a user knowing some of itsstructure can infer most of the rest. In CRADLE, whentwoelements represent thesameentitybutcanbeusedeither as input or as output, then their shape is equalbutincorporatesanincomingoranoutgoingmessageinordertodifferentiatethem.See,forexample,theiconsforservices or those for graph nodes representing either a

according to the user’s role. For example, a conferencechair would have access to conference-specific materi-als, visualization tools and interfaces to upload papersfor reviewbya committee. Similarly,wedenoteagroupspace as a virtual area in which library users (the entireDL society) can meet to conduct collaborative activitiessynchronously or asynchronously. Explicit group spacesare created dynamically by a designer or facilitator whobecomes(orappoints)theownerofthespaceanddefineswhotheparticipantswillbe.Inadditiontodirectuser-to-usercommunication,usersshouldbeabletoaccesslibrarymaterials andmakeannotationson them for everyothergrouptosee.Ideally,usersshouldbeabletoact(andcarryDL materials with them) between personal and groupspacesoramonggroupspacestowhichtheybelong.

Itmayalsobethecase,however,thatagivenresourceis referenced in several personal or group spaces. Basicfunctionalityrequiredforpersonalspacesincludescapa-bilities for viewing, launching, and monitoring libraryservices, agents, and applications. Like group spaces,personalspacesshouldprovideuserswiththemeanstoeasily become aware of other users and resources thatarepresentinagivengroupspaceatanytime,aswellasmechanismstocommunicatewithotherusersandmakeannotationsonlibraryresources.

WeemployedthispersonalandgroupspaceparadigminmodelingacollaborativeenvironmentintheAcademicConferencesdomain,whereaConferenceChaircanhavea personal view of the document collections (resources)

Figure 7. The CRADLE model for the JDAN framwork

184 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

of “Sapienza” University of Rome (undergraduate stu-dents),showninfigure5,and(2)anapplicationemployedwith a project of Records Management in a collabora-tion between the Computer Science and the ComputerEngineering Department of “Sapienza” University, asshowninfigure7.

usability of the Generated tools

Environments for single-view languagesgeneratedwithAToM3 have been extensively used, mostly in an aca-demic setting, in different areas like software and Webengineering,modeling,andsimulation;urbanplanning;etc. However, depending on the kind of the domain,generatingtheresultsmaytakesometime.Forinstance,the state reachability analysis in the DL example takesa fewminutes;wearecurrentlyemployingaversionofAToM3thatincludesPetri-netsformalismwherewecantest the services states reachability.49 In general, fromapplication experience, we note the general agreementthat automated syntactical consistency support greatlysimplifies the design of complex systems. Finally, someusers pointed out some technical limitations of the cur-rentimplementation,suchasthefactthatitisnotpossibletoopenseveralviewsatatime.

Altogether,webelievethisworkcontributestomakemore efficient and less tedious the definitionand main-tenance of environments for DLS. Our model-basedapproach must be contrasted with the programming-centricapproachofmostCASEtools,wherethelanguageand the code generation tools are hard-coded so thatwheneveramodificationhastobedone(whetheronthelanguageoronthesemanticdomain)developershavetodiveintothecode.

■■ Conclusions and Future Work

DLs are complex information systems that integratefindingsfromdisciplinessuchashypertext, informationretrieval, multimedia, databases, and HCI. DL design isoften a multidisciplinary effort, including library staffand computer scientists. Wasted effort and poor inter-operability can therefore ensue. Examining the relatedbibliography, we noted that there is a lack of tools orautomaticsystemsfordesigninganddevelopingcoopera-tiveDLsystems.Moreover,thereisaneedformodelinginteractionsbetweenDLsandusers,suchasscenariooractivity-basedapproaches.

TheCRADLEframeworkfulfillsthisgapbyprovidingamodel-drivenapproachforgeneratingvisualinteractiontoolsforDLs,supportingdesignandautomaticgenerationofcodeforDLs.Inparticular,weuseametamodelmadeofdifferentdiagramtypes(collection,structures,service,and

structoranactor,withdifferentcolors.

Diffuseness/tersenessAnotationisdiffusewhenmanyelementsareneededtoexpress one concept. CRADLE is terse and not diffusebecauseeachentityexpressesameaningonitsown.

error-PronenessData flow visualization reduces the chance of errorsat a first level of the specification. On the other hand,somemistakescanbeintroducedwhenspecifyingvisualentities, since it is possible to express relations betweensourceandtargetmodelswhichcannotgeneratesemanti-cally correct code. However, these mistakes should beconsidered “programming errors more than slips,” andmaybedetectedthroughprogressiveevaluation.

Hidden DependenciesAhiddendependencyisarelationbetweentwoelementsthatisnotvisible.InCRADLE,relevantdependenciesarerepresentedasdataflowsviadirectedlinks.

Progressive evaluationEach DL model can be tested as soon as it is defined,withouthavingtowaituntilthewholemodelisfinished.ThevisualinterfacefortheDLcanbegeneratedwithjustoneclick,andservicescanbesubsequentlyaddedtotesttheirfunctionalities.

viscosityCRADLE has a low viscosity because making smallchanges in a part of a specification does not imply lotsof readjustments in the restof it.Onecanchangeprop-erties, events or responses and these changes will haveonlylocaleffect.Theonlylocalchangesthatcouldimplyperformingfurtherchangesbyhandaredeletingentitiesorchangingnames;however,thiswouldimplyminimalchanges (just removing or updating references to them)andwouldonlyaffectasmallsetofsubsequentelementsinthesamedataflow.

visibilityADLspecificationconsistsofasinglesetofdiagramsfit-ting inonewindow.Empirically,wehaveobservedthatthismodelusuallyinvolvesnomorethanfifteenentities.Different,independentCRADLEmodelscanbesimulta-neouslyshownindifferentwindows.

expressiveness of crADle

ThepaperhasillustratedtheexpressivenessofCRADLEbydefiningdifferententitiesendrelationshipsfordiffer-entDLrequisites.Tothisend,twodifferentapplicationshave been considered: (1) a basic example elaboratedwiththecollaborationoftheInformationScienceSchool

GeNerAtiNG cOllABOrAtive sYsteMs FOr DiGitAl liBrAries | MAliziA, BOttONi, AND leviAlDi 185

Retrieval(Reading,Mass.:Addison-Wesley,1999).17. D. Lucarella andA. Zanzi, “A Visual Retrieval Environ-

mentforHypermediaInformationSystems,”ACM Transactions on Information Systems 14(1996):3–29.

18. B. Wang, “A Hybrid System Approach for SupportingDigitalLibraries,”International Journal on Digital Libraries2(1999):91–110,.

19. D. Castelli, C. Meghini, and P. Pagano, “Foundations ofa Multidimensional Query Language for Digital Libraries,” inProc. ECDL ’02,LNCS2458(Berlin:Springer,2002):251–65.

20. R. N. Oddy et al., eds., Proc. Joint ACM/BCS Symposium in Information Storage & Retrieval(Oxford:Butterworths,1981).

21. K.Maly,M.Zubairetal.,“ScalableDigitalLibrariesBasedon NCSTRL/DIENST,” in Proc. ECDL ’00 (London: Springer,2000):168–79.

22. R. Tansley, M. Bass and M. Smith, “DSpace as an OpenArchivalInformationSystem:CurrentStatusandFutureDirec-tions,” Proc. ECDL ’03, LNCS 2769 (Berlin: Springer, 2003):446–60.

23. K.M.Andersonetal.,“Metis:Lightweight,Flexible,andWeb-Based Workflow Services for Digital Libraries,” Proc. 3rd ACM/IEEE-CS JCDL ’03 (Los Alamitos, Calif.: IEEE ComputerSociety,2003):98–109.

24. N.Dushay,“LocalizingExperienceofDigitalContentviaStructuralMetadata,”InProc. 2nd ACM/IEEE-CS JCDL ’02(NewYork:ACM,2002):244–52.

25. M.Gogollaetal.,“IntegratingtheERApproachinanOOEnvironment,”Proc. ER, ’93(Berlin:Springer,1993):376–89.

26. Heidi Gregersen and Christian S. Jensen, “TemporalEntity-Relationship Models—A Survey,” IEEE Transactions on Knowledge & Data Engineering11(1999):464–97.

27. B. Berkem, “Aligning IT with the Changes using theGoal-DrivenDevelopmentforUMLandMDA,”Journal of Object Technology4(2005):49–65.

28. A. Malizia, E. Guerra, and J. de Lara, “Model-DrivenDevelopmentofDigitalLibraries:GeneratingtheUserInterface,”Proc. MDDAUI ’06, http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-214/(accessedOct18,2010).

29. D.L.Atkinsetal.,“MAWL:ADomain-SpecificLanguageforForm-BasedServices,”IEEE Transactions on Software Engineer-ing25(1999):334–46.

30. J. de Lara and H. Vangheluwe, “AToM3: A Tool forMulti-Formalism and Meta-Modelling,” Proc. FASE ’02 (Berlin:Springer,2002):174–88.

31. J. M. Morales-Del-Castillo et al., “A Semantic Model ofSelective Dissemination of Information for Digital Libraries,”Journal of Information Technology & Libraries28(2009):21–30.

32. N. Santos, F. C.A. Campos, and R. M. M. Braga, “Dig-ital Libraries and Ontology,” in Handbook of Research on Digital Libraries: Design, Development, and Impact, ed.Y.-L.Thengetal. (Hershey,Pa.:IdeaGroup,2008):1:19.

33. F. Wattenberg, “A National Digital Library for Science,Mathematics, Engineering, and Technology Education,” D-Lib Magazine3no.10(1998),http://www.dlib.org/dlib/october98/wattenberg/10wattenberg.html (accessed Oct 18, 2010); L. L.Zia, “The NSF National Science, Technology, Engineering, andMathematicsEducationDigitalLibrary (NSDL)Program:NewProjects and a Progress Report,” D-lib Magazine, 7, no. 11(2002), http://www.dlib.org/dlib/november01/zia/11zia.html(accessedOct18,2010).

34. U.S.LibraryofCongress,AskaLibrarian,http://www.loc

society),whichdescribethedifferentaspectsofaDL.WehavebuiltacodegeneratorabletoproduceXULcodefromthe design models for the DL user interface. Moreover,we use template code generation integrating predefinedcomponents for the different services (XDoclet language)accordingtothemodelspecification.

ExtensionsofCRADLEwithbehavioraldiagramsandthe addition of analysis and simulation capabilities areunderstudy.ThesewillexploitthenewAToM3capabili-tiesfordescribingmultiviewDSVLs,towhichthisworkdirectlycontributed.

References

1. A.M.Gonçalves,E.AFox,“5SL:alanguagefordeclara-tivespecificationandgenerationofdigitallibraries,”Proc.JCDL ’02(NewYork:ACM,2002):263–72.

2. L. Candela et al., “Setting the Foundations of DigitalLibraries: The DELOS Manifesto,” D-Lib Magazine 13 (2007),http://www.dlib.org/dlib/march07/castelli/03castelli.html(accessedOct18,2010).

3. A.Maliziaetal.,“ACooperative-RelationalApproachtoDigitalLibraries,”Proc. ECDL 2007,LNCS4675(Berlin:Springer,2007):75–86.

4. E.A.FoxandG.Marchionini,“TowardaWorldwideDig-italLibrary,”Communications of the ACM 41(1998):29–32.

5. M. A. Gonçalves et al., “Streams, Structures, Spaces,Scenarios,Societies(5s):AFormalModelforDigitalLibraries,”ACM Transactions on Information Systems 22(2004):270–312.

6. J.C.R.Licklider,Libraries of the Future(Cambridge,Mass.:MITPr.,1965).

7. D.M.LevyandC.C.Marshall,“GoingDigital:ALookatAssumptions Underlying Digital Libraries,” Communications of the ACM38(1995):77–84.

8. R. Reddy and I. Wladawsky-Berger, “Digital Librar-ies: Universal Access to Human Knowledge—A Report to thePresident,”2001,www.itrd.gov/pubs/pitac/pitac-dl-9feb01.pdf(accessedMar.16,2010).

9. E.L.Morgan,“MyLibrary:ADigitalLibraryFrameworkand Toolkit,” Journal of Information Technology & Libraries 27(2008):12–24.

10. T.R.KochtanekandK.K.Hein,“DelphiStudyofDigitalLibraries,”Information Processing Management35(1999):245–54.

11. S.E.Howeetal.,“ThePresident’sInformationTechnologyAdvisoryCommittee’sFebruary2001DigitalLibraryReportandItsImpact,”InProc.JCDL ’01(NewYork:ACM,2001):223–25.

12. N.Reyes-FarfanandJ.A.Sanchez,“PersonalSpacesintheContextofOA,”Proc. JCDL ’03 (IEEEComputerSociety,2003):182–83.

13. M. Wirsing, Report on the EU/NSF Strategic Workshop on Engineering Software-Intensive Systems,2004,http://www.ercim.eu/EU-NSF/sis.pdf(accessedOct18,2010)

14. S. Kelly and J.-P. Tolvanen, Domain-Specific Modeling: Enabling Full Code Generation(Hoboken,N.J.:Wiley,2008).

15. H.R.TurtleandW.BruceCroft,“EvaluationofanInfer-ence Network-Based Retrieval Model,” ACM Transactions on Information Systems9(1991):187–222.

16. R.A.Baeza-Yates,B.A.Ribeiro-Neto,Modern Information

186 iNFOrMAtiON tecHNOlOGY AND liBrAries | DeceMBer 2010

.mozilla.org/En/XUL(accessedMar.16,2010).43. XDoclet, Welcome! What is XDoclet? http://xdoclet

.sourceforge.net/xdoclet/index.html(accessedMar.16,2010).44. W3C, Extensible Markup Language (XML) 1.0 (Fifth

Edition), http://www.w3.org/TR/2008/REC-xml-20081126/(accessedMar.16,2010);W3C,ResourceDescriptionFramework(RDF),http://www.w3.org/RDF/(accessedMar.16,2010).

45. H. Wada and J. Suzuki, “Modeling Turnpike FrontendSystem:AModel-DrivenDevelopmentFrameworkLeveragingUML Metamodeling and Attribute-Oriented Programming,”Proc. MoDELS ’05,LNCS3713(Berlin:Springer,2005):584–600.

46. I.Horrocks,Constructing the User Interface with Statecharts(Boston:Addison-Wesley,1999).

47. Universal Discover, Description, and Integration OASISStandard, Welcome to UDDI XML.org, http://uddi.xml.org/(accessedMar.16,2010).

48. T.R.G.GreenandM.Petre,“UsabilityAnalysisofVisualProgramming Environments: A ‘Cognitive Dimensions Frame-work,’”Journal of Visual Languages & Computing7(1996):131–74.

49. J. de Lara, E. Guerra, and A. Malizia, “Model DrivenDevelopmentofDigitalLibraries—Validation,AnalysisandFor-mal Code Generation,” Proc. 3rd WEBIST ’07 (Berlin: Springer,2008).

.gov/rr/askalib/(accessedonMar.16,2010).35. C.L.Borgmann,“WhatareDigitalLibraries?Competing

Visions,”Information Processing & Management25(1999):227–43.36. C. Lynch, “Coding with the Real World: Heresies and

Unexplored Questions about Audience, Economics, and Con-trolofDigitalLibraries,”InDigital Library Use: Social Practice in Design and Evaluation,ed.A.P.Bishop,N.A.VanHouse,andB.Buttenfield(Cambridge,Mass.:MITPr.,2003):191–216.

37. Y. Ioannidis et al., “Digital Library Information-Technol-ogy Infrastructure,” International Journal of Digital Libraries 5(2005):266–74.

38. E.A.Foxetal.,“TheNetworkedDigitalLibraryofThesesandDissertations:ChangesintheUniversityCommunity,” Jour-nal of Computing Higher Education13(2002):3–24.

39. H.VandeSompelandC.Lagoze,“Notes fromthe Inter-operability Front:A Progress Report on the OpenArchives Ini-tiative,”Proc. 6th ECDL, 2002,LNCS2458(Berlin:Springer2002):144–57.

40. F.DeRosaetal.,“JDAN:AComponentArchitecture forDigitalLibraries,”DELOS Workshop: Digital Library Architectures,(Padua,Italy:EdizioniLibreriaPeogetto,2004):151–62.

41. Definedasasetofactors(users)playingrolesandinter-actingwithservices.

42. Mozilla Developer Center, XUL, https://developer