Managing and sharing collaborative files through WWW

Download Managing and sharing collaborative files through WWW

Post on 02-Jul-2016




0 download

Embed Size (px)


  • Future Generation Computer Systems 17 (2001) 10391049

    Managing and sharing collaborative files through WWWRuey-Kai Sheu a, Yue-Shan Chang b,, Shyan-Ming Yuan a

    a Department of Computer and Information Science, National Chiao-Tung University, Hsin-Chu, Taiwan, ROCb Department of Electronic Engineering, Ming Hsing Institute of Technology, 1 Hsin-Hsing Road, Hsin-Fong, Hsin-Chu 304, Taiwan, ROC


    The increasing complexity and geographical separation of design data, tools and teams have demanded for a collaborativeand distributed management environment. In this paper, we present a practical system, the CFMS, which is designed to managecollaborative files on WWW. Wherever remote developers are, they can navigate the evolutions and relationships betweentarget files, check in and check out them conveniently on web browsers using CFMS. The capabilities of CFMS includethe two-level navigation mechanism for version selection, the relationship model that conceptually manages the relationshipsbetween physical files as well as the prefix-based naming scheme that can uniquely identify requested objects. 2001 ElsevierScience B.V. All rights reserved.

    Keywords: Collaborative file management; Prefix-based naming scheme; Relationship model

    1. Introduction

    The increasing complexity and geographical sepa-ration of design data, tools and teams have demandedfor a collaborative and distributed management envi-ronment. That is, companies need good managementtools to leverage the skills of manipulating the mosttalented and experienced resources, wherever in thecompany or the somewhere they are located.

    Major challenges for collaborative file managementinclude version control, configuration management,concurrency access control, environment heterogene-ity, and the dispersion of developers [1]. Take theelectronic design automation (EDA) paradigm, e.g.,design teams realize that the number of tools neededto implement complex systems is ever increasing,

    Corresponding author. Tel.: +886-3-557-2930;fax: +886-3-559-1402.E-mail addresses: (R.-K. Sheu), (Y.-S. Chang), Yuan).

    and such tools are generally provided by many dif-ferent suppliers [2]. At the same time, developersat different sites would parallelize the developmentswith many variants of each component, which mightcontain and share many files and many versions ofeach file. When controlling versions, developers haveto deal with versions and configurations that are or-ganized by files and directories. This is inconvenientand error-prone, since there is a gap between dealingwith source codes and managing configurations. It isnecessary for engineers to be capable of identifyingthe items which they are developing or maintaining.

    In this paper, we concentrate on issues of developinga web-based and visualized collaborative file manage-ment system. The web-based design has advantages ofeliminating the need to install and administrate clientsoftwares, resolving the cross-platform problems, pro-viding the accessibility for distributed developers, andhaving a uniform and compatible interface for differ-ent users. Not only does the proposed CFMS have allthe advantages of web-based environments, but alsocaptures and manages change requests among team

    0167-739X/01/$ see front matter 2001 Elsevier Science B.V. All rights reserved.PII: S0 1 6 7 -7 3 9X(01 )00045 -0

  • 1040 R.-K. Sheu et al. / Future Generation Computer Systems 17 (2001) 10391049

    members transparently. Through the proposed two-level navigation version selection mechanism, CFMSusers can define and choose the configuration ofthe developed component dynamically and visually.Without directly operating the bothersome chaos ofphysical files, users only need to manipulate the con-ceptual relationships between requested files on web.

    The rest of the paper is organized as follows. InSection 2, we discuss the related works about collab-oration management tools. In Section 3, CFMS sys-tem models are detailed. Section 4 briefly introducesthe CFMS system architecture. Section 5 shows theimplementation of CFMS. Finally, the conclusion isgiven in Section 6.

    2. Related works

    There are several collaboration management toolsin many paradigms. To illustrate, RCS [3], SCCS [4],DSEE [5], and ClearCase [6] concern the activitiesfor cooperative system development. RCS and SCCSare the most widely known version management sys-tem. These systems are built on the notion of vault[7] from where users must extract all of the sources,even you do not plan to change them. Besides, paral-lel development will result in even more copies. Thesecopies are not under the control of configuration sys-tems and may lead to more problems because they donot support concurrency control facilities. There areseveral later tools, such as CVS [8], built on top ofRCS. These tools improve the management of privatework areas, but do not really solve the fundamentalproblem inherent in vaults.

    DSEE and ClearCase are tools of the second-generation configuration management environment,which advance upon earlier tools for defining andbuilding configurations. They solve the problemsinherent in vaults through the use of a virtual file sys-tem. Intercepting native operating system I/O calls,they hook into the regular file system at user-definedpoints by user-specified rules. Intercepting systemcalls will tight the private work areas with the cen-tral data center together and introduce problemsencountered in traditional systems. Additionally, therule-based configuration mechanism is too complexto be used for web users.

    To solve problems of old systems we designthe CFMS as a moderate three-tier architecture. A

    middle-tier is introduced to decouple the front-endusers and the back-end file system, and simplifythe system complexity. In CFMS, a conceptual rela-tion model is proposed. Instead of the text-based orrule-based methodologies used in traditional tools,front-end users could traverse the relationship graphto define or navigate configurations The web-basedCFMS will promote the collaboration managementparadigm into the distributed heterogeneous environ-ment.

    3. System model

    Three essential system models which compose ofthe three-tier model, the relationship model and theprefix-based naming scheme for collaborative filemanagement are clearly identified in this section.These three models are orthogonal and are integratedto form the building blocks of the CFMS.

    3.1. Three-tier model

    In comparison with the vault-based and virtual filedesign of traditional collaboration management tools,the three-tier model looses the tight-couple relationbetween the client side representations of requestedobjects and the back-end file systems. In the front-end,users of different sites could define configurations fortheir private workspaces and use SCCS-like checkin/out operations to access the requested files. In themiddle-tier, the relationships between all the managedfiles are constructed and presented by a bi-directiongraph. Front-end users can surf back-end files bytraveling the relationship graph. The third-tier is thevast file system, which consists of many types of filesgenerated by different tools. Each physical file couldbe uniquely identified and conceptually versioned inthe second-tier. That is, the second-tier hides the com-plexity of directly manipulating the physical files andprovides higher flexibility and extensibility for CFMS.Fig. 1 shows the three-tier architecture of the proposedweb-based collaborative file management system.Front-end browsers show the user-selected configura-tion graphs, which might share nodes in the centralrelationship graph. The middle-tier hides the com-plexity to manage the relations for the third-tier flatfiles.

  • R.-K. Sheu et al. / Future Generation Computer Systems 17 (2001) 10391049 1041

    Fig. 1. Three-tier architecture of CFMS.

    3.2. Relationship model

    The CFMS uses logic structures to manage files,which are organized into a three-level hierarchicalarchitecture and they are projects, libraries and filesfrom the highest level down to the lowest one. Aproject is the achievement of a specific objective. Itinvolves several libraries as well as stand-alone files.A library is a frequently used group of files. The re-lationship model identifies the correlations betweenmanaged logic structures and classifies them into sixcategories: version, build, configuration, equivalence,hyper-link and sibling. The main contribution of thispaper is to identify and clarify these concepts intoorthogonal, rather than sequential relationships.

    Version. Version relationships are composed ofbranches and derived items, and can be representedby two-dimensional graphs. In Fig. 2, the generic

    Fig. 2. The generic version graph of a logic structure.

    version graph is described. Object A.1 is the root ofthe version history for the logic structure A. ObjectA.1.2 is a new version of A.1 after the creation timeof A.1.1. The relationship between A.1 and A.1.2 isthe branch relation. Other edges in the version historyare derived relations. The logic structure A could bea file, a library or a project.

    Build. Build relation stands for the aggregation ofa specific logic structure. Fig. 3 shows the versiongraph as well as the build relation of the logic structureA. The logic structure A is composed of two logicstructures X and Y. Both X and Y could be any logicstructures that are lower or equal to the level of A inthe hierarchy of logic structures.

    Equivalence. Equivalence relationships are dif-ferent representations of a logic structure in differ-ent development stages or platforms. For example,a C++ source program could be compiled as a

  • 1042 R.-K. Sheu et al. / Future Generation Computer Systems 17 (2001) 10391049

    Fig. 3. The build relation of the logic structure A.

    PC-based or workstation-based object files. Thesefiles are of the same meaning logically, but of differ-ent representations in the real world. In Fig. 4, weassume that X.1.Y.1 is created from X.1 and @.Y.1 iscreated from X.1.1.1 by tools which take X-versionedobjects as inputs and generate outputs objects ofthe same type as Y. Here, edge (X.1, X.1.Y.1) andedge (X.1.1.1, @.Y.1) are two equivalence relations.Where the @ notation is used to represent the prefixdiscussed in next section.

    Sibling. Siblings are relationships between parallel-developed versions of a logic structure. In other words,siblings are parallel version history paths of a logicstructure. Object Y.1, X.1.Y.1 and @.Y.1 are siblingsof Y in Fig. 4. When someone creates an new itemof Y type from X.1.1.1, @.Y.1 (@=X.1.1.1) is the

    Fig. 5. The configuration and hyper-link relation of the logic structure A.

    Fig. 4. The equivalence and sibling relation of the logic structureA.

    most adequate identity of version name than others.Neither Y.1.1.1 nor X.1.Y.1 is the parent of @.Y.1 forthe derived relation. If we connect the edge (X.1.Y.1,@.Y.1) or (Y.1.1.1, @.Y.1) with relations rather thansibling, users will be confused with them.

    Configuration. Configuration relationships consistof snapshots of higher-level logic structures. Fig. 5illustrates the snapshots of the logic structure A. Ob-ject A.1 consists of X.1 and Y.1. Object A.1.1 iscomposed of X.1.1 and Y.1. Object A.1.2.1 involvesX.1.1.1 and X.1.Y.1.

    Hyper-link. Hyper-link relation tights the manipu-lated files as well as the documentations that describethe related information of them. In Fig. 5, we showsthe examples of hyper-link relations that record the

  • R.-K. Sheu et al. / Future Generation Computer Systems 17 (2001) 10391049 1043

    hyperlinks to web pages describing related informa-tion, including why they are improved, when the ob-ject is updated, who did the changes, where did theauthor locate, how it is operated, etc.

    3.3. Prefix-based naming schemeEngineers must be able to identify the items which

    they are developing or maintaining. In particular, eachitem must have a unique name and the name mustbe meaningful. In CFMS, the proposed prefix-basedname has the meaning of the evolution of the item,the logic structure type and name, and the correspond-ing development stages. The prefix-based name is pri-marily used to help user to traverse and navigate themulti-dimensional relation graph. It can also be usedfor text-based query for version selection based onlogic structure names.

    The followings are the productions for version nameresolution, where stands for non-terminals and oth-ers are terminals. Version names can be abstracted asa prefix concatenating with the relationship name. Tobe specific, the version name of a new item for thederived relation will be the prefix, the original old ver-sion name, concatenating with a derived name. Theprefix-based naming scheme gives each relationship aunique name, respectively. Because the prefix-basednaming scheme gives a unique identity for each or-thogonal relation, the version name for each objectwill guarantee uniqueness even versioned objects havethe same prefix.

    The BNF of the prefix-based naming scheme:

    version name prefix.relationrelation equivalence|derived|branchbranch .(number of branches+ 1)derived .1|derivedequivalence prefix.base.(number of

    siblings+ 1)prefix base.derived|base.branch|

    version namebase logic name|logic name.1logic name P name|L name|

    F name|F name stagestage (number of development steps+

    number of platforms)name string of character set

    excluding the dot

    4. System architecture

    The CFMS is a three-tier architecture, as shownin Fig. 6. The front-end user interfaces are WWWbrowsers. Through HTTP, users download the JavaApplet into the client sites. The Java Applet is re-sponsible for drawing the partial relationship graphbased on configuration for users. The reason whyJava Applet is used here is that current SGML-basedstandard markup languages are not suitable to drawthe complex multi-dimensional relationship graph. Inthe middle-tier, the CFMS is composed of relation-ship manager, query manager, concurrency controlmanager and the resource manager. The relationshipmanager administrates the relations between collab-orative files. The query manager provides facilitiesfor text-based query. Users can submit search termsto select the logic structures that satisfy the searchcriteria. Concurrency control manager is responsiblefor the management of parallel developments on tar-get files. CFMS supplies the SCCS-like check in/outoperations to access files. It allows multiple read re-quests for a specific file at a given time. Only the userwho gets the write lock can update that file. Resourcemanager controls all the physical files through thephysical file I/O system calls.

    5. System implementation

    In this section, we discuss the most significantportions, which represent the major prominent dis-crepancies between CFMS and other collaborationmanagement tools.

    5.1. Logic structure

    Logic structure is the main idea to manage the logicrelationships of files. In CFMS, a database is used tokeep the mapping between r...