
pressing industrial and scientific problems. Recently, the use of high-performance computing techniques has led to the development of methods for building and accessing VLKBs with the faster processing and large memories enabled by modern supercomputers. Whereas entering the knowledge base for CYC has taken tens of person-years, these new techniques permit the automatic generation of VLKBs in much shorter times. In addition, new accessing techniques provide searches that are several orders of magnitude faster than serial algorithms for matching complex patterns on relatively unstructured data. Although these techniques may not obviate the need for CYC-like projects (after all, common sense is hard to find, even among people), they open many intriguing possibilities. Some examples are mentioned here:

1) Large case-based systems. Instead of solving problems from scratch, systems can solve new problems by analogy to previous solutions (2). Such "case-based" reasoning requires very large memories of previous problem solutions. Building these memories by hand can be an enormously difficult knowledge engineering task. Recent work has shown that large case bases can be automatically generated with AI techniques (3) and accessed extremely efficiently by parallel inferencing techniques (4).

2) Hybrid knowledge and databases. Many large corporate and scientific databases can be used to create AI knowledge bases. First, specific information about the domain of discourse is used to encode knowledge about the underlying characteristics and functions of objects in that domain. Following this, traditional database queries are used to create a knowledge base relating specific instances from the database to the more generic AI knowledge. The resulting hybrid knowledge and database can be used to combine searching and inferencing, with supercomputing techniques again providing efficient pattern-matching capabilities that are difficult to encode and inefficient to run in the unaugmented database. This technique is particularly important in applications where old data must be explored in novel ways to see if recently discovered patterns were previously present in the database (examples include epidemiology and pharmaceutical research).

3) Software agents. A recent innovation in AI technology is the creation of intelligent agents to help users explore complex unstructured information, such as that in the millions of documents distributed across the Internet. Although not yet a commercial technology, software agents are expected to become an important mechanism in providing access and navigation aids to the large amounts of information stored in so-called cyberspace. Current Internet agents, for example, provide knowledge-based interface tools for making the net more user friendly (5) and use AI techniques to help filter out vast amounts of irrelevant information (6). A recently started project in my laboratory, for example, focuses on the use of parallel inferencing techniques to provide a basis for creating agents that wander through the network, explore the information residing therein, and process it to create a large, knowledge-based memory for use by interface, search, and filtering agents.
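To make the memory-building step concrete, here is a minimal Python sketch of what such an agent might do once it has gathered a batch of documents: index them so that later interface, search, and filtering agents can query the result. The documents, identifiers, and query are invented for illustration; a real agent would crawl the network and build far richer knowledge structures than a word index.

```python
# Minimal sketch: an "agent" processes documents it has gathered and builds
# a simple knowledge-based memory (an inverted index) that later interface,
# search, and filtering agents could query. All data here is hypothetical.
from collections import defaultdict

def build_memory(documents):
    """Map each term to the set of document ids that mention it."""
    memory = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            memory[term].add(doc_id)
    return memory

def search(memory, query):
    """Return document ids containing every term of the query."""
    terms = query.lower().split()
    hits = [memory.get(t, set()) for t in terms]
    return set.intersection(*hits) if hits else set()

if __name__ == "__main__":
    docs = {
        "doc1": "parallel inferencing over very large knowledge bases",
        "doc2": "case based reasoning with large case memories",
        "doc3": "software agents filter irrelevant information",
    }
    memory = build_memory(docs)
    print(search(memory, "large knowledge"))   # {'doc1'}
```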

It is still early in the development stage of this exciting new technology, and much important research remains to be done. High-performance computing support for massive knowledge bases requires the exploration of several areas, which are currently receiving only minimal funding (despite the vast budget resources being put into the national information infrastructure). To continue scaling AI, we must explore the use of secondary storage in ways that are amenable to the irregular memory usage patterns of AI's knowledge-base technology (very different from the more regular accesses used in most scientific supercomputing). New inference classes and access mechanisms must be developed for efficiently exploring the ever larger knowledge repositories available to users. New techniques must also be found for mining the data stored in scientific and medical databases, and reasoning mechanisms must be developed for extending software agent technology, tailoring it to the creation and use of large knowledge-based memories. As these techniques become more commercially viable, we can expect to see the next great stride in the use of AI technology in industrial, government, and scientific applications.

References

1. D. Lenat and E. Feigenbaum, Artif. Intell. 47, 1 (1991).
2. J. Kolodner, Case-Based Reasoning (Morgan Kaufman, San Francisco, CA, 1993).
3. B. Kettler, J. Hendler, W. Andersen, M. Evett, IEEE Expert 9 (no. 1), 8 (1994).
4. M. Evett, J. Hendler, L. Spector, J. Parallel Distrib. Comput., in press.
5. O. Etzioni and D. Weld, Commun. ACM 37, 72 (1994).
6. P. Maes, ibid., p. 30.

Enterprise-Wide Computing

Andrew S. Grimshaw

The author is in the Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA.

For over 30 years, science fiction writers have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity. Until recently, such fantasies have been just that. Technological changes are now occurring that may expand computational power just as the invention of desktop calculators and personal computers did. In the near future, computationally demanding applications will no longer be executed primarily on supercomputers and single workstations dependent on local data sources. Instead, enterprise-wide systems, and someday nationwide systems, will be used that consist of workstations, vector supercomputers, and parallel supercomputers connected by local and wide-area networks. Users will be presented the illusion of a single, very powerful computer, rather than a collection of disparate machines. The system will schedule application components on processors, manage data transfer, and provide communication and synchronization so as to dramatically improve application performance. Further, boundaries between computers will be invisible, as will the location of data and the failure of processors.

To illustrate the concept of an enterprise-wide system, first consider the workstation or personal computer on your desk. By itself it can execute applications at a rate that is loosely a function of its cost, manipulate local data stored on local disks, and make printouts on local printers. Sharing of resources with other users is minimal and difficult. If your workstation is attached to a department-wide local area network (LAN), not only are the resources of your workstation available to you but so are the network file system and network printers. This allows expensive hardware such as disks and printers to be shared and allows data to be shared among users on the LAN.

With department-wide systems, processor resources can be shared in a primitive fashion by remote log-in to other machines. To realize an enterprise-wide system, many department-wide systems within a larger organization, such as a university, company, or national lab, are connected, as are more powerful resources such as vector supercomputers and parallel machines. However, connection alone does not make an enterprise-wide system. If it did, then we would have enterprise-wide systems today. To convert a collection of machines into an enterprise-wide system requires software that makes the sharing of resources such as databases and processor cycles as easy as sharing printers and files on a LAN; it is just that software that is now being developed.

The potential benefits of enterprise-wide systems include more effective collaboration by putting co-workers in the same virtual workplace, higher application performance as a result of parallel execution and exploitation of off-site resources, improved access to data, improved productivity resulting from more effective collaboration, and a considerably simpler programming environment for applications programmers.

Three key technological changes make enterprise-wide computing possible. The first is the much heralded information superhighway or national information infrastructure (NII) and the gigabit (10^9 bits per second) networks that are its backbone. These networks can carry orders of magnitude more data than current systems. The effect is to "shrink the distance" between computers connected by the network. This, in turn, lowers the cost of computer-to-computer communication, enabling computers to more easily exchange both information and work to be performed.
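As a rough illustration of what that extra bandwidth means (the link speeds below are assumptions chosen for the example, not figures from the article), moving a one-gigabyte data set drops from minutes to seconds when a conventional link is replaced by a gigabit link:

```python
# Illustrative only: compare transfer times for a 1-gigabyte data set
# over an assumed 10 Mb/s link and a gigabit (10^9 b/s) link.
data_bits = 1e9 * 8          # 1 gigabyte expressed in bits

for name, bits_per_second in [("10 Mb/s link", 10e6), ("1 Gb/s link", 1e9)]:
    seconds = data_bits / bits_per_second
    print(f"{name}: {seconds:.0f} seconds")   # ~800 s versus ~8 s
```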

The second technological change is the development and maturation of parallelizing compiler technology for distributed-memory parallel computers. Distributed-memory parallel computers consist of many processors, each with its own memory and capable of running a different program, connected together by a network. Parallelizing compilers are programs that take source programs in a language such as High-Performance Fortran and generate programs that execute in parallel across multiple processors, reducing the time required to perform the computation. Depending on the application and the equipment used, the performance improvement can be from a modest factor of 2 or 3 to as much as two orders of magnitude. Most distributed-memory parallel computers to date have been tightly coupled, where all of the processors are in one cabinet, connected by a special purpose, high-performance network. Loosely coupled systems, constructed of high-performance workstations and LANs, are now competitive with tightly coupled distributed-memory parallel computers on some applications. These workstation farms (1, 2) have become increasingly popular as cost-effective alternatives to expensive parallel computers.
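The transformation such a compiler automates can be pictured with a small hand-written stand-in: the computation below is split into chunks that separate workers evaluate independently, with the results gathered afterward. Worker processes on a single machine stand in for the nodes of a distributed-memory computer or a workstation farm; this is only an illustrative sketch in Python, not High-Performance Fortran or any particular compiler's output.

```python
# Hand-parallelized stand-in for what a parallelizing compiler generates:
# split an array computation into chunks and evaluate them concurrently.
# Local worker processes stand in for distributed-memory nodes.
from concurrent.futures import ProcessPoolExecutor

def work_on_chunk(chunk):
    """The 'loop body' applied to one processor's share of the data."""
    return [x * x for x in chunk]

def parallel_map(data, workers=4):
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(work_on_chunk, chunks))
    return [y for part in partials for y in part]   # gather results in order

if __name__ == "__main__":
    print(parallel_map(list(range(10))))   # [0, 1, 4, 9, ..., 81]
```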

The third technological change is the maturation of heterogeneous distributed systems technology (3). A heterogeneous distributed system consists of multiple computers, called hosts, connected by a network. The distinguishing feature is that the hosts have different processors (80486 versus 68040), different operating systems (Unix versus VMS), and different available resources (memory or disk). These differences and the distributed nature of the system introduce complications not present in traditional, single-processor mainframe systems. After 20 years of research, solutions have been found to many of the difficulties that arise in heterogeneous distributed systems.

[Figure: Sharing resources. The user's view of an ideal enterprise-wide system is one of a very powerful monolithic virtual machine that provides computational and data storage services. The user need not be concerned with the details of machine and processor type, data representation and physical location of processors and data, or the existence of other competing users. The user does not see system boundaries, only objects, both application objects, for example, executables, and data objects, such as files. Object location and representation are hidden.]

The combination of mature parallelizing compiler technology and gigabit networks means that it is possible for applications to run in parallel on an enterprise-wide system. The gigabit networks also permit applications to more readily manipulate data regardless of its location because they will provide sufficient bandwidth to either move the data to the application or to move the application to the data. The addition of heterogeneous distributed system technology to the mix means that issues such as data representation and alignment, processor faults, and operating system differences can be managed.
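Data representation is one of the differences that must be papered over; an 80486 and a 68040 do not even agree on the byte order of an integer. The sketch below is a generic illustration (not any particular system's protocol) of the usual remedy: values are converted to an agreed-upon byte order before they cross host boundaries.

```python
# Illustration of one heterogeneity issue: integer byte order differs
# across processor families, so values are converted to a common
# "network" (big-endian) representation before crossing host boundaries.
import struct

value = 305419896                        # 0x12345678

wire = struct.pack(">I", value)          # sender: encode as big-endian
print(wire.hex())                        # 12345678, regardless of the sender's CPU

received = struct.unpack(">I", wire)[0]  # receiver: decode the same way
assert received == value

# Interpreting the same bytes little-endian would yield a different number.
print(struct.unpack("<I", wire)[0])      # 2018915346 (0x78563412)
```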


Today enterprise-wide computing is just beginning. As yet, these technologies have not been fully integrated. However, projects are under way at the University of Virginia and elsewhere (4) that, if successful, will lead to operational enterprise-wide systems. For the moment, systems available for use with networked workstations fall into three non-mutually exclusive categories: (i) heterogeneous distributed systems, (ii) throughput-oriented parallel systems, and (iii) response time-oriented parallel systems.

Heterogeneous distributed systems are ubiquitous in research labs today. Such systems allow different computers to interoperate and exchange data. The most significant feature is the shared file system, which permits users to see the same file system, and thus share files, regardless of which machine they are using or its type. The single file-naming environment significantly reduces the barriers to collaboration and increases productivity.

Throughput-oriented systems focus on exploiting available resources in order to service the largest number of jobs, where a job is a single program that does not communicate with other jobs. The benefit of these systems is that available, otherwise idle, processor resources within an organization can be exploited. Although no single job runs any faster than it would on the owner's workstation, the total number of jobs executed in the organization can be significantly increased. For example, in such a system I could submit five jobs at the same time in a manner reminiscent of old-style batch systems. The system would then select five idle processors on which to execute my jobs. If insufficient resources were available, then some of the jobs would be queued for execution at a later time.

Response time-oriented systems are concerned with minimizing the execution time of a single application, that is, with harnessing the available workstations to act as a virtual parallel machine. The purpose is to more quickly solve larger problems than would otherwise be possible on a single workstation. Unfortunately, to achieve the performance benefits an application must be rewritten to use the parallel environment. The difficulty of parallelizing applications has limited the acceptance of parallel systems.
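A throughput-oriented system of the kind described above is essentially a matchmaker between submitted jobs and idle machines. The toy scheduler below, with invented workstation names, mimics that behavior: each job starts on an idle processor when one is available and is queued otherwise, and queued jobs are dispatched as processors free up. It is only an illustrative sketch, far simpler than any real scheduler.

```python
# Toy throughput-oriented scheduler: assign submitted jobs to idle hosts,
# queue the rest, and re-dispatch as hosts become free. Host and job names
# are made up for illustration.
from collections import deque

class ThroughputScheduler:
    def __init__(self, hosts):
        self.idle = set(hosts)       # hosts with no job running
        self.running = {}            # host -> job
        self.queue = deque()         # jobs waiting for a free host

    def submit(self, job):
        if self.idle:
            host = self.idle.pop()
            self.running[host] = job
            print(f"{job} started on {host}")
        else:
            self.queue.append(job)
            print(f"{job} queued")

    def job_finished(self, host):
        done = self.running.pop(host)
        print(f"{done} finished on {host}")
        if self.queue:               # hand the freed host to a waiting job
            self.running[host] = self.queue.popleft()
            print(f"{self.running[host]} started on {host}")
        else:
            self.idle.add(host)

if __name__ == "__main__":
    sched = ThroughputScheduler(["ws01", "ws02", "ws03"])
    for i in range(1, 6):            # submit five jobs to three workstations
        sched.submit(f"job{i}")
    sched.job_finished("ws01")       # the next queued job starts on ws01
```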

The Legion project at the University of Virginia is working toward providing system services that provide the illusion of a single virtual machine to users, a virtual machine that provides both improved response time and greater throughput (5). Legion is targeted toward nationwide computing. Rather than construct a full-scale system from scratch, we have chosen to construct a campus-wide test-bed, the campus-wide virtual computer, by extending Mentat, an existing parallel processing system. Even though the campus-wide system is smaller, and the components much closer together, than a full-scale nationwide system, it presents many of the same challenges. The processors are heterogeneous; the interconnection network is irregular, with orders of magnitude differences in bandwidth and latency; and the machines are currently in use for on-site applications that must not be hampered. Further, each department operates essentially as an island of service, with its own file system.

The campus-wide system is both a working prototype and a demonstration project. The objectives are to demonstrate the usefulness of network-based, heterogeneous, parallel processing to computational science problems; provide a shared high-performance resource for university researchers; provide a given level of service (as measured by turn-around time) at reduced cost; and act as a test-bed for the large-scale Legion. The prototype is now operational and consists of over 60 workstations from three different manufacturers in four different buildings. At the University of Virginia, we are using two production applications for performance testing: complib, a biochemistry application that compares DNA and protein sequences, and ATPG, an electrical engineering application that generates test patterns for very large scale integrated circuits. Early results are encouraging. Other production applications are planned.

I believe that projects such as Legion will lead to the widespread availability and use of enterprise-wide systems. Although significant challenges remain, there is no reason to doubt that such systems can be built. The introduction of enterprise-wide systems will result in another leap forward in the usefulness of computers. Productivity will increase owing to the more powerful computing environment, the tearing down of barriers to collaboration between geographically separated researchers, and increased access to remote information such as digital libraries and databases.

References and Notes

1. B. Buzbee, Science 261, 852 (1993).
2. J. A. Kaplan and M. L. Nelson, NASA Tech. Memo. 109025 (NASA Langley Research Center, Langley, VA, 1993).
3. D. Notkin et al., Commun. ACM 30, 132 (1987).
4. R. Rouselle et al., The Virtual Computing Environment: Proceedings of the Third International Symposium on High Performance Distributed Computing (IEEE Computer Society Press, Los Alamitos, CA, in press).
5. For more information on heterogeneous parallel processing, use Mosaic to access http://uvacs.cs.virginia.edu:/~mentat/science.html.
