
Building Your Internet Data Center

Dell Power Solutions
Issue 3, 2000 | $9.95
The Magazine for Direct Enterprise Solutions

Inside This Issue:
PowerEdge Clusters with Storage Area Networks
Microsoft .NET Enterprise Servers
High-Performance Internet Data Centers


EXECUTIVE NOTES

Dell Time Equals Uptime!

The phrase mission critical is an understatement when it comes to data centers and data center management in the modern business enterprise. The explosive growth of business and the unprecedented demand for data center resources create this mission-critical state. Therein lies the challenge: to balance, and manage, business growth and the demand for resources to create successful enterprises.

E-commerce has fueled today's economy with its continued expansion. Global Internet business is expected to grow at a compound annual rate of 100 percent, to almost $3.2 trillion by 2003 and to $6 trillion by 2004, according to Forrester Research. Companies currently deploying an Internet business model are seeking answers to the challenges of managing successful, mission-critical data centers.

Data Centers Tie Global Operations Together

Dell data centers are excellent examples of the mission-critical data center. These centers are the lifeblood of the enterprise, because they host the infrastructure for a $50 million per day online business. In addition, their 75,000 square feet of raised floor is home to business-critical Dell application servers, contact center platforms, and the data communications gear that ties the global operation together.

Reliability and stability in a distributed data center environment are the tenets of manageability and success for Dell. This means the creation, growth, and maintenance of a scalable infrastructure is strategic to the corporation. High-availability resources and a balanced business continuity plan are essential to the success of a $30 billion company with a 30 percent annual growth rate potential.

Technology Offers Answers

Today's technology landscape makes it possible to drive up data center effectiveness while driving down the total cost of ownership. Dell is positioned to assist you with the tools you need to meet these challenges.

We hope that you find this issue and the articles on data center best practices interesting, informative, and useful as you continue to plan and manage your data center environment.

John S. Craparo
Vice President, IT Operations
Dell Computer


Managing Editor Eddie Ho

Art Director Iva Frank

Designers Liz Fiorentino, Mark Mastroianni, Cynthia Webb

Publication Services
The TDA Group
Four Main Street, Suite 100
Los Altos, CA 94022

Subscriptions and Address Changes
Subscriptions are free to qualified readers who complete the subscription card found in each issue. To subscribe or change your address, complete and return the business reply card in this issue or visit us at www.dell.com/powersolutions.

About Dell Computer
Dell Computer Corporation, headquartered in Round Rock, Texas, near Austin, is the world's leading direct computer systems company. Dell is the number 2 and fastest growing among all major computer systems companies worldwide, with more than 26,100 employees around the globe. Dell uses the direct business model to sell its high-performance computer systems, workstations, and storage products to all types of enterprises. For more information, please visit our Web site at www.dell.com.


Dell Power Solutions is published quarterly by the Enterprise Systems Group, Dell Computer Corporation, One Dell Way, Round Rock, Texas 78682. This publication is also available online at www.dell.com/powersolutions. No part of this publication may be reprinted or otherwise reproduced without permission from the editor. Dell does not provide any warranty as to the accuracy of any information provided through Dell Power Solutions. The information in this publication is subject to change without notice. Any reliance by the end user on the information contained herein is at the end user's risk. Dell will not be liable for information in any way, including, but not limited to, its accuracy or completeness.

© Dell Computer Corporation. All rights reserved. Printed in the U.S.A.

September 2000

TABLE OF CONTENTS

Executive Notes
1   Dell Time Equals Uptime! by John S. Craparo

Editor's Comments
3   Data Centers Meeting the e-Business Challenge by Eddie Ho

Data Center Environment
4   Dell IT Data Center: A Standards and Best-Practices Perspective by Jim McGrath
7   Building High-Performance Data Centers by Jamie Gruener
11  Improving e-Business Server Availability by Rich Hernandez
17  Solving Server Bottlenecks with Intel Server Adapters by Gary Gumanow
22  Integrating PowerEdge Clusters with Storage Area Networks by Mike Kosacek and Edward Yardumian
28  NetWare Cluster Services: Deployment Considerations and Tuning by Richard Lang
34  Geographic Application and Data Availability with GeoCluster and MSCS by Andrew Thibault and David Demlow
39  Dell Installs NetWare 5.1 for Bay City Public Schools by Rod Gallagher
44  Microsoft Business Solutions Practice: A Deployment Perspective by Jim Plas
50  SQL Server Replication Revealed by Rudy Lee Martinez

Internet Environment
55  Next-Generation Internet: Microsoft .NET Enterprise Servers and Windows 2000 by Darcy Gibbons Burner
59  Choosing the Right Internet Traffic and Content Management Solution by David Barclay
65  Linux Firewalls on Dell PowerApp Servers by Zafar Mahmood

Product Review
70  A Look at Eight-Way Server Scalability by John Bass

High Performance Computing
77  Design Choices for a Cost-Effective High-Performance Beowulf Cluster by Jenwei Hsieh, Ph.D.

Enterprise Management
81  Migrating to Windows 2000—Automatically by Phil Neray
85  Windows 2000 Desktop Deployment at Dell by Max Thoene
91  Start Here…for Fast, Reliable Operating System Installation by Geoff Meyer
95  Windows NT Performance Monitor: A Practical Approach by Paul Del Vecchio

Knowledge Management
101 Outlook Web Access: Bringing Exchange 2000 Server to the Masses by David Sayer


EDITOR'S COMMENTS

Data Centers Meeting the e-Business Challenge

The enormous impact of the Internet is creating a shift in business computing. Companies can no longer afford to focus primarily on their back-end systems and internal computing requirements. Instead, they must redirect significant resources to establish and enhance their entire Internet infrastructure, including the customer-facing applications. Already, these changes are having a major effect on business strategies, creating new criteria for success that virtually every IT organization will have to address.

The focus of this Power Solutions issue is achieving 100 percent uptime with the integration of applications, tools, and infrastructure. Achieving 100 percent uptime requires effective tactics and strategies that address single-server bottlenecks, "scaling out" and "scaling up," single-site designs, and multisite architectures. This issue provides insight and discusses best practices in uptime strategies for mission-critical e-business data centers.

The Changing Role of Traditional Data Centers

Traditional data centers have provided businesses with well-behaved and predictable computing environments. Capabilities could be added in a relatively planned fashion. The emergence of the Internet introduced a new computing model based on fault resilience through distributed and redundant systems and services.

Today, these two computing models are in conflict as IT organizations scramble to link their traditional data centers to the Internet. Many advanced systems that were built for "glass house" reliability are being stressed to the point of failure by new e-business requirements.

Scaling Out for Performance and Availability

One solution to this problem can be found in the Internet model itself. Companies can "scale out"; that is, add affordable servers to the existing infrastructure, which in turn adds server-level redundancy throughout the computing infrastructure.

This scale-out strategy has significant advantages. Computing resources can be expanded in a modular fashion to more precisely match e-business growth and to meet changing demands caused by the ongoing Internet revolution. New applications and services also can be deployed more quickly, and workloads can be balanced across multiple systems to improve performance and availability.

This incremental approach helps businesses take more rapid advantage of the technological innovations that are continuously arising in the horizontal server marketplace. Once an appropriate level of redundancy has been established by scaling out, a scale-up strategy can be used. Scaling up means consolidating existing servers with larger systems to meet increasing demand while managing fewer servers. By scaling out and scaling up within a planned and controlled architecture, businesses can benefit from the economics of volume while quickly deploying the most cost-effective and efficient solutions available.

As more critical systems are deployed outside the data center, the customer-facing systems become the new centerpiece of e-business. It is therefore essential that IT organizations incorporate the best data center practices throughout their entire computing environment. The best of both deployment models can be achieved by adding systems designed for Internet-style TCP/IP redundancy and by applying the proven management techniques of more traditional data centers to deliver more robust and highly available systems.

This issue also continues the High-Performance Computing (HPC) series on Beowulf clusters by Dr. Hsieh and includes another server leadership review, covering the PowerEdge 8450, written by John Bass for Network World.

We hope you enjoy this issue. Be sure to drop me a note if you achieve a major milestone on one of your more important projects; I am sure our readers would like to hear about your experience.

Eddie Ho
Managing Editor
www.dell.com/powersolutions

DATA CENTER ENVIRONMENT

Dell IT Data Center: A Standards and Best-Practices Perspective

The challenges for information technology departments involve not only the effective and efficient deployment of the latest technologies, but also procedures and processes that ensure success. The standards and best practices used in the Dell data center reflect an ongoing emphasis on providing robust, stable systems, efficiently and cost-effectively.

By Jim McGrath

The challenges that face information technology (IT) departments that are bombarded with ever-changing technologies present a significant test for today's systems engineers. The ability to keep up with the latest technologies and, at the same time, deliver robust, stable systems is an even greater task. Add to the equation that Dell is one of the fastest growing companies in the world, and you can bet that success is only achieved through well-designed and thoroughly tested standards and best practices.

Standards and best practices differ in important ways. Standards are detailed technical guidelines used to establish uniformity in a functional computing area. Departmental standards are usually created through a formal process based on the work of a cooperative group or committee of subject matter experts. De facto standards are those whose status is conferred by their use in the marketplace.

Best practices, on the other hand, identify the optimal implementation and uses of particular technologies. These practices are documented and shared in both formal and informal ways by technical experts involved in the operational use of the technologies.

Standards and best practices are important in an era of declining budgets, when managers feel increasing pressure to clearly demonstrate added value and cost savings when implementing new services and technologies. The challenge for all companies is to ensure that technical infrastructures serve the varying mandate requirements of each functional area while reliably providing quality service to users both inside and outside the company. Furthermore, the infrastructure must be affordable and sustainable over its life cycle and have the ability to evolve or scale to meet new user requirements or technological advances.

The New, Emerging Standards

Today, the existing organizations and established processes that develop standards are generally recognized as too slow to be effective. New processes and organizations have emerged to provide the framework for developing new standards. Certain standards, however, will continue to be developed exclusively within the traditional standards organizations, such as the American National Standards Institute (ANSI®), which provides standards for the database Structured Query Language (SQL).

All standards have strengths and weaknesses. Every standard has a use and a user, but no single standard or set of standards can satisfy the requirements of all users, in all places, at all times. Using appropriate standards and best practices can help departments balance the need for technical flexibility to meet organizational objectives with the need to maintain a level of compatibility that enables users to share or exchange resources. Properly utilized, standards and best practices can assist organizations in their efforts to cope successfully with emerging new technologies.

It is advantageous to use standards and current best practices whenever possible, to further organizational goals and to encourage consistency across departments and between the company and its business partners. However, standards and current best practices alone are not sufficient to ensure interoperability. We also will need to share technical information, understand user needs, and communicate and cooperate across the organization.

The Data Center Environment

The Windows NT® environment encompasses all functional business units, such as manufacturing, service, sales, finance, and human resources. Larger applications are built and deployed following the N-tier architecture, keeping each layer specific to its purpose (only the database software and data reside on database servers), whereas simpler applications with a smaller footprint may share a single server and contain the application and data all in one instance.

Service level agreements determine the technologies that will be used to meet system availability requirements. If an application can be unavailable for a period that allows us to perform a recovery from tape, then we can save the cost of additional hardware. Applications that require less downtime may need clustering, or possibly even online disaster recovery, to provide redundancy and quicker recovery.

Approximately 12 percent of Dell IT servers require high availability using technologies such as Microsoft® Windows® Load Balancing Service (WLBS), Microsoft Cluster Server (MSCS), Oracle® Fail Safe (OFS), Oracle Parallel Server (OPS), or GeoCluster from Sunbelt Software. The hardware architecture includes Dell® PowerEdge® servers with SCSI controllers and PowerVault® Fibre Channel technology, along with direct-attached, storage area network (SAN), network-attached storage (NAS), and tape storage subsystems.

Dell IT Methodology

Dell's approach to standards in the Windows NT environment was to create consistency and efficiency in the build process, with a strong focus on reducing human intervention wherever possible. We used these steps as a guideline:

1. Develop an image of the hard drive of each server model; that is, one image for each model (PowerEdge 6300, 4300, or 2400)
2. Develop detailed how-to documents that demonstrate the hardware assembly and configuration (including screenshots)
3. Develop checklists for tracking the step-by-step procedures necessary to complete the build process

The server image includes the Windows NT operating system and only those third-party software products that have been thoroughly tested for compatibility. We tried to restrict changes to the environment during deployment to only those that had been integrated into the processes and were part of our overall standards.

The design of our system images involved a combination of the following:

• Vendor reviews of the configuration and setup process
• Experiences from the industry via subject matter experts
• Extensive reviews of available white papers
• Lessons learned from the production environments

Figure 1 shows the iterative refinement of the build process that is required to move a system image to "gold" status, which makes it production ready.

Each change to an image goes through rigorous regression testing using scripted test plans. These test plans are reviewed by engineering, deployment, and operations support teams to ensure an extensive review of any possible failure.

Once an image is considered final, our customers (deployment engineers) perform a final audit before the image is accepted for the production environment. After the audit is complete, the image is considered production ready and is given to the technical writers for process documentation.

Each document details the step-by-step build process, including a troubleshooting section that discusses the most common problems faced by deployment and operations engineers. The purpose of this document is not to teach engineers how to deploy servers, but how to build and configure the servers and storage systems. To confirm the accuracy of the documents, we use nontechnical personnel who follow the procedures line by line. When the build is completed, an engineer audits the system and works with the tech writers to make any required changes.

The engineering group continues to update drivers, patches, and firmware; in addition, the group looks for areas to improve processes and techniques. The complex setup for clustering, failover, and load balancing usually remains consistent, whereas software and firmware improvements require more frequent changes. Updates are limited to quarterly releases of the images to keep the process manageable, with full regression testing for backward compatibility.

Partnerships and Collaboration

In Dell IT, we maintain close relationships with our global IT partners for information sharing and collaboration on our engineering efforts. We hold regularly scheduled sessions for sharing technology and published system images for the Americas and worldwide regions (40-bit and 128-bit encryption), and we maintain a global knowledge base that is shared among groups.

Dell and its partners collaborate closely on the configuration and setup of operating systems, hardware, and third-party products, down to the vendor's review and buyoff of the internal documentation, along with regularly scheduled operational readiness reviews. In the past, software vendors have pushed for upgrading to the latest version of their product; however, when they are involved in defining the most effective combinations to produce a single solution, the resulting configuration can be more easily supported.

Successful Deployments Involve Both Technology and Procedures

Successful deployments are ultimately based on effective solutions that relate as much to the deployment as they do to the specific technology being deployed. Many technology variables affect any system: hardware, software, data or environmental dependencies, and software technologies. Many factors that affect successful deployment, however, are organizational and procedural. Each must be developed through systematic, carefully planned, and correctly applied procedures.

Following established methods and standards has enabled Dell to deploy complex, state-of-the-art technologies successfully with engineers who have limited experience. At the same time, Dell has developed efficiencies in the deployment process.

Jim McGrath ([email protected]) is a senior manager in Dell IT Operations with responsibility for Desktop, Infrastructure, Server, and Storage Systems Engineering. Jim has over 20 years of experience in the industry: 14 years prior to Dell as a manager of Information Systems and four years as president of an Austin-based software company.

[Figure 1. The Process for Moving a System Image to Gold Status: a flowchart running from initial requirements gathering, building the base image, and regression tests, through adding on layered products, a vendor readiness assessment, and a deployment/operations team audit; at each pass/fail gate, failures are analyzed and corrected (or requirements are reanalyzed and changes implemented) before the image is declared GOLD.]

DATA CENTER ENVIRONMENT

Building High-Performance Data Centers

The introduction of Microsoft SQL Server 2000 is a milestone in the race to build the next generation of Internet data centers. These new data centers are made up of tiers of servers, now commonly referred to as server farms, which generally are divided into client services servers (Web servers), application/business logic servers, and data servers supporting multiple instances of databases such as SQL Server 2000.

By Jamie Gruener

Traditionally, database and network administrators differed on deployment strategies, each focused on their own requirements for improving application and network performance and scalability, independent of each other. Today, database and network administrators need to join strategies as the dependency between applications and the underlying networking increases with the heavy demands of the Internet. The choice of a network is now integral to ensuring the performance and scalability of databases, as well as the database-dependent applications within the data center.

Recognizing the impact that the underlying network can have on database acceleration and scalability, Microsoft has implemented the Virtual Interface (VI) Architecture standard and support for Giganet® VI-based networks within SQL Server 2000 Enterprise Edition (EE). The VI standard defines efficient communications among application servers (such as Web, database, and enterprise resource planning (ERP) servers), bypassing many of the limitations of traditional networks. VI delivers high throughput, reduced system overhead, and lower latency to application and database server communication. Over 130 companies support the VI standard, including Intel® Corporation, Microsoft Corporation, Dell Computer, and Giganet. The underlying benefits of VI enable cost-effective, commodity systems networked together within server farms to support the higher performance and scalability requirements of the data center.

This article examines how Dell customers can build high-performance data centers, using SQL Server 2000, Dell PowerEdge servers, and Giganet cLAN™ VI-based networks to scale out their database and application environments. Key to the success of these deployments is understanding the database and application networking requirements, the benefits of VI, and how SQL Server 2000 EE optimizes its performance and scalability using Giganet's cLAN networks.

Communications Bottlenecks Slow Server Performance

Interconnecting a group of servers, workstations, and I/O devices as a server farm or scalable computing cluster presents a unique set of communication challenges. Most network interconnect technologies support multilayered communications protocols designed to deliver messages to geographically distributed, heterogeneous end points.


To ensure reliable communications between nodes on a network, multilayered communications protocols require that data be copied several times. The two communicating servers must execute several protocols and exchange multiple messages during the data transmission between servers. It is this communication that adds latency on the data center network and consumes valuable server CPU resources that could otherwise support more users and faster application processing.

VI Architecture Clears the Way

The VI Architecture was created to accelerate server and workstation I/O by providing a more effective communication technique. Because this architecture has lower system overhead and lower server CPU utilization, it provides crucial performance boosts to database servers, which are generally CPU-bound in their performance.

VI introduces a standards-based application programming interface (API) for low-latency, high-bandwidth message passing between interconnected servers within a cluster or server farm. Low latency and sustained high bandwidth are achieved by avoiding intermediate copies of data and bypassing the operating system when sending and receiving messages.

VI eliminates CPU interrupts by transferring data directly from memory onto network interface cards (NICs), rather than through the operating system's traditional network stack. For example, while traditional networks may use 7,000 instructions to do routine data transfers between servers, VI networks can accomplish the same tasks in 50 instructions (see Figure 1). VI expedites application-to-database communications, improving the overall responsiveness of the database and the scalability of the application.

VI provides the following performance boosts:

• Higher throughput for faster application-to-application communications
• Reduced system overhead for greater scalability
• Lower message latency (delays) for greater application responsiveness

Microsoft's Scale-out Strategy

Microsoft's SQL Server 2000 EE supports the VI Architecture as one of its key features to scale out SQL database environments. EE also supports several other reliability and scalability features that are directly affected by the performance boosts offered by VI, including distributed partitioned views (enabling federated databases) and four-node failover cluster support.

Key to Microsoft's scale-out strategy is distributed partitioned views, complemented by the performance achieved through the support of VI. With distributed partitioned views, SQL Server 2000 shares the database load across a server farm by horizontally partitioning the SQL Server data. These servers cooperate in managing the partitioned data but operate autonomously, a step toward shared-nothing clusters.

The partitioning of data within SQL Server 2000 is transparent to applications accessing the database. All servers communicate with each other across a data center VI-enabled network (that is, Giganet cLAN), and they process both queries and updates, distributing scans and updates as needed. SQL Server 2000 EE contains several enhancements that make these views updatable.[1]

By scaling out SQL Server 2000 with the use of Giganet VI-enabled networks, customers can now divide (or partition) the database workload across multiple commodity-based servers. By scaling SQL Server 2000 distributed partitioned views across multiple servers, enterprises can add servers only when workloads require it, building server farms as data center demands increase. This approach is more cost-effective than buying larger symmetric multiprocessing (SMP) systems with resources that may never be fully utilized. Server farms can also provide better reliability by having no single point of failure. This approach allows SQL Server 2000 environments, combined with VI-enabled networks, to set new industry benchmark performance records, described in more detail below.
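To make horizontal partitioning concrete, the sketch below shows one way a routing layer could decide which federation member owns a given row. SQL Server 2000 performs this routing transparently inside the database engine; the key ranges, server names, and route function here are hypothetical, for illustration only.

```python
# Hypothetical sketch of horizontal partitioning across a federation.
# SQL Server 2000 does this routing internally; the boundaries and
# server names below are illustrative assumptions.

from bisect import bisect_right

# Each member server owns a contiguous range of customer IDs:
# IDs up to (but not including) the boundary belong to that server.
PARTITION_MAP = [
    (100_000, "sqlfed01"),
    (200_000, "sqlfed02"),
    (300_000, "sqlfed03"),
]

def route(customer_id: int) -> str:
    """Return the federation member that owns this customer ID."""
    boundaries = [boundary for boundary, _ in PARTITION_MAP]
    idx = bisect_right(boundaries, customer_id)
    if idx == len(PARTITION_MAP):
        raise ValueError(f"customer_id {customer_id} is outside all partitions")
    return PARTITION_MAP[idx][1]

if __name__ == "__main__":
    for cid in (42, 150_000, 299_999):
        print(cid, "->", route(cid))  # sqlfed01, sqlfed02, sqlfed03
```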

SQL Server 2000 databases communicating with application servers through VI boost application performance by increasing the responsiveness of database queries, lowering system overhead, and lowering the latency required for the communication. SQL Server 2000 with VI support benefits all client interface models, including open database connectivity (ODBC), DB Library (DBLIB), and OLE DB. Applications written to these interfaces will be able to take advantage of the low system overhead, low latency, and high throughput provided by VI networks.

[Figure 1. VI Bypasses the OS and Protocol Stacks: in the traditional solution, application data passes through the socket interface, TCP/IP stack, and device driver to reach the network hardware, taking about 7,000 instructions; in the Giganet VI solution, the application goes through the VI API directly to the network hardware in about 50 instructions.]

[1] More information about Microsoft's federated database approach can be found at http://msdn.microsoft.com/msdn-online/start/features/highperform.asp

Reliable, Highly Available SQL Server Clusters

Other SQL Server 2000 EE availability features include two- or four-node VI failover support that provides enhanced high availability for SQL Server nodes in a cluster. Giganet cLAN VI-enabled networks support SQL Server clustering high-availability failover to ensure continuation of service and the availability of data. Mission-critical applications, such as backup, can take advantage of this high-availability environment.

Scaling Out SQL Server 2000 with Giganet VI-based Networks

Giganet makes host adapters (NICs) and 8-port and 30-port switches that enable customers to scale their distributed database and application server environments as their needs change, with an easy-to-install, high-performance network. Giganet cLAN networks also include a management console to monitor data transmission across the VI-enabled network. Dell is a Giganet partner and provides cLAN to customers needing high-performance server farms.

Demonstrated Performance

SQL Server 2000 with VI Architecture support has been tested with Giganet cLAN products, demonstrating impressive performance in SAP™ R/3®, TPC-C® (Transaction Processing Performance Council Class C), and application-specific benchmarks. When interconnecting application servers and Web servers to SQL Server 2000 EE, system overhead can be reduced by as much as 40 percent, resulting in increased processing resources and faster database response times. The availability of greater processing resources also increases the number of users supported by 45 percent. Mission-critical services such as network backup demonstrate a minimum 50 percent performance improvement. While performance will vary in Dell customer deployments, these gains demonstrate that Microsoft and Giganet support of VI provides the foundation for a more powerful data center environment.

Specific TPC-C and application benchmarks show that VI can significantly boost database and application performance. Dell recently published a TPC-C benchmark using a single SQL Server 2000 EE database server (a Dell PowerEdge 8450) to produce a record-leading performance of 57,014.93 tpmC while supporting 46,000 users. A performance test conducted with Baan™ also demonstrated dramatic performance improvements with VI networks. Using SQL Server 2000 with VI, benchmark engineers replaced a Gigabit Ethernet network with Giganet cLAN, which boosted server capacity by 40 percent.

Implementing SQL Server 2000 and Giganet cLAN Networks

SQL Server 2000 and Giganet cLAN networks will dramatically boost database and application performance within the data center network. In addition to increasing the performance of the SQL Server 2000 database itself, Giganet cLAN improves the efficiency of application server to database server communications. As a result, online transaction processing (OLTP), ERP, data warehousing, and e-commerce environments can send and receive data faster and without the delays caused by message latency or CPU overload.

Deployment scenarios, many of which can be implemented across the same Giganet VI network, include:

• Federated SQL Server databases (distributed partitioned views): Scaling the SQL Server 2000 database workload across multiple servers in a near shared-nothing approach to clustering called a "federation" of independently administered servers that cooperate to jointly manage the workload (see Figure 2)

• SQL Server 2000 cluster databases: Providing a high-availability VI network for a clustered two- or four-node SQL Server 2000 database with Windows 2000 Datacenter Server with faster recovery time, as well as for the application and backup servers that connect to that network (see Figure 3)

• Redundant networks: Supporting high-availability applications to ensure continuation of service and the availability of data, including redundant backup servers (see Figure 4)

• Multitier database server to application server networks: Delivering a VI network backbone between SQL Server 2000 and application/business logic tier servers as well as Web servers, providing high throughput, low latency, and low system overhead across the data center network

• Backup networks for SQL Server 2000: Combining new backup features of SQL Server 2000, which enable differential backup between SQL servers and backup standby servers and integrate log shipping into database maintenance, with a VI network that eliminates network congestion and thus the need for backup windows (Giganet integrates with backup software such as Computer Associates® ArcServeIT™); see Figure 5

[Figure 2. Distributed Partitioned Views in Federated SQL Server Databases: multiple SQL Server 2000 database servers, each attached to PowerVault storage.]


VI Architecture Boosts Performance in Data Center Networks

The VI Architecture, supported by Microsoft SQL Server 2000, Giganet, and Dell, provides demonstrated performance boosts for today's data center networks, enabling a new generation of data centers that combine Intel-based servers like Dell's PowerEdge servers with new network backbones like Giganet cLAN VI networks.

VI empowers SQL databases to scale out over multiple servers and improves the data flow between the database and application servers. Giganet cLAN and SQL Server 2000 with VI provide a network and database infrastructure within Internet data centers for high-performance, scalable, and reliable data delivery.

Jamie Gruener ([email protected]) is manager of E-Commerce Marketing at Giganet, where he manages ISV marketing. Previously, he was managing director of Windows 2000 Platforms at Aberdeen Group, Inc.

[Figure 3. Clustered SQL Server 2000 Databases: SQL Server 2000 database servers attached to shared PowerVault storage.]

[Figure 4. Redundant Giganet cLAN Server Farm Networks: Web and application servers behind an Internet router connect to redundant Microsoft SQL Servers featuring dual NIC support, redundant network fabrics, automatic failover, and high-availability backup, backed by PowerVault storage.]

[Figure 5. Backup Networks for SQL Server 2000: Web servers, application servers, and database servers connect through the Internet router tier to backup servers and PowerVault tape libraries.]

For more information, technical support, and service, contact Giganet at 978-461-0402 or visit www.giganet.com.


DATA CENTER ENVIRONMENT

Improving e-Business Server Availability

High availability and high performance are mandatory in information infrastructures. This article describes advanced network driver functions available with Dell servers that help enterprises improve server availability and performance through link aggregation, load balancing, and fault tolerance.

By Rich Hernandez

Client/server networks are increasingly important tools for daily business operations in small businesses as well as large enterprises. Network servers provide vital computing services and access to vital corporate data. Given the mission-critical nature of servers, it is not surprising that reliability heads the list of the most important considerations in selecting a server, ahead of price/performance, network functionality, and I/O speed.

The growing use of application and Web servers in the data center to handle mission-critical tasks has increased the importance of server reliability and resiliency. Server availability is critical because downtime means lost revenue. A PC Week survey of 400 large companies indicated that annual downtime costs exceed $700,000 for systems with 99.9 percent availability (see Figure 1).

Single-homed servers (those with one interface to the network) represent a single point of failure that significantly affects overall server availability. Enterprises use several approaches to improve reliability. The most common is to provide redundancy for the critical components in a server.

Dell servers ship with RAID controllers that provide disk subsystem resiliency and include redundant power supplies and cooling systems. In a typical server, one adapter provides connectivity to a network. If the server loses the link or the adapter fails, all users lose connectivity to the server.

One way to avoid loss of connectivity is to use multiple adapters configured as network teams or groups. These are similar to RAID configurations, which rely on a team of disks to improve the reliability of disk access. Redundant teams of network interface devices play an important role in building high-availability server systems.

High Availability, High Performance

Use of redundant devices is far from the only way to enhance reliability. Advanced network driver functions available with Dell servers yield a significant improvement in the server's ability to provide uninterrupted service.

This article describes three advanced capabilities implemented by means of an intermediate driver for Microsoft Windows NT® 4.0, Microsoft Windows 2000 Server, and Novell® NetWare® 5.x: link aggregation, load balancing, and fault tolerance. The article focuses on Fast Ethernet and Gigabit Ethernet adapters, since over 90 percent of all network servers are connected via these two technologies.

Link aggregation is a method of combining multiple physical network links into a single logical link. For example, two 10/100 Mbps network interface card (NIC) ports can be combined into one team or group for a maximum combined capacity of 400 Mbps full-duplex (100 Mbps per port in each direction). The network and software stack will perceive these two NIC ports as one virtual adapter capable of maintaining highly available network connections while improving network performance to the server. Network bandwidth can scale incrementally as more ports are added to a team or group.

When one or more channels in a group fail, the software automatically detects the failure (link or hardware failure) and rebalances the traffic across the remaining links without loss of data. Once the failed link is restored, traffic is automatically reconfigured to use all active network links. This load balancing is transparent to the end user, who experiences no host protocol timeouts and no dropped sessions.

A fault-tolerant NIC team eliminates single points of failure. The fault-tolerant team provides dynamic failover across multiple redundant connections to the network. When a bad cable, a lost link, or a failed adapter causes a failure on the primary network interface device (NID) link, the intermediate driver software switches to the secondary adapter.

Link Aggregation Methods

Dell servers support a number of vendor-proprietary NIC implementations for link aggregation, load balancing, and fault tolerance. These include Cisco® Fast EtherChannel® (FEC) and Gigabit EtherChannel (GEC), Intel Advanced Network Services (iANS), 3Com® DynamicAccess®, and Alteon® Fault Tolerance. The Intel, 3Com, and Alteon implementations are switch independent; FEC and GEC require a matching link partner that supports FEC or GEC.

In addition, in March 2000 the Institute of Electrical and Electronics Engineers (IEEE®) approved the 802.3ad Port Aggregation standard (http://standards.ieee.org). The standard offers link aggregation, or trunking, for increased bandwidth and failover between links in a group for redundancy. All NIC vendors and switch vendors are expected to adopt the 802.3ad standard in order to ensure multivendor interoperability of link aggregation technology.

Increased throughput and availability can be realized with the use of real-time automatic load balancing and failover. The aggregate bandwidth is a function of the number of NIC ports supported by the software and the number of PCI slots in the server.

Network Interface Device

The NID is an electronic device that provides a connection point into the network. The NID is implemented as a LAN-on-motherboard (LOM) integrated solution or as a standup PCI I/O adapter that fits into an expansion slot inside a computer. The NID operates at Open Systems Interconnection (OSI) Layer 1 (the physical layer) and Layer 2 (the data-link layer) to format the data coming from the OS software stack as required by the protocol being used (for example, Ethernet or Token Ring). The data is then transferred from the transmitting station to the destination specified by the destination Media Access Control (MAC) address.

The MAC address is a unique 48-bit number programmed into the adapter during manufacture. It is part of the Ethernet frame header, as shown in Figure 2. In IEEE 802 networks, the MAC layer is one of the sublayers that make up the data-link layer of the OSI reference model. The NID provides the MAC and physical layer implementations, as shown in Figure 3. Ethernet is a LAN protocol that operates at the physical and data-link layers. Consequently, each different type of network requires its own MAC layer and, in general, a different adapter or NIC.

A network interface in a computer consists of network protocols, the NID (NIC or LOM), network drivers, and the driver standards interface. Network protocols are a set of rules or procedures that govern the format and timing of data transmission between two nodes. The most commonly used are TCP/IP and Internet Packet Exchange/Sequenced Packet Exchange (IPX/SPX).

An NID connects a node, such as a computer, to a network. It implements the MAC and the physical interface. An Ethernet NID consists of a physical layer (PHY) transceiver and a MAC application-specific integrated circuit (ASIC), as shown in Figure 4.

A device driver is a program that translates between a device and the programs that use it. The driver software connects an NID to the network protocols. The computer can then use an NID to send and receive data over a network.

Figure 1. Annual Downtime Hours and Costs (Source: PC Week)

Availability (%)    Downtime (hours)    Annual Cost ($)
100                 0                   0
99.99               0.88                7,000
99.9                8.76                736,400
99.5                43.80               3,679,200
99                  87.60               7,358,400
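The downtime column of Figure 1 follows directly from the 8,760 hours in a year, as this minimal sketch shows; the dollar figures come from the survey itself and are not derivable from availability alone.

```python
# Sketch: derive the downtime column of Figure 1 from availability.
# A year has 365 * 24 = 8,760 hours; downtime is the unavailable
# fraction of that. Costs are survey results, not computed here.

HOURS_PER_YEAR = 365 * 24  # 8,760

def annual_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year at a given availability percentage."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

if __name__ == "__main__":
    for pct in (100, 99.99, 99.9, 99.5, 99):
        print(f"{pct:>6}% -> {annual_downtime_hours(pct):6.2f} hours/year")
    # 99.9% availability -> 8.76 hours, matching Figure 1.
```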

Figure 2. Ethernet Frame Header

Preamble (8 bytes) | Destination MAC Address (6 bytes) | Source MAC Address (6 bytes) | Data Length (2 bytes) | Data (46–1500 bytes) | Frame Check (4 bytes)


The OS defines the interface between the network protocol and the driver. Microsoft systems use the Network Driver Interface Specification (NDIS); for example, Windows NT uses NDIS4 and Windows 2000 uses NDIS5. NetWare operating systems use the Open Data-Link Interface (ODI).

These driver specifications support protocol multiplexing; that is, multiple protocol stacks can coexist in the same host. For example, Windows NT and Windows 2000 systems can simultaneously support IP, IPX, and NetBIOS Enhanced User Interface (NetBEUI) over the same NID. In addition, NDIS supports multiple NDIS-conforming network drivers and NIDs in the same system.

Miniport and Intermediate Drivers

The I/O Manager interacts with a miniport driver (see Figure 5). The NDIS miniport driver is part of a family of networking driver standards that includes LAN, WAN, and intermediate driver standards.

Miniport drivers are restricted from calling functions supplied by the I/O Manager. Instead, they use the interfaces and functions provided by the NDIS Wrapper. The wrapper performs common processing, while the miniport driver handles hardware-specific interactions.

The NDIS Wrapper, or library (see Figure 6), is a kernel-mode dynamic-link library (DLL) that implements the programming interface for components that refer to it, such as the transport driver interface (TDI) drivers and the NDIS intermediate drivers. It defines the structure and interfaces used by the NDIS drivers.

A call-and-return mode of operation is used when drivers call functions in the NDIS Wrapper. The wrapper isolates the NDIS miniport and intermediate drivers from the OS, provides common functions to reduce the required size of the miniport drivers, and handles a set of network-specific functions.

The NDIS intermediate driver processes data from the network protocols; in other words, the network protocols are bound to the intermediate drivers, while the device hardware is bound to the miniport driver. The intermediate driver serves multiple purposes by implementing specialized protocols or advanced functions such as NID load balancing, NID fault tolerance, LAN emulation over asynchronous transfer mode (ATM), and packet scheduling for quality of service (QoS) and virtual LAN (VLAN) tagging.

For example, a prescan driver on NetWare or an intermediate driver on Windows NT functions as a shim between the miniport base driver and the protocol stack. The intermediate driver hides the multiple physical adapters from the OS, creating a virtual driver for each network team. The intermediate driver advertises the MAC address of the primary adapter.

Spanning Tree Protocol Considerations

The IEEE 802.1d Spanning Tree Protocol (STP) prevents loops on a bridged or switched network. STP is required because bridged or switched networks lack a time-to-live (TTL) mechanism that can eventually stop looping packets.

[Figure 3. Windows NT Networking Model: the seven OSI layers mapped to Windows NT components, from the NIC hardware and NIC driver at the physical and data-link layers (with the NDIS interface at the media access control sublayer), through IP, TCP, and UDP and the transport driver interface, up to redirectors, servers, and TCP applications; the I/O Manager and Executive services run in kernel mode.]

[Figure 4. Network Interface Card: host bus interface (PCI), EEPROM, integrated MAC and PHY, magnetics, LED, and RJ-45 connector.]

[Figure 5. Miniport Driver: the I/O Manager communicates with the miniport driver through the miniport wrapper.]

STP determines the topology of a network by assigning costs to interfaces, working toward the root of the tree (one of the bridges or switches in the network). When multiple paths exist, STP causes the switch to forward data over the most efficient path while blocking any other redundant interfaces. If the most cost-effective interface fails, STP automatically reconfigures the network to block the failed interface and forward packets over another path that leads to the same destination.

Although STP can enable failover among redundant links, the protocol is limited to the bandwidth of a single link. In addition, STP's convergence time can be fairly high, up to 30 to 50 seconds, depending on the topology and complexity of the network. In general, STP should be disabled on the trunked ports connected to a server with multiple NIDs running link-aggregation and load-balancing software.

Load-Balancing Methods

A group, team, or array of NIDs consists of multiple devices, each of which has a unique MAC address. When a device is transmitting data over an Ethernet network, the machine name is converted to an IP address with a name resolution method such as Domain Name Service (DNS), Windows Internet Name Service (WINS), or a host file. The Address Resolution Protocol (ARP) then maps the IP address to the MAC address of the destination. When the MAC address of the destination is known, the device can transmit a packet over the network.

When teams of multiple ports are used for load balancing, a selection algorithm is required to determine which of the network ports should be used for transmission. Network load balancing can be implemented on a server to control only the outgoing traffic, or both outgoing and incoming packets. The latter typically requires support from the switch device at the other end of the link.

Intermediate driver developers can select from a number of different algorithms that balance the traffic by selecting a different interface to transmit the data for a given session. The algorithms include round robin, MAC address, IP address, and IP address plus TCP port address.

Round-Robin Algorithm

In a round-robin implementation, the intermediate driver selects an NID port for each packet, starting with the first port in the network group. The next packet is sent over the following port, and so on. As shown in Figure 7, the round robin starts over again with the first NID after the last NID port in the group is used.

Round robin is a simple algorithm, and it guarantees that the traffic load is equally distributed across all the network links while minimizing CPU processing. However, since each frame in a client/server session is transmitted over a different link, frames can reach clients out of order. Frames received out of order may have to be retransmitted, and retransmission lowers the actual throughput.

To ensure that frames do not arrive out of order at the destination when the round-robin algorithm is used, distribute sessions rather than individual frames across the physical links in the trunked link, as sketched below. A session is a stream of frames with the same source and destination addresses.
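Here is a minimal sketch of both transmit policies just described: per-packet rotation, which can reorder frames within a session, and per-session assignment, which pins each source/destination pair to one link. The class and port names are hypothetical; real teaming drivers implement this logic in kernel mode.

```python
# Sketch of the two round-robin flavors discussed in the text.
# Illustrative only; actual intermediate drivers run in the kernel.

from itertools import cycle

class RoundRobinTeam:
    def __init__(self, ports):
        self.ports = ports
        self._rotation = cycle(ports)
        self._sessions = {}  # (src, dst) -> assigned port

    def pick_per_packet(self) -> str:
        """Rotate to the next port for every packet (frames may reorder)."""
        return next(self._rotation)

    def pick_per_session(self, src: str, dst: str) -> str:
        """Pin all frames of a session (same src/dst pair) to one port."""
        key = (src, dst)
        if key not in self._sessions:
            self._sessions[key] = next(self._rotation)
        return self._sessions[key]

team = RoundRobinTeam(["nic1", "nic2", "nic3", "nic4"])
print([team.pick_per_packet() for _ in range(6)])    # nic1..nic4, then nic1, nic2
print(team.pick_per_session("2.2.2.1", "1.1.1.1"))   # constant for this session
```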

Some implementations use the round-robin algorithm to provide load balancing for traffic as it is received. The intermediate driver software responds to or generates an ARP request with a different MAC address, as shown in Figure 8. The client then uses this MAC address to send packets to the server. The server sends each client a different MAC address in response to an ARP request.

[Figure 6. Windows NDIS Wrapper: the NDIS library, or wrapper, sits between the TDI protocol drivers (TCP/IP, NetBEUI, IPX/SPX) and the NDIS miniport, intermediate, and WAN drivers, above the hardware abstraction layer (HAL). TDI clients include RDR (the networking redirector), SRV (the networking file server), AFD (the kernel-mode side of the Winsock interface), and NPFS (the named pipe file system).]

[Figure 7. Balancing Traffic with the Round-Robin Method: the intermediate driver distributes packets 1 through 6 across a four-NIC load-balancing team in rotation, wrapping back to the first NIC driver.]

MAC Address Algorithm

An alternative to the round-robin method is the use of the MAC, or Layer 2, address. The algorithm executes a hashing function on the destination MAC address to ascertain the outgoing NIC. (Some products, such as Cisco FEC and GEC, use both the source and destination MAC addresses.) All frames reach the destination in order, since all frames in a session are sent out over the same link.

But since the algorithm balances MAC addresses rather than traffic, the load may not be equally balanced across all links. In theory, one link could reach 100 percent utilization while other links in a group have low utilization. In practice, most clients in a client/server environment use comparable amounts of bandwidth when connecting to the server with teamed NIDs.

Cisco FEC groups together multiple full-duplex 802.3u Fast Ethernet links to provide port aggregation and fault tolerance for server connections to a network via a Cisco FEC-capable switch. A source and destination MAC address pair resolves the link used to establish the connection. The link is selected as the result of an XOR operation on the last two bits of the source MAC address and the destination MAC address. Therefore, bidirectional traffic between a client and server will be forwarded over the same link.
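A sketch of that selection rule follows: XOR the low-order bits of the two MAC addresses and use the result to index into the aggregated links. The two-bit mask corresponds to a four-link bundle; this illustrates the scheme the text describes and is not Cisco's actual implementation, which runs in switch and NIC hardware.

```python
# Sketch of FEC-style link selection: XOR the last two bits of the
# source and destination MAC addresses to pick one of up to four links.

def mac_to_int(mac: str) -> int:
    return int(mac.replace(".", "").replace(":", ""), 16)

def fec_link(src_mac: str, dst_mac: str, n_links: int = 4) -> int:
    """Return the 0-based index of the link used for this src/dst pair."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) & (n_links - 1)

# Because XOR is symmetric, both directions of a session hash to the
# same link, as the text notes:
print(fec_link("00d0.0000.0001", "002c.0000.0002"))
print(fec_link("002c.0000.0002", "00d0.0000.0001"))  # same result
```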

For traffic from the server (outgoing), no load balancing is possible when the client and the server are on separate subnets and traffic must be forwarded by the default gateway (router). When the destination does not reside on the local network, the ARP request is forwarded to a router, and the router responds with a proxy ARP using its own MAC address. In the example shown in Figure 9, when the server initiates a connection to either Client 1 or Client 2, the traffic is always transmitted over the same network link in the server, since the destination address is always the router's.

IP Address Algorithm

The destination IP address, or Layer 3 address, can be used instead of the MAC address as a port selection method for outbound packet transmission. The benefits of channel assignments based on the IP address are that client sessions across a router will be assigned to different ports in a team, and packets will reach the client in order.

IP Address and TCP Port Address Algorithm

The TCP port address of the packet can be used in addition to the destination IP address to ensure that different sessions from a client are assigned to different ports in the team. As a result, the traffic load is balanced not only across the IP clients, but also at the application socket level. If a single client is running multiple applications or multiple instances of an application, the traffic will not all be forwarded over the same interface. Instead, each application socket will be balanced across all the ports in the team.
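The sketch below extends the same idea to Layers 3 and 4: hashing on the destination IP alone pins each client to one port, while folding in the TCP port lets separate application sockets from the same client land on different links. The hash functions are illustrative assumptions, not any vendor's implementation.

```python
# Sketch: port selection keyed on destination IP, or on IP plus TCP port.

import ipaddress

def pick_by_ip(dst_ip: str, n_ports: int = 4) -> int:
    """Every session to one client uses the same team port."""
    return int(ipaddress.ip_address(dst_ip)) % n_ports

def pick_by_ip_and_port(dst_ip: str, tcp_port: int, n_ports: int = 4) -> int:
    """Different sockets on the same client can use different team ports."""
    return (int(ipaddress.ip_address(dst_ip)) ^ tcp_port) % n_ports

client = "2.2.2.1"
print(pick_by_ip(client))                 # same port for every session
print(pick_by_ip_and_port(client, 1028))  # separate sockets from one client
print(pick_by_ip_and_port(client, 1029))  # can land on different ports
```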

[Figure 8. Interface Selection Process: (1) the client ARPs with the server IP to obtain a MAC address; (2) the server performs a hash function on the information in the ARP; (3) the server sends the ARP response over the primary NIC with the MAC address of the selected NIC; (4) any traffic from the server is sent over the selected NIC.]

[Figure 9. FEC Example of MAC Address Load Balancing: Client 1 (2.2.2.1, MAC 00c0.0000.0004) and Client 2 (3.3.3.1, MAC 0090.0000.0003) connect through switches and a router to a server (1.1.1.1, MAC 00b0.0000.0001) with a NIC team; the router interfaces are 2.2.2.0 (002c.0000.0003), 3.3.3.0 (002c.0000.0004), and 1.1.1.0 (002c.0000.0002).]

Connection initiated by   Source MAC Address   Destination MAC Address   Client   Load Balance
Server                    00d0.0000.0001       002c.0000.0002            1        No (uses the same link)
Server                    00d0.0000.0001       002c.0000.0002            2        No (uses the same link)
Client                    0090.0000.0003       002c.0000.0002            1        Yes (uses a different link)
Client                    00c0.0000.0004       002c.0000.0002            2        Yes (uses a different link)


Load-Balancing Teams
The intermediate driver processes packets when a network virtual team is operating in a system. Slight performance degradation is detected as higher CPU utilization and lower throughput. Performance suffers because the intermediate driver must perform additional processing to run the dynamic balancing algorithm and queue packets in multiple NICs. Typically the performance penalty is no greater than five percent.

A number of factors contribute to the bandwidth scalability of a virtual NIC team, but the number of CPUs and their speed are probably the most important. For example, four 10/100 adapters in a load-balancing team will consume approximately 100 percent of a 400 MHz CPU. Therefore, throughput will be constrained unless more or faster CPUs are installed in the system.

Fault-Tolerant Methods
Grouping a primary adapter and one or more secondary adapters in a logical or virtual team of adapters creates a fault-tolerant team. Failover from the primary to the secondary adapter requires that the secondary adapter take the MAC address of the failed adapter. Failover is triggered when no activity or link is detected on the primary adapter. The failover time depends on the time the NIC takes to switch MAC addresses. For effective fault tolerance, this time must be fast enough to prevent application session time-outs, so that failover is transparent to end stations, routers, and network protocols.
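The sequence can be sketched in a few lines of Python; this is a toy model with an assumed Nic class, since real failover happens inside the driver.

import time

class Nic:
    """Toy NIC model for illustration only."""
    def __init__(self, mac, link_up=True):
        self.mac, self.link_up, self.active = mac, link_up, False

def failover(primary, secondary, poll=1.0, checks=3):
    for _ in range(checks):               # watch for loss of link or activity
        if not primary.link_up:
            secondary.mac = primary.mac   # secondary assumes the failed MAC
            secondary.active = True       # sessions continue transparently
            return "failover complete"
        time.sleep(poll)
    return "primary healthy"

team = (Nic("00b0.0000.0001"), Nic("00b0.0000.0002"))
team[0].link_up = False                   # simulate a pulled cable
print(failover(*team))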

Teaming Examples
Multiple NICs in servers offer the benefit of traffic segmentation and failure isolation. As shown in Figure 10, a failure on NIC1 does not affect the connectivity on the links of the other NICs. Only the clients on the NIC1 subnet are affected. However, this configuration is not fault tolerant and cannot scale easily. In addition, no bandwidth is available beyond what each NIC can provide.

To use multiple network ports in a server more effectively, an enterprise can create a logical or virtual adapter by grouping together multiple physical adapters linked by an intermediate driver (see Figure 11). The software stack in the OS treats such teams as one logical adapter. If a link or physical adapter fails, traffic is dynamically rebalanced over the remaining adapters in the case of a load-balancing team or shifted from the primary adapter to the secondary adapter for a fault-tolerant team.

Highly Available Network Strategy
The data network is a critical component of the business infrastructure, and it is rapidly assuming greater responsibility in the daily operations of most businesses. Business-critical information typically flows over the network from servers to clients. Therefore, high availability and high performance in the IT infrastructure are mandatory.

The IT strategy for a highly available network includes network link aggregation, load balancing, and fault tolerance, especially on the server. Link aggregation scales the available bandwidth by grouping multiple physical links together to form a single logical or virtual link. The redundant links provide both load balancing and fault tolerance. Traffic flows are redirected around a failed NIC or cable without interrupting applications.

Rich Hernandez ([email protected]) is a senior engineer with the Server Networking and Communications Group within the Departmental and Workgroup Server Division at Dell. Rich has been in the computer and internetworking industry for 16 years. He has a B.S. in Electrical Engineering from the University of Houston and has engaged in postgraduate studies at Colorado Technical University.

[Figure 10. Server Physical Segmentation: NIC1 through NIC4 in the server each connect to a separate network segment (Segment1 through Segment4).]

[Figure 11. Load-Balancing Team: NIC1 through NIC4 in the server are grouped into a single team serving Segment1 through Segment3.]


Solving Server Bottlenecks with Intel Server Adapters

This article discusses how advanced server adapter technologies address two key issues of server performance: server bottlenecks and downtime caused by link failure. These advanced technologies contribute to a simple and cost-effective server solution, providing scalable bandwidth and automatic failover connections for a faster and more dependable network link.

By Gary Gumanow

D A T A  C E N T E R  E N V I R O N M E N T

As the sophistication and importance of network-based applications continue to grow, pressures are mounting on network servers. The Internet, corporate intranets, databases, video conferencing, and other high-bandwidth applications are placing heavier demands on server performance and network bandwidth. At the same time, more users are dependent upon these applications to do their work. When a key server fails or slows down, it can hamper productivity for many users. In many cases, sales and other customer interactions suffer as well.

Two major causes of server downtime include:

• Server bottlenecks, resulting from bandwidth-intensive applications, such as the Web, video conferencing, or intranets; more powerful PC network connections (faster bus technologies and 100 Mbps adapters); and more high-performance clients attached to the network

• Failed network connections, resulting from broken or loose cables, hub or switch port failures, adapter hardware breakdown, or PCI slot malfunctions

Fortunately, enterprises can significantly increase server bandwidth without a major network overhaul. Load balancing across multiple Fast Ethernet or Gigabit Ethernet server adapters provides a simple and scalable solution. Since these load-balancing technologies automatically support redundant network links, they increase server availability as well as performance.

The sidebar describes each of these advanced technologies that provide both scalability and high availability for servers.

Resolving the Server Bottleneck
As sophisticated applications and more powerful desktop PCs drive network traffic to new levels, a single 100 Mbps channel does not provide sufficient bandwidth for critical server connections—particularly with the increase in the number of desktops connected at 100 Mbps.

One previous solution to server bottlenecks was installing an additional NIC in the server and segmenting the network into two subnetworks (see Figure 1). This reduced the traffic volume on each network link and eliminated the bottleneck.

[Figure 1. Increase Server Bandwidth by Segmenting the Network: a single 100 Mbps server adapter and switch becomes two 100 Mbps server adapters on separate switches. Segmentation typically requires additional hardware, reassigning of IP addresses, and repeated reconfiguration to balance the load.]

Segmentation, however, poses a new set of problems, including additional overhead and the need to reassign IP addresses and remap the network. Segmentation generally requires additional hardware such as switches or routers. Balancing traffic on the two segments also can be difficult, since it often requires repeated reconfiguration. Finally, since the two adapters operate in separate network segments, they do not provide a failover connection if a link fails.

Creating a Scalable Network Connection
Adaptive Load Balancing (ALB) offers a simple method to move more data between the server and the network. ALB can increase server bandwidth up to 800 Mbps by automatically balancing data transmission across as many as eight network adapters (see Figure 2 for a 400 Mbps example).

Each additional adapter adds another 100 Mbps link to the network. Since the traffic distribution among the adapters is automatic, there is no need to segment or reconfigure the network. All adapters share the existing IP address and the traffic is always balanced between them. ALB also can be used over Gigabit Ethernet links, providing throughput up to 8 Gbps.

ALB is implemented by installing a team of server adapters into the server. The adapters can be quickly configured to run ALB using the Intel PROSet utility. This requires no client configuration, and clients do not have to be routed to communicate with each other. Moreover, the multiple adapters provide automatic emergency backup links to the network. If one server link goes down because of a broken cable, bad switch port, or failed adapter, the other adapters automatically accept the additional load. No interruption occurs in server operation, and a network alert informs IT staff of the problem.

ADVANCED SERVER ADAPTER TECHNOLOGIES

Advanced server adapter technologies provide scalable server bandwidth through load balancing and automatic redundant connections for increased server availability. These technologies generally are not supported in desktop adapters.

Adapter Fault Tolerance (AFT). AFT monitors the server connection to the network and automatically switches traffic to a redundant link if a failure occurs.

• Mixed Adapter Teaming—Enables one kind of server adapter to be used as a redundant backup link for a different kind of server adapter

• Mixed Speed Teaming—Enables a 100 Mbps server adapter to be used as a backup link for a Gigabit server adapter

PCI HotPlug. PCI HotPlug is an industry standard that enables a failed network adapter to be replaced without taking the server off-line.

Adaptive Load Balancing (ALB). ALB supports scalable bandwidth up to 800 Mbps, or 8 Gbps in a Gigabit Ethernet environment.

Link Aggregation. Link Aggregation supports scalable bandwidth up to 800 Mbps full-duplex or up to 8 Gbps in a Gigabit Ethernet environment. The network interface card (NIC) and the switch must support Link Aggregation.

Fast EtherChannel (FEC). FEC supports scalable bandwidth up to 1600 Mbps at full-duplex. The NIC and switch must support FEC.

Gigabit EtherChannel (GEC). GEC supports scalable bandwidth up to 16 Gbps at full-duplex. The NIC and switch must support GEC.


An ALB team can consist of up to eight Intel Server Adapters configured to work together. All adapters in a team must be connected to a switch. They can be connected to a single switch or to two or more switches, but all switches must be on the same network segment (they cannot be separated by a router).

Once ALB is configured, all outgoing server traffic will be balanced across the adapter team. A single adapter handles incoming traffic. In most environments, this is a highly effective solution since server traffic is primarily outbound—from the server to the clients.
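The asymmetry is easy to picture in Python; this is a toy model, and round-robin is used here only for simplicity, since actual ALB distribution policies vary.

from itertools import cycle

team = ["NIC1", "NIC2", "NIC3", "NIC4"]
primary = team[0]                    # all inbound traffic lands here

def outbound_schedule(adapters, packets):
    """Spread outbound packets across every adapter in the team."""
    rotation = cycle(adapters)
    return [(pkt, next(rotation)) for pkt in packets]

print(outbound_schedule(team, range(1, 7)))  # packets 1-6 over NIC1-NIC4
print("inbound on:", primary)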

Link Aggregation/FEC/GEC
Link Aggregation and Fast EtherChannel (FEC) are other technologies that can help to increase server bandwidth. Like ALB, they automatically balance server traffic among as many as eight network adapters and require no network reconfiguration. Unlike ALB, they enable full-duplex transmission on all adapters if the switch supports this advanced feature. Both incoming and outgoing server traffic are balanced and can be scaled in increments of 200 Mbps.

Fast Ethernet environments (see Figure 3 for an 800 Mbps example) can handle total throughput of up to 1600 Mbps. Link Aggregation also can aggregate traffic across multiple Gigabit server adapters for throughput up to 16 Gbps at full-duplex. Gigabit EtherChannel (GEC), another emerging technology, will provide similar, full-duplex load balancing if it is connected to supporting GEC switches.

Because of their ability to handle high-bandwidth, full-duplex traffic loads, these technologies are ideally suited to high-performance environments running demanding applications such as enterprise servers, Web servers, intranet servers, and high-end graphics imaging and rendering servers. In addition to scalable server bandwidth, these technologies provide reliable fault tolerance. If one link fails, the other adapters in the team automatically accept the full traffic load and generate an alert to notify IT staff of the problem.

Whereas ALB works when the Intel Server Adapters are connected to the network via any switch, these full-duplex technologies require that the adapters be connected to switches that support the specific scalable bandwidth technology that is configured in the adapter. Link Aggregation is supported by an increasing number of switch vendors. Fast EtherChannel works with any FEC-enabled switch, and Gigabit EtherChannel works with any GEC-enabled switch.

[Figure 2. Adaptive Load Balancing Assures Fast Throughput with Four Intel Server Adapters: up to four Intel Server Adapters connect the server through a switch to the hubs, providing 400 Mbps (Fast Ethernet adapters) or 4 Gbps (Gigabit adapters); one channel is two-way (receive and transmit). ALB requires less new hardware, no new IP address, and no configuration.]

[Figure 3. Link Aggregation, Fast EtherChannel, or Gigabit EtherChannel Network Traffic with Four Intel Server Adapters: up to four Intel Server Adapters connect the server through a switch that must support Link Aggregation, FEC, or GEC, providing 800 Mbps at full-duplex (Fast Ethernet adapters) or 8 Gbps at full-duplex (Gigabit adapters); all four channels transmit and receive.]



Improving Server Reliability
Server manufacturers have implemented a variety of mechanisms to improve the reliability of servers. However, a broken or loose network cable, a faulty switch or hub port, or a failed adapter can shut down server operation just as easily as a server malfunction.

Resilience and Online Serviceability
Adapter Fault Tolerance (AFT) provides a simple, effective, and fail-safe method for increasing the availability of server connections (see Figure 4). With two or more server adapters installed in a server, AFT can be configured to establish an automatic backup link between the server and the network. Should the primary link fail, the secondary link kicks in within seconds—transparent to both applications and users.

The redundant link that AFT establishes between the server and the network includes a redundant adapter, a cable, and a hub or switch port connection. If any problem occurs along the primary link, the secondary link immediately takes over. AFT also initiates a network alert. The server remains online so technicians can take corrective measures when appropriate—for example, during off-business hours.

AFT can be implemented in a server using only two server adapters: the first as the primary connection and the second as a backup. AFT is also supported when server adapter teams are configured for Adaptive Load Balancing, Link Aggregation, Fast EtherChannel, or Gigabit EtherChannel. In those cases, if any server links fail for any reason, the remaining links automatically take over to share the traffic load.

Mixed Speed and Preferred Primary Teaming
Unlike most redundant link technologies, AFT supports mixed-speed teaming using any combination of Intel Server Adapters. For example, a Gigabit server adapter may be used as the primary network link; the backup link could be another Gigabit server adapter or a Fast Ethernet server adapter. This capability enables a relatively inexpensive 100 Mbps backup link to be used to safeguard a high-speed Gigabit Ethernet connection. Although the 100 Mbps backup network connection may not be able to support the full traffic load as effectively, it can allow business-critical applications to stay online until the higher speed link is fixed.

When configuring AFT, a preferred primary adapter can be specified. If the primary link fails, it will automatically be reinstated as the primary link once it is fixed. For example, if a Gigabit server adapter is used for specialized, high-demand applications, a less expensive backup link can be installed using a Fast Ethernet server adapter. If the primary link fails and is then fixed, traffic will automatically revert back to the higher performance link.

[Figure 4. Adapter Fault-Tolerance Teaming. Scenario 1 (Full-Bandwidth Redundant Link): the same model adapter is used for the primary network link (100 or 1000 Mbps) and a redundant network link with the same performance as the primary. Scenario 2 (Lifeline Redundant Link, Mixed Adapter Teaming): a different model adapter provides a secondary network link (100 Mbps) backing a primary network link (100 or 1000 Mbps).]


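Preferred-primary failback can be sketched in Python as follows; the Adapter class and adapter names are assumptions for illustration only.

class Adapter:
    def __init__(self, name, speed_mbps, link_up=True):
        self.name, self.speed, self.link_up = name, speed_mbps, link_up

def active_adapter(preferred, backup):
    # The preferred primary carries traffic whenever its link is healthy.
    return preferred if preferred.link_up else backup

gig = Adapter("Gigabit primary", 1000)
fe = Adapter("Fast Ethernet lifeline", 100)

gig.link_up = False                   # primary link fails
print(active_adapter(gig, fe).name)   # traffic on the 100 Mbps lifeline

gig.link_up = True                    # link repaired
print(active_adapter(gig, fe).name)   # traffic reverts to the Gigabit link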

PCI HotPlug—Online Serviceability
PCI HotPlug enables a failed adapter to be replaced without taking the server off-line. This technology is now an industry standard supported in most new servers. When used together with AFT, PCI HotPlug allows an adapter to be replaced without interrupting network service. If an adapter fails, AFT automatically moves server traffic onto the redundant link and generates a network alert. PCI HotPlug enables IT staff to replace the failed adapter without bringing down the server.

Configuring Intel Server Adapters
A single driver provides the software agent that supports Adapter Fault Tolerance, Adaptive Load Balancing, Link Aggregation, Fast EtherChannel, Gigabit EtherChannel, and PCI HotPlug. How the agent is configured in a particular environment determines which advanced feature is enabled. However, all the scalable bandwidth technologies supported by Intel Server Adapters include built-in support for AFT and PCI HotPlug. So if ALB, Link Aggregation, FEC, or GEC is configured, AFT and PCI HotPlug are automatically activated.

All advanced server adapter features supported by Intel Server Adapters integrate seamlessly into Novell NetWare, Microsoft Windows NT 4.0, Microsoft Windows 2000, Linux®, and UnixWare® operating system-based servers. The advanced features are management-ready and simple to use, with intuitive interfaces for quick setup and ease-of-use. Standard OS interfaces are used for NetWare, Linux, and UnixWare; Windows NT and Windows 2000 use Intel PROSet, a Windows OS-based configuration utility from Intel.

Network alerts for failed links are OS-based for compatibility with management applications. Specifically, NetWare alerts are generated for NetWare servers and event logs for Windows NT servers. A management application, such as Intel LANDesk® Management Suite, can detect these alerts and trigger an appropriate action. For example, a network manager could choose to be notified of a failure via an e-mail message, a fax, or a call to a pager or cellular phone.

All Intel Fast Ethernet and Gigabit Ethernet server adapters can also be configured to work in servers equipped with the Intel 82558 or Intel 82559 Onboard LAN controller. This enables AFT, ALB, Link Aggregation, or FEC to be configured using fewer PCI slots by teaming the onboard LAN controller with add-in Intel Server Adapters.

Reliable, Scalable, and Easy to Configure
Scalable bandwidth technologies, along with Adapter Fault Tolerance and PCI HotPlug, make the Intel Server Adapter family an ideal solution for fast network connectivity with enhanced server availability. Businesses have the choice of configuring the adapter software to use the technologies best suited to the demands of their server environment and their existing infrastructure. Each technology builds on the preceding one, so nothing is lost as higher bandwidth load-balancing technologies are employed.

By providing scalable bandwidth and increased availability at a crucial point in the network, Intel Server Adapters can help to revive network infrastructures that are otherwise straining under increased traffic loads. They also enable a more affordable server solution for high-demand networks. By integrating high-availability server links and load balancing into the network adapter, they eliminate the need for specialized server hardware and other expensive infrastructure components.

Gary Gumanow ([email protected]) is a senior product line marketing engineer in the Platform Networking Group at Intel Corporation. He is responsible for product planning for Intel Gigabit Ethernet adapters and software. Gary has been designing networks for over 16 years and holds an MBA from Pace.

FOR MORE INFORMATION

• Intel Server Adapters: http://www.intel.com/network/products/server_adapters.htm

• Support for advanced server adapter technologies: http://www.intel.com/network/technologies/advanced_features.htm

• Related white papers: Layer 2 Network Prioritization: http://www.intel.com/network/white_papers/priority_packet.htm

• Building a Managed Computing Environment: http://www.intel.com/network/white_papers/managed_environment.htm


Integrating PowerEdge Clusters with Storage Area Networks

Many configuration options are available for Dell PowerEdge Clusters with PowerVault Fibre Channel storage area networks (SANs). Compared with traditional storage enclosures, these configurations offer better performance, higher availability, greater expandability, and enhanced configuration flexibility. This article explains the benefits of various configurations of PowerEdge Clusters with SANs.

By Mike Kosacek and Edward Yardumian

D A T A  C E N T E R  E N V I R O N M E N T

An important advance in data availability for open systems came with the introduction of the PowerVault 650F Fibre Channel enterprise-class storage system. The PowerVault 650F was introduced in January 1998 with Dell PowerEdge Clusters featuring Microsoft Cluster Server (MSCS). Subsequent releases of storage area network (SAN) technologies supporting the PowerVault 650F provide improved performance, higher availability, greater expandability, and enhanced configuration flexibility over traditional storage enclosures.

A PowerEdge Cluster can be configured in a number of ways using PowerVault Fibre Channel storage components. A clustered pair of PowerEdge servers can connect directly to the storage system, connect to the storage system via a PowerVault SAN, or share a storage system with other cluster servers through cluster consolidation.

PowerEdge Clusters and MSCS
A PowerEdge Cluster is an integrated system of components, including a pair of PowerEdge servers, a PowerVault storage system, a cluster interconnect, and Microsoft's failover cluster software. Microsoft Windows 2000 Cluster Service and Microsoft Cluster Server for Windows NT Server 4.0, Enterprise Edition, implement a two-node failover cluster that delivers high availability for applications and services. Optional features of a PowerEdge Cluster include advanced cluster management software, Dell OpenManage™ Cluster Assistant with ClusterX, installation services, consulting services, proof-of-concept testing, and a system availability guarantee program.

Dell has extensively tested and certified all supported PowerEdge Cluster configurations. All PowerEdge Clusters are cluster ready as shipped and require no additional third-party products or costly cluster-specific components.



In MSCS cluster configurations, both servers share access to a common, external storage system (which may contain several RAID arrays), but one cluster node (one of the servers in the cluster) or the other will own any RAID volume in the external storage system. The cluster service maintains control over which node has access.

Clients do not connect directly to either of the two cluster nodes, but to "virtual servers." Each virtual server, managed by MSCS, has its own IP address, server name, and disk drives (in the shared storage). Multiple virtual servers can be run on a cluster and on each node simultaneously.

If a cluster node requires maintenance or fails, any virtual servers running on it are moved to the other server in the cluster, to which clients reconnect once the services start. Cluster nodes constantly monitor each other over the cluster interconnect and automatically detect the failure of a node or application. If a server hangs or loses power, application services stop, or some other interruption in service occurs, the other cluster node will automatically take responsibility for the failed server's virtual servers.

Active/Active and Active/Passive Configurations
MSCS and all PowerEdge Clusters support both active/active and active/passive cluster configurations. The term active/active refers to a cluster with virtual servers running on each node. When an application is running on Node 1, Node 2 need not wait idly for Node 1 to fail. Node 2 can run its own cluster-aware applications (or another instance of the same application) while providing failover capabilities for resources on Node 1. An active/active cluster node must be sized appropriately to handle the load of both nodes (in the event of a failover).

The term active/passive refers to failover cluster configurations in which one cluster node is actively processing requests for a clustered application while the other cluster node simply waits for the active node to fail. An active/passive configuration is more costly in terms of price/performance because one server sits idle most of the time. It is appropriate for business-critical systems since the application can use the full power of another server in case of a failure.

Primary and Secondary Storage Components
PowerEdge Cluster F-Series configurations, including the FE100 and FL100, use PowerVault Fibre Channel storage components. These components include storage systems, such as the PowerVault 650F Fibre Channel disk processor enclosure, the PowerVault 120T tape autoloader, and the PowerVault 130T tape library, as well as SAN components, such as PowerVault 51F and 56F Fibre Channel switches.

PowerEdge Cluster FE100 or FL100 configurations use the PowerVault 65xF, a highly available and scalable external Fibre Channel storage system, as primary storage. The PowerVault 65xF features end-to-end, dual Fibre Channel loops, dual active RAID controllers (called storage processors), power-protected mirrored cache, and redundant fans and power supplies for high performance and availability. The PowerVault 650F can be mounted in an industry-standard 19-inch rack. The PowerVault 651F is available in a tower or desk-side package.

The PowerVault 65xF, with a capacity exceeding four terabytes, is suitable for running data-intensive applications and for storage and cluster consolidation for multiple servers using a common storage system. SAN configurations support a combined maximum of eight primary and four secondary storage ports.

Secondary storage components for PowerVault SANs include backup systems such as the PowerVault 120T tape autoloader and the PowerVault 130T DLT tape library. The PowerVault 130T supports up to four DLT 4000 or 7000 tape drives and 30 tape cartridges for a total capacity of up to 2.1 terabytes (at 2:1 compression). The PowerVault 120T and 130T can be integrated into the SAN using the PowerVault 35F bridge, which provides SCSI-to-Fibre Channel bridged connectivity. Each PowerVault 35F supports up to four PowerVault 120T tape autoloaders and two PowerVault 130T libraries.

Direct-Attach Configurations
A direct-attach PowerEdge Cluster configuration includes the two cluster server nodes and a single PowerVault 650F or 651F Fibre Channel storage system. In direct-attach configurations, the storage processors on the PowerVault 650F are connected by cables directly to each of the two Fibre Channel host bus adapters (HBAs) in the cluster nodes. A SAN is not required (see Figure 1). Each clustered server has a redundant, active path to the PowerVault storage system. Application-transparent failover (ATF) software running on each node monitors the paths to the storage system and can reroute traffic in the event of a failure in the HBA, cable system, or storage processors.

SAN-Attached Configurations
Dell PowerEdge Clusters support not only direct-attached cluster configurations, but also PowerVault SAN configurations using Fibre Channel switches. SAN-attached clusters are superior to direct-attached cluster configurations in configuration flexibility, expandability, and performance. SAN-attached PowerEdge Cluster FE100 and FL100 configurations require a redundant Fibre Channel switch fabric.

If greater storage capacity is desired for the cluster, a SAN-attached cluster configuration can connect to multiple PowerVault 65xF storage systems through the SAN. A cluster configuration with multiple storage systems (see Figure 2) provides greater storage capacity. For example, a cluster can support over eight terabytes of data using two PowerVault 65xF storage systems, and around 16 terabytes using four.

If greater performance is desired as disks are added to the system, the total capacity for a given cluster can be divided among multiple storage systems, each with fewer disks. Enhanced storage performance is realized through the additional active storage processors included in each new storage system.

SAN-Attached Tape Backup
Another advantage of SAN-attached configurations is the ability to perform tape backup operations over the SAN for a single cluster or multiple clusters. Backup over SAN is a high-performance method of backing up each of the servers or clusters attached to the SAN while conserving network bandwidth for client systems.

Dell PowerSuites tape backup software bundles include the necessary software components to enable each server or cluster node attached to the SAN to share a SAN-attached tape device. The connection to the tape backup device through the SAN is made through a Fibre Channel-to-SCSI bridge, such as the PowerVault 35F.

A direct-attach cluster configuration can be easily migrated to a SAN-attached configuration to implement backup over a SAN. To migrate a direct-attach configuration, add redundant, active Fibre Channel switches, as shown in Figure 3.

[Figure 1. Direct-Attach PowerEdge Cluster Configuration: Cluster 1 comprises Server 1 and Server 2, each with FC HBA A and FC HBA B connected directly (Fabric A and Fabric B) to Storage Processor A and Storage Processor B in a PowerVault Fibre Channel storage enclosure.]

[Figure 2. SAN-Attached Cluster with Multiple Storage Systems: the two cluster servers connect through redundant Fibre Channel switches (Fabric A and Fabric B) to the storage processors of multiple PowerVault Fibre Channel storage enclosures.]

[Figure 3. Cluster with Backup over SAN: the two cluster servers connect through redundant Fibre Channel switches to a PowerVault Fibre Channel storage enclosure and, via a Fibre Channel-to-SCSI bridge, to a tape backup changer or autoloader.]


Multiple Clusters on a SAN
Fibre Channel switch zoning allows multiple clusters and shared tape backup systems to be attached to the same SAN. Zoning is implemented in the Fibre Channel switch configuration using the Switch Manager graphical utility or through the integrated command-line interface. Switch zoning prevents systems that should not see each other from doing so, yet still allows all systems to share access to Fibre Channel switches and tape backup systems. Zoning must be used if different operating systems or clusters and stand-alone servers share the same switch fabric.

Figure 4 shows two clusters attached to a SAN. Each cluster has its own PowerVault 65xF Fibre Channel storage system, and each has access to a common tape library through a tape backup device attached to the SAN via an FC-to-SCSI bridge. Implementing this configuration requires two Fibre Channel zones: one zone for each pair of cluster nodes and their primary and secondary storage.
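The effect of zoning can be modeled in a few lines of Python; the zone and port names are illustrative assumptions, not switch syntax.

# Switch zoning: a port can discover another port only if the two
# share at least one zone, so each cluster sees its own storage and
# the shared tape bridge but not the other cluster's storage.
zones = {
    "zone_cluster1": {"c1_node1", "c1_node2", "storage1", "tape_bridge"},
    "zone_cluster2": {"c2_node1", "c2_node2", "storage2", "tape_bridge"},
}

def can_see(a, b):
    return any(a in members and b in members for members in zones.values())

print(can_see("c1_node1", "storage1"))     # True: same zone
print(can_see("c1_node1", "storage2"))     # False: zoned off
print(can_see("c2_node1", "tape_bridge"))  # True: shared backup device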

Cluster Consolidation Configurations
Cluster consolidation is an extension of a SAN-attached cluster configuration. Storage consolidation and cluster consolidation can both lower the total cost of ownership of a PowerVault SAN. For a SAN, storage consolidation is commonly defined as the ability to allow multiple servers to access a portion of the storage capacity offered by a single storage enclosure (see Figure 5).

Storage consolidation requires more than just a sophisticated wiring scheme. Because operating systems such as Windows assume ownership of all attached data volumes or logical unit numbers (LUNs), a method to prevent the systems from accessing volumes that do not belong to them must be implemented.

Dell's OpenManage Storage Consolidation software performs Fibre Channel LUN masking to hide data volumes and prevent specific servers attached to the SAN from accessing them. Fibre Channel HBAs located in each of the servers enforce the LUN masking. Dell OpenManage Storage Consolidation gives servers explicit ownership of a data volume located within the common enclosure. For example, if two servers are sharing a storage system that has two available LUNs (or volumes of data), Server 1 can be assigned LUN 0, and Server 2 can be assigned LUN 1. Dell OpenManage Storage Consolidation prevents each server from discovering or accessing storage owned by other nodes.
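The two-server example above can be sketched in Python; the table layout and names are assumptions for illustration, not the product's data structures.

# HBA-enforced LUN masking: each server is granted explicit ownership
# of a LUN, and discovery of any volume it does not own is filtered out.
ALL_LUNS = {0, 1}
lun_table = {"server1": {0}, "server2": {1}}  # LUN 0 -> Server 1, LUN 1 -> Server 2

def visible_luns(server):
    """Return only the LUNs this server is allowed to discover."""
    return ALL_LUNS & lun_table.get(server, set())

print(visible_luns("server1"))  # {0}
print(visible_luns("server2"))  # {1}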

Like SAN-attached configurations with multiple clusters attached to the SAN, cluster consolidation (see Figure 6) requires zone configuration in the Fibre Channel switches to prevent each cluster pair from detecting the presence of the HBAs in the other cluster nodes. Zone configuration also prevents one cluster node from negatively influencing the HBA of another node. With cluster consolidation, storage can be centrally located for easier administration, management, and backup. It also enables the cost of the PowerVault SAN and Fibre Channel storage system to be amortized across multiple systems.

[Figure 4. SAN with Multiple Clusters and Shared Tape Backup Systems: Cluster 1 and Cluster 2 each connect through redundant Fibre Channel switches (Fabric A and Fabric B) to their own Fibre Channel storage enclosures, and both share a tape backup changer or autoloader through a Fibre Channel-to-SCSI bridge.]

[Figure 5. Storage Consolidation for Four Servers: four nonclustered stand-alone servers, each with FC HBA A and FC HBA B, connect through redundant Fibre Channel switches (Fabric A and Fabric B) to a single Fibre Channel storage enclosure holding storage for Servers 1–4, with optional SAN-based tape backup through a Fibre Channel-to-SCSI bridge.]

[Figure 6. Cluster Consolidation: Cluster 1 and Cluster 2 connect through redundant Fibre Channel switches (Fabric A and Fabric B) to a single Fibre Channel storage enclosure holding storage for both clusters, with optional SAN-based tape backup through a Fibre Channel-to-SCSI bridge.]


The Appropriate Choice of Configuration
PowerEdge Cluster F-Series offers many configurations using PowerVault Fibre Channel storage and PowerVault SANs. A simple, yet highly available and solidly performing direct-attach configuration will suit many purposes.

If a direct-attach configuration is unsuitable, many SAN configurations are available. PowerEdge Cluster F-Series configurations can use a PowerVault SAN for additional capacity, higher disk subsystem performance, tape backup consolidation, or cluster consolidation. Cluster consolidation enables multiple clusters to leverage the Fibre Channel storage architecture and feature set.

The most desirable configuration depends on the needs of the application. For example, proper planning and sizing are vital to enable customers to benefit from cluster consolidation. Although the 65xF storage system can support up to 10 cluster pairs, that configuration may not be optimal for maximum performance. The right choice depends on understanding the workload each server will impose on the storage system. A heavily loaded active/active clustered database may require a dedicated storage system, whereas cluster nodes hosting file shares, printers, or Web sites may be able to share a storage system because their I/O requirements are less demanding.

With a SAN and cluster consolidation, server and storage components can be shared among multiple configurations, including clusters. These options help to protect the enterprise's investment in the devices.

A Cost-Effective Solution
Although the initial costs of building a SAN are higher than those of purchasing a direct-attached storage system, with appropriate sizing and planning, a PowerVault SAN can be more cost-effective, offer greater value to the business, and provide investment protection for the storage components. Appropriate PowerEdge Cluster and PowerVault SAN configurations are available for any application. Figure 7 illustrates a SAN with three possible cluster and SAN configurations: one-to-one, one-to-many, and many-to-one.

Mike Kosacek ([email protected]) is a senior member of the Cluster Development group at Dell Computer Corporation. As the lead engineer on several Dell products, Mike's responsibilities include developing, testing, and certifying cluster solutions for the Dell server and storage product lines. Mike has a degree in Electronics Technology and is a Microsoft Certified Systems Engineer (MCSE).

Edward Yardumian ([email protected]) is a technologist on the Internet Infrastructure Technologies team at Dell Computer Corporation. Previously he was a lead engineer for Dell PowerEdge Clusters.

[Figure 7. SAN with Three Possible Cluster and SAN Configurations: Cluster 1 attaches one-to-one to its own storage (SAN-Attach, One-to-One); Cluster 2 attaches to multiple storage systems (SAN-Attach, One-to-Many); Clusters 3 and 4 share a single storage system (Cluster Consolidation, Many-to-One). All cluster servers connect through Fabric A and Fabric B, with consolidated backup through Fibre Channel-to-SCSI bridges.]


NetWare Cluster Services: Deployment Considerations and Tuning

As of March 15, 2000, Dell Computer was the first and only company to complete a Novell Yes-certified 32-node solution with NetWare Cluster Services (NWCS).1 Dell and Novell jointly used their expertise to explore the upper limits of the solution, which allows Dell to better support small-to-medium deployments reliably. Most recently, Dell certified NWCS with Application Transparent Failover technology in a multinode cluster to produce a solution with no single point of failure. This article describes the process and additional tuning procedures Dell needed to support NWCS clustering at a higher level.

By Richard Lang

D A T A  C E N T E R  E N V I R O N M E N T

1 Refer to Novell Bulletin #58058.

The Dell PowerVault Fibre Channel family of products features Fibre Channel hard drives instead of SCSI hard drives. As shown in Figure 1, employing Fibre Channel technology throughout the Dell solution delivers a substantially higher bandwidth than hybrid solutions. In a hybrid solution using SCSI-based drives, data throughput can be limited to 40 Mb per second. Therefore, you may not benefit from the extra bandwidth a true Fibre Channel solution can provide.

Dell also uses a dual-ported Fibre Channel system, shown in Figure 2, that allows for dual data paths and additional fault tolerance.

ATF Eliminates Single Point of Failure
Dell and Novell jointly developed Application Transparent Failover (ATF) technology to work exclusively on Dell storage systems. Using ATF software, you can have two Fibre Channel Host Bus Adapters (HBAs) in the same system providing redundant data paths. ATF eliminates any single point of failure by protecting against a failed HBA, loss of a Fibre Channel switch, or any interruption in the fiber data path.

ATF also provides a greater level of fault tolerance with NWCS by providing higher availability at the hardware level. At the application level, NWCS provides higher availability for applications and file access, for a complete solution.

During testing, Dell copied a CD from a client to the shared storage. During the copying process, the data stream was interrupted by turning off a PowerVault 51F Fibre Channel switch. The server console displayed a message that the data path had been interrupted and the system had initiated a failover. The client continued to copy data without interruption and was not aware a failure had taken place.

To use ATF, you need to load SCSISAN.CDM. Use HADM.NLM to restore failed paths. See the sidebar in this article for installation instructions.

Understand the Cluster Architecture First
Before exploring specific tuning parameters in an NWCS cluster, you must understand the components of the cluster. You should also consider a few other factors when planning an NWCS deployment.




Typical cluster configurations include a shared disk subsystem connected to all servers in the cluster. The shared disk subsystem can be connected via high-speed Fibre Channel cards, cables, and switches, or it can be configured to use shared SCSI. If a server fails, another designated server in the cluster automatically mounts the shared disk volumes previously mounted on the failed server. This failover gives network users uninterrupted access to the volumes on the shared disk subsystem.

Typical resources might include data (volumes), applications, server licenses, and services. Figure 3 shows a typical Dell Fibre Channel cluster configuration.

Heartbeats Communicate Cluster Health
A node participates in a distributed failure-detection algorithm when it joins the cluster. Node failure is detected by external monitoring of continuous heartbeats, which are small IP packets. Each node periodically transmits a heartbeat and monitors the heartbeat from other nodes. If a heartbeat is not observed after a predetermined interval, the failure detection algorithm is executed. Since a master node will always exist, all nodes do not have to monitor all other nodes.

Local area network (LAN) heartbeat packets are transmitted at the rate of one per second. Disk heartbeat I/Os occur at the rate of half the default threshold value of eight seconds; that is, one disk heartbeat every four seconds.

In Novell ConsoleOne™, you can edit the quorum membership and timeout properties. First, right-click the cluster object and select Properties. On the cluster object properties page, select the Quorum tab. Membership determines the number of nodes that must be running before resources start to load. Timeout specifies the amount of time to wait for the servers defined in the Membership field. Heartbeat specifies the amount of time between transmits for all nodes except the master. The time is represented in seconds. Tolerance determines the amount of time a node is given to return a heartbeat before it is removed from the cluster. For further details on the variables and settings see www.novell.com/documentation/lg/ncs/pdfdoc/orionenu1.pdf.
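The interplay of the heartbeat rate and the tolerance window can be sketched in Python; this is a simplified model using the one-second heartbeat and eight-second default threshold described above, not the NWCS implementation.

import time

HEARTBEAT_INTERVAL = 1.0   # seconds between LAN heartbeats
TOLERANCE = 8.0            # default threshold before a node is removed

last_seen = {"node1": time.time(), "node2": time.time()}

def record_heartbeat(node):
    last_seen[node] = time.time()

def failed_nodes(now=None):
    """Nodes whose last heartbeat is older than the tolerance window."""
    now = now or time.time()
    return [n for n, t in last_seen.items() if now - t > TOLERANCE]

record_heartbeat("node1")
print(failed_nodes())                       # [] while heartbeats keep arriving
print(failed_nodes(now=time.time() + 10))   # both nodes flagged after silence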

It is not necessary or recommended to isolate the heartbeat on a separate dedicated LAN. Since the heartbeat consists of small IP packets, bandwidth is not a concern. If you did isolate the heartbeat on a separate LAN and the primary LAN failed—affecting a large number of users—the heartbeat would remain alive. In this case, resources may not fail over.

Tuning the Cluster
Dell worked with Novell to establish the following tuning parameters using a load of 1,500 clients in a non-production environment. Dell used four gigabit segments for increased bandwidth. During testing, Web applications, Novell GroupWise® mail, and file and print operations were randomly exercised through a script to vary the load and measure the performance changes during different types of loads. Dell used four PowerEdge 2450 2U servers in the cluster and a PowerVault 650F true fiber-to-fiber storage subsystem for the shared storage.

Tuning Novell Storage Services Volumes
For an explanation of available commands in Novell Storage Services (NSS), from your console type "NSS /?" or "NSS Help".

NSS must be loaded with /AutoDeactivateVolume=ALL. (This command should be set by the NWCS install.)

[Figure 1. Fibre Channel System versus Hybrid System: in a Fibre Channel system, the host server's HBA connects at 100 Mb/sec through a Fibre Channel switch to Fibre disk drives at 100 Mb/sec; in a hybrid system, the HBA connects at 100 Mb/sec through a Fibre Channel hub, but the SCSI disk drives limit throughput to 40 Mb/sec.]

[Figure 2. Dual-Ported Fibre Channel System: two HBAs in the host server provide dual 100 Mb/sec paths through a Fibre Channel switch to the disk drives.]


For previously installed Dell NetWare SAN environments, remove the Mount All command. You may see the NSS load command as follows:

Load NSS ZLSS /AutoDeactivateVolume=All

ZLSS is specific to Dell and not part of a standard NetWare install. ZLSS prevents the NSS manager from mounting all NSS volumes attached to the SAN but is not required once Novell Cluster Services has been installed. The cluster manager will not allow any node to mount a volume that is not assigned to that node. This is part of the share-nothing architecture of NWCS.

Three NSS parameters should be adjusted for best performance. (Note: Dell and Novell deemed the suggested settings optimal in a 1,200-user environment.)

• CacheBalance: Default = 10 (range = 1 to 99)
  —Sets what percentage of free memory NSS can use for its buffer cache
  —CacheBalance should be set to 80 percent

• ClosedFileCacheSize: Default = 512 (range = 1 to 100,000)
  —Determines the number of files that can be cached in memory
  —ClosedFileCacheSize should be set to 100,000

• Mailboxsize: Default = 128 (range = 64 to 20,000) (Note: NSS documentation will state the range as 64 to 256)
  —Determines the NSS workspace
  —A setting of 16,000 is suggested for an environment of 1,000 to 1,200 users

Load NSS as follows:

NSS /AutoDeactivateVolume=ALL /mailboxsize=16000 /CacheBalance=80 /ClosedFileCacheSize=100000

See Novell documentation for additional tuning parameters for specific NetWare applications.

Mirror NSS for Additional Redundancy
For additional data redundancy in business-critical applications, you can mirror your NSS volumes. You must configure NSS mirroring before creating or cluster enabling any NSS volumes.

Minimum requirements for mirroring NSS include:

• NetWare 5.1 Support Pack 1 on each server
• All servers connected to a shared storage system
• 10 MB free for cluster communication
• Each server must be assigned a unique ID (Example: mm assign server id = clusternode1)

Tuning GroupWise to Improve Performance
The parameters in Figure 4 show optimal tuning for supporting 1,200 to 1,500 users in a GroupWise application running in a cluster. Whenever NetWare Administrator (NWAdmin) is used to modify GroupWise parameters, it updates the po1.poa file, which is stored on the SYS volume in the system folder.

The sample of the po1.poa text file in Figure 4 shows modifications to improve performance. These settings were used to provide optimal performance and increased throughput.

Troubleshooting Issues
If kernel debugging is necessary, use the console CLUSTER DEBUG command to halt all nodes. Portal also can be used to provide a nonintrusive debugging tool. It is best to avoid using the server's floppy drive because it can execute in real mode longer than the threshold period.


[Figure 3. A Typical Dell Fibre Channel Cluster Configuration: six servers (Server 1 through Server 6), each with a SYS volume, connect through network interface cards to a network hub and through Fibre Channel cards to a Fibre Channel switch and a shared disk system.]


Fatal Storage Area Network Errors
Hardware problems, such as a bad Fibre Channel cable, GBIC/FC port failure, HBA failure, or switch failure, can cause fatal storage area network (SAN) errors. The Dell OpenManage Data Supervisor can help to diagnose these types of problems. Using the Data Supervisor, you can also export the log from the storage processor to review for errors (Dell tech support can assist in reviewing the log to determine failures).

Split-Brain Conditions
Split-brain conditions can occur as a result of LAN hardware or software problems. If you detect a split-brain condition, check the following:

1. Check LAN driver and protocol stack statistics. A bad network interface card (NIC) could be intermittently dropping packets.
2. Check for a bad NIC or cable.
3. Check to see if the LAN hub or switch power was reset.
4. Check LAN switch for settings that could delay packets (such as spanning tree).
5. Check for proper server configuration.
6. Check packet receive buffers and service processes for evidence of exhaustion.

In some cases, like the Intel NIC, you should set the speed to auto detect. Dell found that a NIC set to 100 Mbps will cause a LAN switch to see this as a mismatch and then connect at half-duplex. This can lower performance and possibly cause a false split-brain condition. If both the Intel NIC and the LAN switch are set to auto detect, they will negotiate at 100 Mbps full-duplex. The difference between half- and full-duplex can double the performance.

False Split-Brain Detection
If LAN heartbeat packets are lost for periods of time longer than the default threshold but the LAN hardware is functioning correctly, a false split-brain condition can occur. In this case, the LAN has not failed, but the nodes can lose contact with each other as they would in a failure. False split-brain conditions can occur from improper tuning or configuration.

Possible causes of the problem include:

• Insufficient number of service processes
• Insufficient number of packet receive buffers
• LAN switch that delays or drops broadcast packets
• Extremely high LAN utilization/packet storms
• Unstable LAN driver
• CPU utilization running very high
• Non-optimal LAN driver load order
• Non-optimal placement of the NIC on the PCI bus
• NIC that can transmit but not receive packets

False Node Failure Detection
Try increasing the cluster tolerance and slave watchdog parameters, which are normally equal. Configure tolerance and slave watchdog parameters to a value larger than the time the server may be staying in real mode or longer than the server's CPU hog value.

Set the CPU HOG TIMEOUT server parameter to a value equal to the heartbeat threshold parameter (default is eight seconds). This will help identify NetWare Loadable Module® (NLM) or driver problems.
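For example, with the default eight-second heartbeat threshold, the console command would take a form like the following (an assumed illustration of standard NetWare SET syntax; verify the exact parameter name against your server's documentation):

SET CPU HOG TIMEOUT AMOUNT = 8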

The following is an example of conditions caused by false node failure detection:

Ate Poison Pill in SbdProposeView given by some other node.

Ate Poison Pill in SbdWriteNodeTick given by some other node.

This example is a special-case two-node split-brain condition. The node that eats this poison pill has lost the tiebreaker because its LAN link was determined to have failed. The other node wins. This logic is available only with the NWCS 1.01 two-node tiebreaker patch described in Novell Technical Information Document 2956097. Without this patch, the master node will always win whether or not it has good LAN connectivity.


PO1.POA FILE

;———————————————————————————————————
; Number of Processing Threads
; Sets how many threads the POA spawns for handling
; Message Files. The default is 5
;———————————————————————————————————
/threads-12

;———————————————————————————————————
; Maximum Physical Connections
; Number of physical Client/Server connections the server
; will allow. The default is 512
;———————————————————————————————————
/maxphysconns-1500

;———————————————————————————————————
; Number of TCP Processing Threads
; Sets how many threads the POA spawns for handling
; Client\Server requests. The default is 6
;———————————————————————————————————
/tcpthreads-10

Figure 4. Po1.poa File


Monitoring Link Support Layer Statistics
If failover is a result of the heartbeat timing out, you may need to diagnose further to determine the root cause. If clustering video servers, you may need to simply adjust the timing or, in some cases, add a dedicated LAN segment for the heartbeat. If the packet receive buffer is exceeded, the heartbeat may not be able to indicate that a node is functioning normally, which would cause the resource to fail to another node in the cluster. A bad NIC could also be the cause.

NetWare 5.1 has a diagnostic command called LSLSTATS. The LSLSTATS console command will open an LSL Statistics Monitor screen that details packet receive buffer utilization. This can help to diagnose false split-brain conditions caused by receive buffer exhaustion. If a server has an abend, the following kernel debugger commands will also provide LSL packet receive buffer statistics:

• \h—Provides Link Support Layer (LSL) help
• \e—Lists Event Control Block (ECB) (receive buffer) information
• \u—Lists ECBs (receive buffers) in use outside the LSL

Richard Lang ([email protected]) is the technical marketing manager responsible for the technical relationship of the Novell Alliances at Dell. Richard has a B.S. in Management Information Systems from Kennedy Western University, an Applied Science Degree in Electrical Engineering from the DeVry Institute of Technology, and is a Microsoft Certified Systems Engineer (MCSE).


INSTALLATION OF APPLICATION TRANSPARENT FAILOVER

The following steps install ATF in both client and server computers.

Client Installation—Windows NT/Windows 2000

1. Insert CD and allow it to autorun

2. Select Dell OpenManage Application Transparent Failover

3. Select Install ATF for NetWare

4. Follow the prompts and reboot

Before proceeding to the server install, you must create driver diskettes using the following steps:

1. Insert CD and allow it to autorun

2. Select Dell OpenManage Application Transparent Failover

3. Select Install ATF for NetWare

4. Select Create SCSISAN High Availability Driver for NetWare

Server Installation (SCSISAN.CDM)

1. For NetWare 4.2, type load install at the server console prompt

2. For NetWare 5.x, type load nwconfig at the server console prompt

3. Choose driver options (load/unload disk and network drivers)

4. Choose Configure disk and storage device drivers

5. Choose Select an additional driver (Ignore parsing errors because these errors are normal)

6. Install SCSISAN.CDM from the driver disk you created

Restoring LUNs with HADM.NLM

1. Copy the HADM.NLM file to the SYS volume in a directory that you create on the NetWare server

2. At the server console, type SYS:\(directory name)\HADM list

3. Identify the SSN number of the failed PowerVault 65xF system (see Figure 5)

4. Enter command to restore LUNs: Sys:\(directory name)\hadm restore 1 (1 is the SSN number from step 3) (see Figure 6)

Figure 5. Sample Console Screen for Identifying SSN Number of Failed PowerVault 65xF System

Storage Devices with dual SPs
--------------------------------------------------
ssn  SpSignature  Peer SpSignature  Node Name
1    e3672a31     72393031          [V596-A4-D0:0]
<Press any key to close screen>

Figure 6. Sample Console Screen after Entering Command to Restore LUNs

Started restore for LUNs on storage system 1
Completed restore for LUNs on storage system 1
<Press any key to close screen>

INSTALLATION OF AUTOMATIC TRANSITION FAILOVER



ENTERPRISE MANAGEMENT

Migrating to Windows 2000—Automatically

This article describes how organizations can leverage proven solutions from Dell OpenManage to automate their migration to Windows 2000—allowing them to get it done faster and more cost-effectively while minimizing demands on valuable IT personnel.

By Phil Neray

Many organizations are currently implementing or planning enterprise-wide migrations to Windows 2000 Professional to take advantage of its enhanced reliability and performance, ease of use, new security features, and unparalleled support for mobile users.

IT departments involved in such strategic initiatives need a simple and cost-effective method to implement this migration. Dell OpenManage solutions, such as ON Command CCM™ from ON Technology, help solve the short-term migration problem—while providing a software delivery infrastructure that can be leveraged for future delivery of any software throughout their networks.

ON Command CCM Delivers Software Across Networks

ON Command CCM is an enterprise-class system for remotely delivering applications and operating systems across corporate networks to desktops, mobile PCs, handhelds, and servers (see Figure 1). As part of the Dell OpenManage program, ON Command CCM is now available pre-installed from Dell and can be purchased via Dell's enterprise sales organization.¹ It is a proven solution that is currently being used to deliver software to over 500,000 PCs worldwide in a range of environments from workgroup to enterprise-wide.

The system uses "scheduled push" technology to deliver software to multiple PCs simultaneously, from centralized Windows NT or Windows 2000 servers. IT administrators can remotely "wake up" PCs, schedule deployments for off-hours, and tightly control bandwidth usage to minimize impact on network resources and end-user productivity.

Figure 1. The Corporate IT Delivery Model (diagram: a master CCM console, server, and software depot on the corporate LAN; remote CCM servers and depots at branch offices; mobile access via dial-up or Internet VPN)

¹ For more information on this new Image Management initiative from Dell, see www.dell.com/imagemanagement. A free trial download version is also available from www.on.com/wedeliver.



The administrative user interface is a snap-in to the Microsoft Management Console (MMC), as shown in Figure 2, providing a standard and consistent interface for all Windows 2000 management functions. The system is based on an open and scalable client/server architecture, as shown in the sidebar. ON Command CCM also leverages group information contained in either Windows NT Domains or Active Directory. (Note: It does not require implementation of Active Directory.)

Automating Windows 2000 Migration—For New Dell PCs

Imagine that you have just ordered several thousand new PCs from Dell, which are scheduled to arrive with Windows 2000 pre-installed on the hard drives. Your challenge now becomes: how do I install all of my corporate applications and configure them for my particular environment, without physically visiting each PC?

First, you can have the lightweight ON Agent pre-installed on your PCs along with the system image before they are shipped from the factory. An alternative is to automatically install the agent from a simple login script when the PCs arrive at your site.

When the PCs are plugged into the network and boot up, the agent automatically finds its ON Command CCM server via multicast and identifies itself to the server. The server then automatically incorporates each PC into its Information Database, using the Media Access Control (MAC) address as a unique identifier.
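The discovery handshake is conceptually simple. The following Python sketch illustrates the idea of an agent announcing its MAC address over UDP multicast. The multicast group, port, and message format shown here are invented for illustration; ON Technology's actual agent protocol is proprietary.

# Conceptual sketch only: the real ON Agent protocol is proprietary.
# The multicast address, port, and message format are invented here.
import json
import socket
import struct
import uuid

MCAST_GROUP = "239.0.0.1"   # hypothetical multicast address
MCAST_PORT = 5000           # hypothetical port

def announce():
    """Broadcast this PC's MAC address so a management server can register it."""
    mac = uuid.getnode()  # host MAC address as a 48-bit integer
    message = json.dumps({
        "mac": ":".join(f"{(mac >> s) & 0xFF:02x}" for s in range(40, -8, -8)),
        "hostname": socket.gethostname(),
    }).encode()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", 1))
    sock.sendto(message, (MCAST_GROUP, MCAST_PORT))
    sock.close()

announce()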

Using the ON Command CCM console, you can now assign applications and configuration settings to each PC on a per-PC or group-wide basis (see Figure 3). For example, a group of core applications (such as Office 2000, Lotus Notes®, Novell Client, or Adobe® Acrobat®) might be assigned to all PCs, and a group of sales applications (such as Oracle, SAP, Siebel®, or Citrix thin-client) to only the sales PCs.

You can also configure PCs remotely with department- or company-specific settings such as default printers, preferred NetWare servers, browser home pages, or screen savers, via drop-down menus from the administrative console—no need to edit complex scripts.

The ON Command CCM server pushes software and configuration actions to each PC for unattended installation by the ON Desktop Agent. The server also maintains a detailed history of all transactions on a per-PC basis in its Information Database. Administrators can extract data from this database in standard SQL format in order to create queries and reports and prepare for future deployments (for example, by creating target distribution groups and checking for prerequisite software). A single click in the administrative console can also rebuild PCs to a predefined state if a problem occurs, such as end-user misconfiguration or virus corruption.
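Because the transaction history is exposed through standard SQL, reporting can be scripted with any ODBC-capable tool. The following Python sketch shows the general idea; the DSN, table, and column names are hypothetical stand-ins for whatever schema the Information Database actually uses.

# Sketch of querying the Information Database over ODBC to build a target
# group of PCs that already have a prerequisite package installed.
# The DSN, table, and column names below are hypothetical.
import pyodbc

conn = pyodbc.connect("DSN=CCMInfoDB;UID=report;PWD=secret")
cursor = conn.cursor()
cursor.execute(
    "SELECT mac_address, hostname FROM transactions "
    "WHERE package = ? AND status = 'installed'",
    "Office 2000",
)
for mac_address, hostname in cursor.fetchall():
    print(f"{hostname} ({mac_address}) is ready for the next deployment")
conn.close()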

Automating Windows 2000 Migrations—For Existing PCs

So now that you've taken care of your new PCs, how do you migrate all your existing PCs to Windows 2000? The process is as simple as assigning the new operating system with applications to all target PCs from the administrative console.

By remotely waking up machines and controlling them immediately after network boot, ON Command CCM makes it possible to build machines over the network rather than requiring a visit to each computer. (See the sidebar for a discussion of how ON Command CCM leverages built-in manageability hardware contained in the BIOS of Dell PCs to accomplish this.)

Figure 2. ON Command CCM Administrative Console (MMC snap-in)

Figure 3. Assigning Applications via Point-and-Click Menus




ON Command CCM performs remote unattended installs of Windows 2000 with complete control over how the operating system gets installed—for example, Dynamic Host Configuration Protocol (DHCP) and Domain Name Server (DNS) settings, system IDs, and time zones—again without editing complex scripts (see Figure 4).
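ON Command CCM supplies these values through its menus, but for reference, a hand-written Windows 2000 answer file covering the same kinds of settings might look like the fragment below. The section and key names are standard answer-file entries; the values are examples only.

[Unattended]
    UnattendMode = FullUnattended
    TargetPath = \WINNT

[GuiUnattended]
    TimeZone = 035            ; 035 is commonly listed as Eastern Time (US and Canada)
    AdminPassword = "change_me"

[UserData]
    FullName = "Lab User"
    OrgName = "Example Corp"
    ComputerName = SALES-PC-042

[Identification]
    JoinDomain = CORPDOM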

Once the OS has been installed and configured, you can use ON Command CCM to remotely install and configure applications as described in the previous section, thereby completing the migration to Windows 2000.

Mobile and Remote PCs

Most IT organizations support users in remote locations and mobile users who typically access corporate LANs from low-speed dial-up connections. ON Command CCM provides several options for addressing these bandwidth-limited situations. For example, users can receive unattended application installs from a CD inserted into the computer or from a local file server—all under the control of a centralized ON Command CCM server.

ImageCast IC3 from StorageSoft

Dell's image management offering also includes a robust disk imaging ("cloning") solution from StorageSoft called ImageCast IC3 that can be used to build new or existing PCs with Windows 2000. The system image may include a set of base applications and the ON Agent. ON Command CCM can then be used to configure the PC and deploy additional applications.

Migrating User Settings with Desktop DNA

Miramar's Desktop DNA™ software utility can migrate a PC "personality," such as user preferences and user data files, from one PC to another. ON Command CCM can automate this process for multiple PCs simultaneously by installing and running Desktop DNA from the centralized ON Command console—first capturing the settings on existing Windows 95 PCs, then migrating the settings to new Windows 2000 PCs.

ON COMMAND CCM ARCHITECTURE

ON Command CCM's open and scalable client/server architecture operates at three levels (see Figure 5):

Administrative layer. The administrative console is a snap-in for the Microsoft Management Console (MMC). The server can also be administered from third-party consoles and customer-written programs created with the Command Line Interface (CLI).

Server layer. The ON Command CCM Server includes an Information Database for tracking all installation and configuration actions on managed PCs; a Software Depot with all operating systems and applications delivered to managed PCs; and Support Services for remotely administering PCs over the network.

Client PC layer. The ON Command CCM Pre-OS Agent is used for pre-boot operations such as installing and configuring operating systems; the Desktop Agent is used for installing and configuring applications. The PXE Boot Agent resides in system BIOS and allows PCs to boot over the network to ON Command CCM servers, even in the absence of a working operating system.

Figure 4. Installing and Configuring Windows 2000 from the Administrative Console

Figure 5. ON Command CCM Architecture (diagram: at the administrative layer, the MMC snap-in console, Tivoli® or CA consoles, and customer-written programs reach the server through the CLI and API; the server layer comprises the ON Command CCM Server with its Information Database, Software Depot, and Support Services; the client PC layer's Pre-OS Agent, Desktop Agent, and PXE Boot Agent are managed over the DHCP, IP, and Remote Wake-Up (RWU) protocols)



Building for the Future

Technology is constantly changing, creating an ongoing need to deliver new software—such as service packs, updated drivers and virus signatures, and new versions of applications—to all of your users.

How do you "get ahead of the curve"? By anticipating change and planning for it. Once you have implemented ON Command CCM to automate the migration to Windows 2000, you can leverage the same infrastructure to instantly push all updates to users with the simple click of a mouse—over the network. The end result? You have successfully leveraged technology to make your organization more competitive—your end users are happier and more productive, your valuable IT staff is free to focus on next-generation strategic initiatives, and you have dramatically lowered the cost of managing your network.

Phil Neray ([email protected]) is vice president of marketing for ON Technology Corp. He has over 20 years of experience working with leading-edge hardware and software companies in both UNIX® and Microsoft environments. Phil has a B.S. in Electrical Engineering from McGill University with additional graduate-level studies in Economics and Accounting from the MBA program at Boston University.


LEVERAGING THE BUILT-IN MANAGEABILITY OF DELL HARDWARE

Dell's commercial PCs incorporate advanced manageability features based on the Intel Wired for Management (WfM) specification. ON Command CCM supports WfM features such as:

• Pre-boot Execution Environment (PXE), a network booting mechanism that allows the ON Command CCM server to take control of the PC even in the absence of a working operating system. PXE, based on the standard Dynamic Host Configuration Protocol (DHCP), permits the ON Command CCM server to download and install new operating systems, format and partition hard drives, and flash system BIOS.

• Remote Wake-Up (RWU), which allows servers to remotely wake up PCs by sending a specialized network packet over local area networks (LANs). A sketch of the packet format follows this sidebar.

• Desktop Management Interface (DMI) 2.0, which provides a real-time view of installed hardware and software—such as processor type, memory, and installed operating system—from a remote console, using the Dell OpenManage client.

• Boot Integrity Services (BIS), part of the WfM Version 2.0 specification and planned for future implementation on Dell PCs. BIS is designed for security-conscious environments and provides authentication and encryption of PXE-based management software.
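The "specialized network packet" that RWU-capable NICs listen for is widely known as a Wake-on-LAN magic packet: 6 bytes of 0xFF followed by the target MAC address repeated 16 times, typically sent as a UDP broadcast. Here is a minimal Python sketch; ON Command CCM's own implementation may differ in transport details.

# Build and send a standard Wake-on-LAN "magic packet":
# 6 bytes of 0xFF, then the target MAC repeated 16 times,
# sent as a UDP broadcast (port 9 is conventional).
import socket

def wake(mac: str) -> None:
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 48-bit MAC address")
    packet = b"\xff" * 6 + mac_bytes * 16
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(packet, ("255.255.255.255", 9))
    sock.close()

wake("00:06:5b:aa:bb:cc")  # example MAC address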

WHY WINDOWS 2000 PROFESSIONAL

Windows 2000 Professional is designed to be the high-performance and secure network client operating system for businesses of all sizes. It combines the best business features of Windows 98—plug and play, ease of use, and power management—with the security, manageability, and reliability strengths of Windows NT.

Windows 2000 Professional provides numerous enhancements compared to its 32-bit predecessors, including:

• 25 percent better performance and faster multitasking compared to Windows 9x (on systems with 64 MB or more of memory)
• The Encrypting File System (EFS), which allows users to encrypt individual files, folders, or their entire hard drives, to restrict access to sensitive information
• The Windows Installer service, which works with IntelliMirror, Active Directory, and suitably configured applications to create self-healing applications
• Integrated Web features, such as Microsoft Internet Explorer 5 and Automated Proxy, which reduces Help desk calls by automatically configuring users' Web browsers to connect to a proxy server

Mobile and laptop users also benefit from enhancements specifically designed for road warriors, including:

• Off-line files and folders that automatically synchronize with network folders whenever you connect to the corporate network, allowing you to work on the latest copies when you are off-line (on an airplane, for example)
• Hibernate mode, in which you can turn off your computer and your programs and settings restore exactly as you left them
• Built-in support for virtual private networking (VPN), providing secure access to corporate intranets and extranets via the public Internet

For more detailed information about Windows 2000, see the March issue of Dell Power Solutions or visit www.dell.com/powersolutions.



INTERNET ENVIRONMENT

Next-Generation Internet: Microsoft .NET Enterprise Servers and Windows 2000

Microsoft .NET Enterprise Servers are a comprehensive family of server applications for building, deploying, and managing scalable, integrated Web solutions with fast time-to-market. Using .NET Enterprise Servers, companies can cost-effectively and efficiently integrate Web solutions such as e-commerce, supply chain management, purchasing, business-to-business, and enterprise resource planning (ERP) into their enterprises.

By Darcy Gibbons Burner

Microsoft .NET Enterprise Servers help organizations to integrate, manage, and Web-enable the enterprise—fast. This is the first product delivered from Microsoft within the .NET vision, which will enable highly interconnected distributed computing over the Internet.

The Microsoft platform can help customers get their enterprise Web solutions to market very quickly. The .NET Enterprise Servers family of products has the key elements for success in the time-to-market equation.

The .NET Enterprise Servers

The .NET Enterprise Servers are a comprehensive family of server applications for quickly building and managing an integrated, Web-enabled enterprise. Microsoft built the applications from the ground up for interoperability using today's Web standards, with built-in support for XML to attain the highest levels of integration and interoperability. .NET Enterprise Servers are production-ready, out-of-the-box applications. The core .NET Enterprise Servers include:

• Application Center 2000
• BizTalk™ Server 2000
• SQL Server 2000
• Commerce Server 2000
• Exchange 2000 Server
• Host Integration Server 2000
• Internet Security and Acceleration Server 2000
• Mobile Information 2001 Server

Scaling Up and Scaling Out

The Windows 2000 Server family and the .NET Enterprise Servers support scalability by adding software scaling—scaling out—to the traditional hardware scaling—scaling up—paradigm. Scaling out uses software to balance loads across multiple machines in server clusters for each tier of an application.

• Network Load Balancing (NLB): Provides load-balanced access to the application by grouping Web servers. Windows 2000 Advanced Server and Windows 2000 Datacenter Server scale out by distributing incoming Internet Protocol (IP) traffic across a farm of load-balanced servers, with capacity expanded incrementally by adding servers to the farm using NLB.




• Component Load Balancing (CLB): Dynamically load-balances the business logic across a cluster of application servers. Application Center 2000 supplements the NLB support in Windows 2000 by supporting CLB and centralized management for both kinds of scaling out.

• Microsoft Clustering Services (MSCS): Eliminates a single point of failure by creating a single logical database from two distinct servers, thereby providing hardware redundancy and failover capability. SQL Server 2000 completes the scenario through MSCS, which enables the creation of a single database across multiple servers.

These technologies for scaling out are combined with Windows 2000 Datacenter Server's support for up to 32 processors and 64 GB of memory, offering a complete scaling solution for the enterprise.

Database Scalability Enhances Applications

The performance of SQL Server 2000¹ has demonstrated its scalability in several independent tests:

• JD Edwards® OneWorld® Concurrent Users' Benchmark
• SAP R/3 SD Benchmarks
• SAP R/3 Retail Benchmark
• PeopleSoft® HRMS Benchmark
• PeopleSoft Financials Benchmark
• TPC-C record for a single symmetric multiprocessor (SMP) server
• All of the top 45 positions in the TPC-C price performance category

SQL Server 2000 supports distributed partitioned views, which provide e-commerce customers with unlimited scalability by dividing the workload across multiple independent SQL Server-based servers. It also supports materialized views in the SQL Server 2000 relational engine to improve the performance of complex queries in very large database environments.
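In SQL Server 2000, a distributed partitioned view is defined as a UNION ALL view over member tables whose CHECK constraints tell the optimizer which server owns each key range. The Python sketch below is only a conceptual stand-in for that routing decision; the server names and key ranges are invented.

# Conceptual illustration of range partitioning across member servers.
# SQL Server 2000 performs this routing transparently inside the view;
# the table below only models the idea.
PARTITIONS = [
    (1, 1_000_000, "sqlnode1"),          # customer_id range -> member server
    (1_000_001, 2_000_000, "sqlnode2"),
    (2_000_001, 3_000_000, "sqlnode3"),
]

def member_server(customer_id: int) -> str:
    for low, high, server in PARTITIONS:
        if low <= customer_id <= high:
            return server
    raise KeyError(f"no partition owns customer {customer_id}")

print(member_server(1_500_000))  # -> sqlnode2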

Availability and Reliability

The key to Web site availability is a successful architecture that keeps the entire application running if a single system fails. Optimum availability has two components: reliable individual machines that can recover from failures, and the elimination of any single point of failure. .NET Enterprise Servers that run on Windows 2000 Server address both issues.

The Windows 2000 enhancements—such as auto-restart of system services, Web-Based Enterprise Management (WBEM), and system file protection—emphasize reliability on a single machine. Additionally, the same clustering services that enhance scalability also increase availability of the entire system by eliminating any single point of failure.

Queuing and Events Support Enhances Availability

.NET Enterprise Servers support queuing technologies and loosely coupled events. COM+ Queued Components provide an easy way to invoke and execute components asynchronously. The availability or accessibility of the sender or receiver does not affect processing. A home shopping network, for example, might benefit from asynchronous processing; that is, viewers phone in to several operators, orders are taken en masse, and they are then queued for later retrieval and processing by the server.

COM+ Events is a loosely coupled events system that stores event information from different publishers within an event store in the COM+ catalog. Subscribers query this store and select the events for which they want information, which results in a subscription.

When an event occurs, the event system checks this database to find the interested subscribers, creates a new object for each interested class, and calls a method on that object. The COM+ Events design avoids the disadvantages of tightly coupled events (TCE), such as polling, the need for both publisher and subscriber to be running at all times, divergent callback mechanisms, and the inability to filter or intercept an event.
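COM+ itself is reached through COM interfaces rather than Python, but the loosely coupled pattern is easy to illustrate: publishers and subscribers never reference each other directly, and an event store connects them. A toy sketch:

# Pattern sketch only, not the COM+ API. Publishers and subscribers are
# decoupled: the event store (a dict here) is the only thing that links them.
from collections import defaultdict

subscriptions = defaultdict(list)  # event name -> subscriber callbacks

def subscribe(event_name, callback):
    subscriptions[event_name].append(callback)

def publish(event_name, payload):
    # The event system, not the publisher, locates interested subscribers.
    for callback in subscriptions[event_name]:
        callback(payload)

subscribe("order_received", lambda order: print("fulfillment got", order))
subscribe("order_received", lambda order: print("billing got", order))
publish("order_received", {"id": 42, "item": "PowerEdge 2450"})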

Manageability

Microsoft Application Center 2000 enhances manageability of applications by providing a unified interface for managing server farms using both NLB and CLB, and by automatically propagating application changes across all servers in a cluster.

¹ See www.microsoft.com/sql/productinfo/Worldrecord.htm for additional information about SQL Server scalability and performance.

KEY REQUIREMENTS FOR A SUCCESSFUL WEB SOLUTION

• Scalability—efficiently and effectively addressing increased system demands
• Availability and reliability—customer confidence in using Web applications for critical needs with minimal downtime
• Manageability—efficient and painless deployment, ongoing operation, and maintenance
• Security—enterprise-critical systems not disrupted or compromised by the bad guys (or the clueless)
• Integration—the ability to leverage existing technology and personnel investments
• Productivity—the ability to build solutions as quickly, inexpensively, and efficiently as possible
• Adaptability—the ability to leverage today's applications as the Web evolves



Security

.NET Enterprise Servers provide access to the major security technologies in Windows 2000 Server, including 56-bit and 128-bit Secure Sockets Layer (SSL)/Transport Layer Security (TLS), IPSec, server-gated cryptography, Digest authentication, Kerberos 5 authentication, and Fortezza.

The Certificate Server in Internet Information Server 5.0 is a critical part of a public key infrastructure (PKI) that allows customers to issue their own X.509 certificates to their users for PKI functionality, such as certificate-based authentication, IPSec, and secure electronic mail. Integration with Active Directory greatly simplifies the process of user enrollment for administrators.

Data Access and Integration

.NET Enterprise Servers can help developers leverage investments in hardware, software, training, and resources and build upon their existing systems. The family addresses four general types of interoperability: data access and integration; host integration; application integration; and network and security integration.

Accessing Data Throughout the Enterprise

Microsoft uses COM components to provide one consistent programming model for access to any type of data, regardless of where that data may be found in the enterprise. These components are both tool- and language-independent. They include the following:

• OLE DB: Microsoft's low-level interface to data across the organization, which builds on the success of open database connectivity (ODBC) by providing an open standard for accessing all kinds of data
• ODBC: The widely used interface for accessing structured data in more than 50 different databases, including Microsoft SQL Server, Oracle, Sybase®, Informix®, and DB2®
• ActiveX® Data Objects (ADO): A language-neutral object model that exposes data raised by an underlying OLE DB provider
• Remote Data Service (RDS): Used to transport ADO object record sets from a server to a client computer to provide a low-overhead, high-performance way to marshal record-set data over a network or the Web
• Collaboration Data Objects (CDO): A set of COM objects that provide access to data stored in Microsoft Exchange

XML Facilitates Data Exchange

XML is a meta markup language whose structural representation of data is easy to implement and deploy. It provides interoperability through a flexible, open, standards-based format that offers new ways to access legacy databases and deliver data to Web clients. Applications using XML are built more quickly, are easier to maintain, and easily provide multiple views of the structured data.

.NET Enterprise Servers include integrated, high-performance XML support for easy data exchange between disparate businesses and enterprise systems. In addition, Microsoft Windows 2000 Server has a high-performance XML parser, support for XML streaming and persistence, XML record-set translation, and support for building XML data islands in Microsoft Internet Explorer.
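For example, two systems that share nothing but an agreement on element names can exchange a self-describing document. A small sketch using Python's standard library; the element names are illustrative:

# Build a small XML document, serialize it, and parse it back --
# no shared binary format is needed between sender and receiver.
import xml.etree.ElementTree as ET

order = ET.Element("PurchaseOrder", attrib={"number": "1001"})
item = ET.SubElement(order, "Item", attrib={"sku": "PE2450"})
ET.SubElement(item, "Quantity").text = "4"
document = ET.tostring(order, encoding="unicode")
print(document)

# The receiving system recovers the structure directly from the text:
parsed = ET.fromstring(document)
print(parsed.find("Item").get("sku"))  # -> PE2450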

Host Integration—Critical to Interoperability

Microsoft Host Integration Server 2000 helps solve the problem of integrating the Windows platform with other non-Windows enterprise systems running on hosts such as IBM® mainframes and AS/400® systems.

The bidirectional integration services of Host Integration Server 2000 free developers from platform boundaries and allow them to build highly scalable, distributed applications that incorporate existing processes and data without recoding or "wrapping" existing code.

Components and Transactions—Available and Reusable

COM Transaction Integrator (COMTI) uses a component builder within Host Integration Server 2000. COMTI enables CICS® and Information Management System (IMS) transaction programs to be accessed as Microsoft Transaction Server (MTS) or COM+ components. As COM+ components, these objects are available and reusable by many COM-compliant development tools. COMTI also allows CICS transactions to initiate MTS transactions.

The Host Integration Server 2000 Component Builder provides developers with a graphical, drag-and-drop environment for encapsulating existing business logic. It automatically generates the appropriate COM interfaces.

Integrating Applications Enables Document Exchange

Microsoft BizTalk Server 2000 enables document exchange among applications within an organization and across various platforms and operating systems. It also provides a standard gateway for sending and receiving documents via the Internet with external trading partners.

Network and Security Integration Services

The network and security integration services within Host Integration Server 2000 support Systems Network Architecture (SNA) and LAN protocols, application programming interfaces (APIs), and integrated security for password synchronization and single sign-on. They also support the Active Directory services in Windows 2000 Server.



Productivity

Companies can use Microsoft technologies and the .NET Enterprise Servers to create an outstandingly productive environment for developing enterprise Web applications. Microsoft tools and developer programs enable rapid application development and ensure high productivity for building distributed Web applications.

Integrated Commerce Capabilities

The .NET Enterprise Servers family provides a full set of commerce capabilities to simplify the process of building sophisticated, customer-centric Internet and extranet selling sites. Business managers can use Commerce Server 2000 to better understand their customers, how those customers interact with their site, and how to best reach existing customers as well as attract new customers.

Some real-time marketing capabilities enabled through Commerce Server 2000 include a high-performance personalization and targeting system, advanced catalog management, and sophisticated business analysis with online analytical processing (OLAP) services.

Commerce Server 2000's partners include domain experts who are building and delivering horizontal and vertical industry-specific applications that provide special functionality (such as international tax calculations, product configuration, or shipping rates) or assist with integration (such as interfaces with popular ERP packages). The range of these solutions includes payment, tax, shipping, logistics, procurement, accounting, customer management, ERP, and EDI.

Productivity-Enhancing Technologies

Windows 2000 Server provides a wide range of tools and technologies designed to make developers more productive when creating applications for .NET Enterprise Servers. Active Server Pages allow developers to combine HyperText Markup Language (HTML) pages, script commands, and COM components to create interactive Web pages and Web-based applications that are easy to deploy and modify.

Windows 2000 COM+ Services, an extension to COM, builds on the COM integrated services and features. This makes it easier for developers to create and use software components in any language, using any tool.

Windows 2000 Transaction Services work with COM+ to make it easier to develop and deploy server-centric applications by offering comprehensive component functionality such as automatic transaction support for data-integrity protection; simple role-based security; and access to popular databases, message queuing products, and mainframe-based applications.

Adaptability

The .NET Enterprise Servers are the first Microsoft deliverable on the .NET vision, which will enable highly interconnected distributed computing over the Internet. .NET will enable computing to move beyond today's world of stand-alone Web sites to an Internet of interchangeable components where devices and services can be assembled into cohesive, user-driven experiences.

The Next-Generation Internet

.NET Enterprise Servers are a comprehensive family of server applications for building, deploying, and managing scalable, integrated Web solutions with fast time-to-market. .NET Enterprise Servers support interoperability, productivity, scalability, availability and reliability, manageability, security, price performance, and adaptability. This new family of applications offers the building blocks for the next-generation Internet.

Darcy Gibbons Burner ([email protected]) is a marketing manager in Microsoft's eCommerce marketing organization. She has worked on development environments for a wide variety of operating systems and programming languages at companies such as CenterLine® Software, Lotus® Development, and Asymetrix®, where she has been a system administrator, developer, trainer, and product manager. Darcy has a B.A. in Computer Science from Harvard University.


For more information, technical support, and service, visit Microsoft at www.microsoft.com.

SOLUTIONS USING BIZTALK SERVER 2000

The following business scenarios describe some typical situations in which companies can use BizTalk Server 2000 as a platform to build their business-to-business (B2B) solutions:

• Trading partner integration through Web-based or traditional electronic data interchange (EDI), supply chain integration, order management, invoicing, and shipping coordination
• Automated procurement, including Maintenance Repair and Operations (MRO), pricing and purchasing, order tracking, and government procurement
• B2B portals, including enterprise portals and extranets, trading communities, electronic catalog management, content syndication, and post-sale customer management
• Business process integration, such as commerce site to Enterprise Resource Planning (ERP) integration, commerce site to legacy integration, and ERP-to-ERP integration



DATA CENTER ENVIRONMENT

Geographic Application and Data Availability with GeoCluster and MSCS

Microsoft Cluster Server (MSCS) enhances the availability of applications based on Windows NT and Windows 2000, but is still vulnerable to certain types of failures. This article explains how GeoCluster, from NSI Software, can provide an added layer of protection.

By Andrew Thibault and David Demlow

Continuous availability of mission-critical applications is vital to sustaining a competitive edge in today's global marketplace. Microsoft Cluster Server (MSCS) provides crucial application failover capabilities to Windows® 2000 and Windows NT™ platforms. NSI® Software GeoCluster™ extends MSCS capabilities over distance for added protection against hardware failures and site disasters. Like NSI's Double-Take® replication product—which provides continuous, byte-level data replication over IP network connections—GeoCluster is designed to provide the most efficient replication possible to maintain maximum application performance.

MSCS Overview

MSCS was introduced in Windows NT Server, Enterprise Edition 4.0. It supports the connection of two servers into a cluster and provides an architecture for developing and deploying highly available applications. A future version of MSCS will support larger clusters. Two major requirements for a clustered application are location independence and shared data access.

Location Independence

Location independence enables users to access an application or service at any location within a cluster. With MSCS, location independence is achieved by creating virtual servers with unique server names and IP addresses that can be moved from one node to the other as required without changing the physical identity of each node. Any node can create and simultaneously host multiple virtual servers, and a single virtual server can be moved without affecting others in the cluster. MSCS organizes cluster resources such as network names, IP addresses, disks, and applications into cluster groups, which can be brought online on one node of the cluster or the other.

Shared Data Access

Another cluster requirement is that application data and metadata be available to each node in the cluster. For application metadata, MSCS provides an application programming interface (API) for cluster-aware applications to write to a shared portion of the registry that is made available to all nodes of the cluster. This API eliminates the need for manual synchronization of registry entries between nodes.

For application data, MSCS provides all cluster nodes with access to data by giving each node in the cluster the ability to access a shared disk via a common SCSI or Fibre Channel bus. Although a shared disk is actually available to all nodes, MSCS is considered a shared-nothing cluster architecture, since the cluster software prevents data corruption by allowing only one node to access the disk at a time. A true shared-disk cluster would allow multiple nodes to access the same disk simultaneously using a distributed lock manager or other mechanism to control access.



A standard MSCS cluster, as illustrated in Figure 1, has physical disk resources that control the ownership of each shared disk within the cluster. For example, if the cluster group SQL that contains the physical disk resources S: and L: is brought online on node 1, drive S: and drive L: will become accessible to node 1 but will be inaccessible from node 2 at that time.

Shared-Storage Limitations

The architecture of a standard MSCS shared-disk cluster protects against the failure of an individual server by allowing the other cluster node to take control of the disk and restart the application. But it leaves the cluster vulnerable to other failures that could leave the shared data unavailable. Specifically, because there is only one copy of the data with this architecture, both the storage of that data and the connection to the storage are potential points of failure.

In addition, the distance between the servers and storage is constrained by both the physical media limitations (ranging from 75 feet for SCSI to 10 km for Fibre Channel) and the performance limitations resulting from the round-trip latency introduced by accessing storage at the speed of light over an increasing distance. Even with massive amounts of bandwidth, reading from and writing to a disk 10 km away adds a measurable and significant delay to each I/O operation that can be unacceptable for many of today's I/O-intensive applications.

Geographic Data Redundancy with GeoCluster

NSI Software's new GeoCluster software enhances MSCS by allowing each node to maintain its own storage with an independent, synchronized copy of the cluster data. Because multiple copies of the data exist, administrators can protect against even catastrophic site-level failures by locating the other node of the cluster with its own data at an alternative facility (see Figure 2). The nodes are connected and data is replicated over an IP connection that can be a LAN for short distances or a Fiber Distributed Data Interface (FDDI) or virtual LAN (VLAN) connection for greater distances. Typically a private IP connection is created to isolate replication traffic from production networks.

Just as MSCS provides a physical disk resource that controls ownership and access to specified disks, GeoCluster provides a replicated disk resource that performs the same functions. If a group containing the replicated disk S: is brought online on node 1, all changes made to drive S: on node 1 are automatically and continuously replicated to drive S: on node 2. In addition, drive S: on node 2 is protected against accidental modifications since it is an off-line replica. If that replicated disk is moved to node 2, the process continues in the opposite direction.

Cluster-Aware Applications

Although an application not specifically designed for cluster operation can often run in a clustered environment with certain modifications, the real benefits of clustering are achieved when the application communicates its status to the cluster software and monitors the status of the cluster itself. MSCS provides a simple API that enables developers to add this level of cluster awareness to their applications.

Failover of non-cluster-aware applications typically requires the installation of an idle copy of the application on the second node to ensure that the correct executable files, dynamic link libraries (DLLs), and registry settings are present. The second copy of the application must remain in an idle state until failover. Another approach is to reverse-engineer the application, determine what registry keys and DLLs are required to move the application to another location, and manually copy those keys and DLLs.

Either approach involves risk and requires careful attention to detail and testing. A single configuration change, service pack, or patch applied on one node that is not matched on the other may prevent a successful failover. An approach that worked for one version of an application may not work at all after a simple patch.

Creating cluster-aware applications allows developers to consider and control the behavior of their applications in a cluster environment. Many cluster-aware applications allow multiple instances of the application to run on one cluster or even on a single cluster node.


Figure 1. Standard MSCS Shared-Disk Cluster (diagram: two PowerEdge server nodes in an MSCS active/active cluster, attached through PowerVault Fibre Channel switches to shared PowerVault storage)


With Microsoft SQL Server 7 or later versions, for example, failover can occur at the individual database level. For instance, the sales and accounting databases might run on node 1 in normal operations, while inventory and a Web back end run on node 2. During end-of-month accounting operations, however, the sales database could be temporarily shifted to a different node to balance the workload more evenly. In addition, application developers can provide their own application-monitoring algorithms to ensure that their applications are not only active but healthy and responsive, and to trigger failover or other corrective actions as necessary.

With a tight level of application integration and the use of one standard, high-availability platform for development and testing, cluster-aware technology can be used for more than emergency failover situations. It can even help eliminate scheduled downtime—often a major cause of server and application downtime. Many applications provide a rolling upgrade process in which the idle node of a cluster is upgraded while off-line, and then becomes the active node while the original active node is being upgraded. Microsoft provides a process to perform a rolling upgrade of the nodes in a Windows NT 4.0 cluster to Windows 2000.

Cluster Quorum Resource

In any cluster, one resource is designated as the cluster quorum resource and is required for cluster operation. The quorum resource is used as an arbitrator in the event that the nodes lose communication with each other or otherwise challenge ownership of a given resource (see Figure 3). The quorum resource performs special roles within the cluster and is designed to prevent the cluster from operating as a split-brain cluster (in which both nodes try to bring the same resources online at the same time, potentially causing data corruption and conflicts).

For example, if a node comes online and is unable to communicate with the other node in the cluster to determine whether it is live, the node that is coming online will force the cluster to arbitrate by trying to access the quorum resource. In a standard MSCS cluster, one physical disk acts as the quorum resource. For arbitration, a SCSI bus reset is performed, followed by a delay of 10 seconds to determine whether the other node of the cluster performs another SCSI reserve command on the quorum disk.

If it does, the challenging node knows the other node is live. If the other node does not reserve the quorum drive, the challenging node reserves it, becomes the owner of the cluster quorum resource, and forms the cluster. If the quorum disk becomes unavailable, the cluster cannot continue to operate and will shut down.

To provide the arbitration function in a distributed GeoCluster, one replicated disk resource is specified as the cluster quorum and configured to monitor multiple locations on the network called arbitration shares. These arbitration shares are standard Universal Naming Convention (UNC) file shares that are accessible by both nodes in the cluster over the network. For a node to take ownership of the quorum resource, it must be able to access a majority of the configured arbitration shares. Therefore, multiple arbitration shares should be located throughout the network for redundancy.
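The majority rule is straightforward to express. The following Python sketch is a conceptual model only; the real arbitration logic lives in GeoCluster's resource software inside MSCS, and the share paths are examples.

# Conceptual model of majority arbitration over UNC arbitration shares.
import os

ARBITRATION_SHARES = [
    r"\\Server1\Quorum",
    r"\\Server2\Quorum",
    r"\\Server3\Quorum",
    r"\\Server4\Quorum",
]

def may_own_quorum() -> bool:
    """A node may take the quorum only if it can reach a majority of shares."""
    reachable = sum(1 for share in ARBITRATION_SHARES if os.path.exists(share))
    return reachable > len(ARBITRATION_SHARES) // 2

if may_own_quorum():
    print("majority reached: this node can form the cluster")
else:
    print("minority: defer to the other node to avoid a split brain")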

Solutions: MS Exchange and SQL Support

Although many applications can be run on a cluster, two of the most common and critical are Microsoft Exchange Server and Microsoft SQL Server. Both are ideal candidates for stretch clustering using the MSCS and GeoCluster combination.


Figure 2. GeoCluster Distance Cluster (diagram: a PowerEdge server node with its own PowerVault Fibre Channel switches and PowerVault storage at site 1, connected over a LAN or VLAN to a second PowerEdge server node with its own storage at site 2)


The Exchange Server 5.5 engine itself is not fully cluster aware, but Microsoft has released a cluster-compatible version of Exchange 5.5 with services that configure within the cluster. It provides active/passive failover of the Exchange services. Exchange 2000 is fully cluster aware and capable of active/active failover. It adds the ability of an Exchange server to host multiple message stores, and allows individual message stores to be moved independently among cluster nodes.

SQL Server 7.0 and later versions are fully cluster-aware applications. An active/active configuration of SQL Server can be supported for failover, and individual virtual servers can be moved independently of any other virtual servers to enable manual load balancing.

GeoCluster and Double-Take Comparison

The most significant difference between NSI's GeoCluster and Double-Take products is that GeoCluster is designed exclusively for Windows 2000 Advanced Server and Windows NT Server, Enterprise Edition 4.0, whereas Double-Take runs on Windows 2000, Windows NT, NetWare, and Solaris® platforms. Double-Take provides failover capabilities in the event of a failure by assigning the entire identity of a downed source server to a target server and allowing applications to be automatically started on the target server. Since it does not use cluster-aware application versions, however, it can neither fail over individual application instances without failing over an entire server nor create virtual servers as MSCS and GeoCluster do.

On the other hand, Double-Take offers greater flexibility in selecting data for replication. Double-Take enables replication of individual files, directories, and volumes, whereas GeoCluster only replicates entire volumes. Double-Take also provides many-to-one failover, whereas GeoCluster's dependence on MSCS currently limits it to two-node clusters. When Windows 2000 Datacenter Server increases the number of nodes allowed in a cluster, GeoCluster will expand accordingly. In addition, while MSCS and GeoCluster require that both cluster nodes be connected to the same logical IP subnet (cluster heartbeats are not routable), Double-Take can be configured to replicate across virtually any LAN or WAN topology.

GeoCluster for Added Protection

MSCS has brought greatly enhanced availability to applications based on Windows 2000 and Windows NT. It gives developers a standard platform on which to develop cluster-aware applications. But because it relies on shared storage subsystems and maintains a single copy of cluster data in a single geographic location, it remains vulnerable to certain types of failures.

GeoCluster extends MSCS by providing the ability to create an MSCS cluster with replicated data volumes. GeoCluster's patented real-time data replication provides local copies of clustered data stored on each node of the cluster, and thus eliminates the potential for a single point of failure. Because data is synchronized using standard TCP/IP connectivity, nodes can be located almost anywhere. Companies that are considering adoption of Microsoft clustering technology or have already invested in it should also consider using GeoCluster for an added layer of site-level disaster protection.

Additional Resources

For additional information and technical support on NSI Double-Take or GeoCluster, please call 877-723-3925 in North America or 33 1 47 77 15 77 in Europe, or visit the following Web sites:

• Evaluation software and additional information: http://www.sunbelt-software.com/dell/dt
• Microsoft/Dell/NSI MSCS disaster recovery video: http://www.microsoft.com/Seminar/1033/20000323DisasterAJ1/Seminar.htm
• Online purchasing: http://gigabuys.us.dell.com/store/catalog.asp?Manufacturer=NSI

Andrew Thibault ([email protected]) is director of Strategic Business Development for Sunbelt Software, distributor of NSI Software's Double-Take and GeoCluster products.

David Demlow ([email protected]) is vice president of Product Management for NSI Software.


Figure 3. Functions of Cluster Quorum Resources (diagram: two nodes arbitrate over IP for the replicated quorum disk Q: by attempting to reach the arbitration shares \\Server1\Quorum, \\Server2\Quorum, \\Server3\Quorum, and \\Server4\Quorum)



DATA CENTER ENVIRONMENT

Dell Installs NetWare 5.1 for Bay City Public Schools

The Bay City Public School District in Bay City, Michigan, is modernizing its computing environment to provide the best possible education for its students. The goal is to provide students with the latest computer and networking capabilities for e-mail and Internet access. This article describes the district's choice of NetWare 5.1 using Dell Custom Factory Integration. The installation included integrating each of the servers into the Novell Directory Services (NDS) tree after installation, including all of the NDS settings and NetWare 5.1 applications.

By Rod Gallagher

Security is a primary consideration during network installation among schools within a district. Students may welcome the challenge of finding security lapses—and they often will find security holes. This task becomes just another learning tool for them. For the school district, however, security holes can be devastating, sometimes to the point of compromising its secured, critical files. Even the best efforts of network administrators may not be enough to combat these issues if the district employs a faulty network design.

To secure its new network, Bay City Public Schools (BCPS) implemented NetWare 5.1, with Novell Directory Services (NDS) 8 as the directory service tying together all the network pieces. This solution provided the required security and the advantages of central administration and logical resource organization for the many students and faculty who would be using the network every day.

Since installation of a NetWare 5.1 and NDS 8 network has become more complicated than its administration, BCPS turned to Dell Computer and Matrix Integration (www.matrixintegration.com) for assistance.

Matrix Integration performed the on-site design, while Dell Computer provided the hardware and custom installation of NetWare 5.1 and NDS 8. In addition, Matrix Integration installed and configured the final applications that would run on each server. The team of employees from BCPS, Matrix Integration, and Dell Computer created the smoothest possible installation.

Dell Custom Factory Integration provided the installation for the school district. Some specific steps, including technical details, are described in this article, as well as information about the latest capability developed by Custom Factory Integration and Novell—NDS tree integration. Custom Factory Integration completed the entire NetWare installation and configuration process in the factory, enabling minimal on-site OS installation.

BCPS Uses a Cooperative Network

BCPS used a cooperative network to provide the WAN services that connect each school. This WAN setup, which encompasses several high-speed technologies, ensures BCPS that each school can use various services provided for the entire district. This assurance also includes outlying schools, which received the same services as those near the data center.



The installation team set up a GroupWise server at each school for staff and student e-mail access. Users could interact through Internet links with one another as well as with others outside the school district.

ZENworks Protects the Desktops

The network also included ZENworks™ to ensure that BCPS can control and secure each desktop for student use. The students can use the desktops, but they cannot perform functions that might harm the stability of the systems. This control of the systems helped BCPS manage the support costs once the students started using them. It also makes the systems easy for students to use; their configurations remain constant each time they log in.

Staff and students access the Internet through the WAN links at each school, then through centralized links. These WAN links allow BCPS to control the content received by each staff member and student. BCPS can also secure its network from malicious attacks and ensure that sensitive data is safeguarded. Each WAN link provides a central point to install a caching or proxy appliance to improve responsiveness to the clients.

NetWare 5.1 Servers Provide Access to Network Services

Each school received its own IP subnet. NetWare 5.1 servers perform the Domain Name Service (DNS) and Dynamic Host Configuration Protocol (DHCP) functions to give the workstations easy access to the network services provided to them. The routers at each school ensure that the schools do not "communicate" with each other except for transmitting the allowed traffic. The subnets at each school provide for security among the sites, and the routers perform the appropriate translation between sites. This arrangement adds some security, but it also created a major problem that will be discussed later in this article.

A centrally located data center holds the systems that control the NDS partitions, which provide the redundancy critical to a high-volume network. If the main server at any school had a catastrophic problem, the security functionality would automatically transfer to the data center. This backup functionality ensures that the network services are available to the students even if problems occur with the local server. This is the best possible solution without moving to relatively high-cost clustering scenarios.

Dell Configures the Servers—at the Factory

The proposed solution for BCPS was to build the NetWare 5.1 servers at the factory. This enables complete control of all settings and options. It also minimizes the number of touches required on-site and dramatically lowers the implementation costs.

Custom Factory Integration was able to install the Support Pack appropriate for NetWare 5.1 and the proper NetWare 5.1 applications (including DNS and DHCP). The on-site installation team then needed only to focus on the more difficult applications without wasting significant amounts of time performing basic installation tasks.

To fully integrate the NetWare 5.1 servers, BCPS required that the servers be integrated into the NDS tree that it had previously set up for this purpose. The factory installation would create objects, such as the servers, volumes, and DNS/DHCP objects, which the NDS tree integration utility then moved into specific locations, based on the school that would be using a particular server. Figure 1 shows a partial NDS tree diagram.


Figure 1. Novell Directory Services Tree (diagram: the Bay_city_tree root holds the Bay City Public Schools organization with Western, Auburn, and Data Center organizational units; each unit contains its school's server with SYS and DATA volumes, a DNS/DHCP object, a print server, and user objects)



Matrix Integration performed the final application installation on-site, although it could have been done easily in the factory.

Response File Enables Unattended Installation of NetWare 5.x

Custom Factory Integration used Novell's response file capabilities for custom NetWare 5.1 installation. The IT staff provided specific information required on each server, and Custom Factory Integration used this information to determine a unique server name, IP address, volume sizing, and NetWare application setup.

The response file enables a completely unattended installation of NetWare 5.x, including each of the various default applications. By modifying each response file, Custom Factory Integration could completely install each NetWare server with unique information. Locations are provided for any piece of information that can be specified during the normal installation process, which significantly speeds up the installation time.¹
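One way to picture the process is a script that stamps per-server values into a response-file template. The Python sketch below is illustrative only; the section and key names in the template are placeholders, and the real entries are documented by Novell (see the footnote below).

# Stamp per-server values into a response-file template.
# The section and key names in TEMPLATE are placeholders, not
# Novell's actual response-file syntax.
from string import Template

TEMPLATE = Template("""\
[ServerSettings]          ; placeholder section name
ServerName=$server_name
IPAddress=$ip_address
SubnetMask=$subnet_mask
""")

SCHOOLS = [
    {"server_name": "WESTERN1", "ip_address": "10.1.1.10", "subnet_mask": "255.255.255.0"},
    {"server_name": "AUBURN1", "ip_address": "10.1.2.10", "subnet_mask": "255.255.255.0"},
]

for school in SCHOOLS:
    path = f"{school['server_name'].lower()}_response.txt"
    with open(path, "w") as f:
        f.write(TEMPLATE.substitute(school))
    print("wrote", path)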

Installation Includes the Latest Support Pack

The software configuration also included the latest Support Pack. Custom Factory Integration added it after installing all of the specific software settings and applications to ensure that the required updates were applied properly. Custom Factory Integration can include any Support Pack desired; BCPS wanted the latest version available.

Custom Factory Integration used scripting tools for a completely automated installation of the Support Pack files. The response file process allows commands to be inserted within the AUTOEXEC.NCF file that will run on the first boot after the server is completely installed. From there, STUFFKEY.NLM (provided by Novell) was used to automatically uncompress the Support Pack files and perform the automated installation of the Support Pack.

Software loading and configuration were complete in less than 30 minutes at the factory. From the BCPS perspective, the significant advantage of this process was that each server was uniquely configured for a specific school within the district and could be plugged directly into the network. The only remaining portion to be done on-site was the final NDS moves, discussed in the next section. Figure 2 shows the Custom Factory Integration process.

Dell Integrates NDS Tree—from the Factory

The NDS integration from the factory was the exciting portion of this installation. The previous theory was that any NDS object must be created in the actual tree in which it would eventually reside; therefore, factory integration was considered impossible. DSMerge was often tried, but the result was corrupted NDS files.

The problem: every server and application requires NDS to store its information. Although this benefits the on-site administrator—the entire network can be managed remotely from a single location—it causes problems for the installation partners because the server or application must "see" the on-site NDS tree. To resolve this problem, conventional wisdom says to use a WAN link to the remote installation site, a time-consuming and costly process. This solution means only large enterprises would have the resources to accomplish it.

Dell Uses "Fake" Tree for Integration

Using the NDS integration tool, however, Custom Factory Integration can install the servers and applications into a "fake" tree—in reality, a temporary NDS tree. Once the configuration has been set (including the server, applications, users, groups, and other objects), the server is shipped to the customer's site. Since this server holds the entire fake tree, the tree is also shipped to the site.

Once the server (and fake tree) arrives at the customer site, the server is connected to the network and powered up. The server does not realize or care that it is in a different environment: it simply needs to see its own configuration. Once connectivity is confirmed to the remainder of the customer network, a simple command kicks off the tree integration utility. This NDS tree integration utility takes each object from the fake tree and transfers it to a specific location in the customer's existing tree. Effectively, the server configuration and all of its objects have been transferred to the tree as though they had been built there at installation.

1 For more information about using response files to automate the installation of NetWare 5.1, see www.novell.com/documentation/lg/nw51/docui/index.html.

Figure 2. Custom Factory Integration Process (flow: initial customer contact; fill out custom NetWare 5.1 form; fill out site code spreadsheet; Dell Custom Factory Integration engineering; factory builds orders; customized servers are shipped; on-site setup; tree integration is performed)


Security is paramount to the team that put this tool together. Backups of both trees are automatically made before any movement occurs. The tool also uses the same process used by the normal NetWare 5.1 installation to provide an extra level of security. The only possibility for a corrupted tree is one that would also have been corrupted during a normal installation—an unlikely event.

Figure 3 shows the NDS integration process.

NetWare Loadable Module Refines NDS Tree Integration
Novell and Dell refined this NDS tree integration process during many months of development. The final result—a NetWare Loadable Module (NLM)—provides the functionality.

In the first step, the NLM verifies security rights to both trees: administration rights must be obtained in the fake tree and in the on-site tree. This verification facilitates the second portion, schema reconciliation. The schemas of the two trees must agree before work can continue; missing this step would cause irreparable harm to both trees.

Once the schemas are the same, time synchronization is established, which can be done very quickly. This enables the NLM to take the objects—such as all server objects, volume objects, DNS/DHCP objects, and any other objects that need to be moved—from the fake tree and create them in the customer's tree.

This move inserts the entire structure of the fake tree into the customer's tree. For example, if the objects being moved were two levels down from the root of the fake tree, the resulting objects will be two levels down from the root of the new tree.

The NLM parameter file enables objects to be moved to a different location. In this scenario, objects at the root of the source tree (in this case, the factory-created tree) can be moved to any location in the target tree. Customers can then quickly insert large numbers of servers into locations throughout the NDS tree with minimal effort.

An Appropriate SLP Overcomes an On-Site Technical Issue
Several technical issues relating to testing had to be overcome while working on-site. These issues would not have occurred in a normal installation.

The IP-only backbone is one issue that could impact NetWare networks that span a WAN. BCPS, which runs an IP-only network, had concerns about services being seen across the routers. The WAN filtered out the NDS service messages that were critical to the NLM's successful integration. The NLM needed to "know" the IP address of the master NDS server, as well as "know" that the service was there.

An appropriate Service Location Protocol (SLP) configuration allows the NLM to "know" where the appropriate services are located. The proper configuration of the SLP files (see the references at the end of this article for details) on each remote server was critical to proper operation across the WAN.
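As a small illustration, a static Directory Agent entry in each remote server's SYS:ETC\SLP.CFG file might look like the following line. The address is hypothetical; the Novell TIDs in the references give the authoritative syntax.

    DA IPV4, 10.1.1.5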

Factory Installation Saves Time and Money, and Ensures Quality
The real question is the value of this service to BCPS. An experienced engineer would easily have taken at least five hours per server to install NetWare 5.1 and the Support Pack and to configure the various applications; most likely, it would have taken longer. At a low rate of $150 per hour, BCPS would have spent about $750 per server. Custom Factory Integration accomplished the same task for under $400 per server, saving the school district over $350 per server.


Figure 3. NDS Integration Process (the fake tree, a Fake Organization containing the Auburn, Western, and Data Center servers with their SYS and DATA volumes, DNS/DHCP objects, print servers, and users, is merged into Bay_city_tree, the existing Bay City Public Schools tree)



This factory installation and configuration also benefited the schools because the entire process is automated and audited to ensure quality control. An on-site person may accidentally input incorrect information, whereas the automated process uses the exact information presented by the customer. Dell's factory process leaves less chance for human error.

Dell's Custom Factory Integration also saves a significant amount of time: the installation and configuration are complete once the servers are on-site, with no additional time requirements. On-site installation would require an additional five hours per server, which can add up to weeks of additional time and a greater chance of problems.

The complete process saved BCPS approximately $6,000. The installation required a week, with much of that time spent troubleshooting on-site connectivity issues and various other setup functions. The actual NDS integration could have been completed within the first day if the goal had been a quick installation.

Dell Custom Factory Integration greatly enhanced the installation of Bay City Public Schools' Novell network. BCPS saved time and money by using Dell's latest capabilities while enhancing the quality of its installation and freeing resources to focus on other problems. Custom Factory Integration enabled BCPS to streamline the process of installing a cutting-edge network.

References
Configuring a LAN/WAN Infrastructure for SLP. Novell. TID 10014467.
How NetWare 5 Clients Locate NetWare Services. Novell. TID 2952823.
SLP Terms and Configuration Reference. Novell. TID 2951567.
Configuring SLP for a NetWare Server. Novell. TID 2951564.
Configuring SLP for a NetWare Client. Novell. TID 2951560.

Rod Gallagher ([email protected]) is the team lead for the Novell team within the Dell Custom Factory Integration engineering group. His team is responsible for creating the NetWare installation process used in the factory, as well as the NDS tree integration that occurs at the customer's site. Rod is a Master Certified NetWare Engineer (MCNE) and a Microsoft Certified Systems Engineer (MCSE).

Next-Generation Four-Way Servers from Dell

The PowerEdge 6400 and PowerEdge 6450 are the next generation of enterprise servers from Dell. These servers are designed for running your business-critical applications:

• Databases
• Internet servers, including Web hosting and e-commerce
• Data warehousing
• Terminal Server Environments (TSE)
• Online transaction processing (OLTP)
• Enterprise resource planning (ERP)
• Communications such as e-mail

They provide building blocks for networked configurations of servers, including clustered configurations, scalable enterprise computing (SEC) environments, and storage area networks (SANs). Fibre Channel or SCSI external storage devices may be used.

For customers who deploy mission-critical applications, the PowerEdge 6400 and PowerEdge 6450 offer higher levels of scalability, availability, and performance than workgroup and departmental servers.

For more information and availability of the PowerEdge 6400 and PowerEdge 6450, visit us at www.dell.com.

The PowerEdge 6400 and PowerEdge 6450 are scheduled for release during the fourth quarter of 2000.


Microsoft Business Solutions Practice—A Deployment Perspective

Dell Technology Consulting can work with customers to help them architect and implement a Windows 2000-based solution. Dell Technology Consulting follows a three-phase approach, from refining and documenting objectives through validation and deployment of the Windows 2000-based solution. This article explains all three phases in detail.

By Jim Plas

The Dell Technology Consulting (DTC) Microsoft Business Solutions Practice uses a well-defined and mature methodology to architect and then deploy Windows 2000-based solutions. The DTC approach, which is based on the Microsoft Solutions Framework,1 uses a three-phase life cycle (see Figure 1) to take a customer project from solution architecture to production deployment.

1 The Microsoft Solutions Framework provides disciplined guidance throughout the IT life cycle of application and infrastructure projects. For more information, see www.microsoft.com/TechNet/Analpln/PRACTICE.asp.

First, the DTC Functional Specification serves as a foundation for design principles, as well as a historical reference for the technical, operational, and business objectives that lead to the recommended solution architecture. The Functional Specification also documents the core solution components and maps function and process to a technology recommendation.

Next in the process is the Technical Design. This phase extends the functional, operational, and technology recommendations made in the previous phase into a Solution Architecture replete with hardware and software architecture, implementation and deployment best practices, an operational framework, and, if required, solution validation services.

The final phase, Implementation Planning and Deployment, extends the Technical Design to include migration planning and production deployment. This phase can range from simply providing installation services for core Active Directory services and a high-availability Web site infrastructure to a complex global deployment of Windows 2000 Active Directory and Exchange 2000 Server.

The First Phase: Functional Specification
During the Functional Specification phase, DTC works with the customer to refine and document the business, technical, functional, and operational objectives of the project. This phase has two steps: discovery and functional specification.

Discovery Examines Goals and Resources
The discovery stage and its documentation focus on variables pertinent to the solution at hand. A combination of surveys, forms, and technical interviews provides the source of information for discovery.

The DTC project team members work with their customer counterparts during this stage to examine the following:




• Project vision and scope
• Shared customer goals
• Business, technical, operational, and functional objectives and requirements
• Existing server and user services design, distribution, and service levels as they pertain to the solution at hand
• Existing physical network infrastructure, including wide area network/local area network (WAN/LAN) and data center environment
• Existing network protocols and supporting protocol infrastructure, such as Windows Internet Naming Service (WINS), Domain Name Service (DNS), and Dynamic Host Configuration Protocol (DHCP)
• Existing directory and network operating system (NOS) design
• Existing deployment of messaging and collaboration systems
• Existing deployment of client hardware, desktop operating systems, and pertinent applications and services

Functional Specification Sharpens Scope
During this step, DTC helps refine and confirm the shared customer goals and discovery results in one of two ways, depending on the scope of the project: DTC will produce a functional specification document or conduct formal project status meetings.

The functional specification document, which is usually a key component of large or complex projects, is used to sharpen the project scope and to act both as a foundation for the resulting technical design and as a historical reference for future enhancements and modifications. Unless the project is complex, customer objectives and requirements can be confirmed during project status meetings before moving forward with a technical design. Figure 2 illustrates the areas of focus for a functional specification document in a typical Windows 2000 migration or deployment.

The Second Phase: Technical Design
Once the Functional Specification phase is concluded, DTC begins building the technical design. The Technical Design phase expands the functional specification by addressing detailed solution architecture, specific build documentation, best-practice information, proof-of-concept executions, and pilot deployments. The technical design, which always includes the detailed solution architecture, may also consist of the following components:
• Hardware and software solution architecture, including an operational framework
• Solution validation
• Server build and deployment documentation

Solution Architecture Acts as a Guide
In a typical Windows 2000 project, the solution architecture acts as an implementation and configuration guide, and contains the following information:
• Finalized design for the Active Directory, including location and replication configuration
• Finalized design for Dynamic DNS (DDNS), DHCP, and WINS, including server build best practices
• Recommended server configuration documentation for domain controllers and global catalog servers, including tuning and best practices
• Server build documentation for standard application servers
• Migration and deployment procedures to be utilized by the build team
• Standard desktop configuration for Windows 95/98, Windows NT 4.0, Windows 2000 Professional, and Macintosh clients
• Finalized security and Group Policy Object definitions and application procedures
• Remote Installation Services (RIS) and IntelliMirror technical specifications, as they apply to the customer's deployment
• Specific solution components of software architecture configuration and deployment
• Exchange 2000 Server front-end and back-end architecture
• Operational frameworks for Active Directory, Exchange 2000 Server, and supporting infrastructure


Figure 1. Phases of a Dell Technology Consulting Windows 2000-based Solution (1: the Functional Specification phase creates the foundation for the design principles of the solution; 2: the Technical Design phase extends the functional, operational, and technology recommendations from the Functional Specification phase into a solution architecture; 3: the Implementation Planning and Deployment phase extends the Technical Design phase to include migration planning and production deployment)



This document, which can also be utilized as an implementation guide for less complex or isolated deployments, can cover all necessary architecture and process recommendations required for implementation, depending upon customer request. In the case of package development (RIS, IntelliMirror, Published Apps, and so forth), DTC will work with the customer in a lab environment to finalize engineering recommendations and requirements before production deployment. This is often done during the Solution Validation component of the DTC Technical Design phase.

Solution Validation Verifies the Proposed Approach
Depending on the complexity of the technology being employed, DTC recommends and often requires at least one of the following solution validation options:

Proof of concept. DTC will work with the customer's team to deploy and manage a proof-of-concept lab. In this lab, DTC will help the customer simulate such events as the migration of a Novell Directory Services (NDS)-based directory to Windows 2000 Active Directory, or build Microsoft Installer packages for use in the controlled deployment of applications through Group Policy Objects.

Proof-of-concept services can be used as a base for knowledge transfer or as a preliminary stage prior to pilot or production deployments. DTC highly recommends proof-of-concept services when introducing new and complex technologies into an environment.

Pilot deployment. Pilot deployments are often used to validate the functionality and ease of use of a proposed solution. Pilots can range from a complex multilocation deployment incorporating various workgroups throughout an organization to a simple 15-user pilot inside an IT department.

Pilots are a critical step in the design and deployment process because they are often the primary vehicle for end-user feedback during the deployment of new technologies in a corporation. Furthermore, pilots help ensure the quality of engineered solutions.

E-validation. DTC's Dell Technology Solution Centers provide the ability to build and validate design assumptions in a fully staffed, fully equipped lab environment. Although customers primarily use e-validation services to prove sizing and scaling assumptions in the e-commerce arena, they can also use e-validation services to test scenarios such as MetaFrame application load or to simulate an NDS migration to Active Directory.


FUNCTIONAL SPECIFICATION DOCUMENT FOR A WINDOWS 2000 MIGRATION

The functional specification document for a typical Windows 2000 migration or deployment will include the following areas of focus:
• Refined understanding of business objectives
• Business drivers and impact on technology solution or politics
• Operational objectives and impact on functionality and technology
• Well-defined project metrics for success, risks, and constraints
• Critical issues and their potential impact on project success
• Out-of-scope requirements and plans to address them
• Required hardware and software architecture
• Windows 2000 Active Directory infrastructure architecture:
  —Definition of forest and domain model
  —Definition of locations and location of global catalog servers
  —Supporting organizational unit (OU) structure
  —Naming conventions
  —Overlap or creation of Domain Name Service (DNS) namespace and transfer requirements
  —Integration of Windows Internet Naming Service (WINS) as required
  —Dynamic Host Configuration Protocol (DHCP)/DNS and client requirements
  —Domain controller (DC) server configuration
• Recommended approach to populating the Active Directory
• High-level policy and operational guidelines as they relate to business, technical, and operational problems and objectives
• Microsoft Exchange 2000 Server and storage architecture and deployment models
• LAN/WAN requirements as they pertain to Active Directory locations and replication
• Inclusion of accessory technologies, such as the following:
  —Windows NT Server, Terminal Server Edition; and Citrix MetaFrame
  —High-availability solutions, such as Microsoft Cluster Server (MSCS) and Network Load Balancing (NLB)
  —NetIQ, HP OpenView, and Dell OpenManage
  —Storage Area Network (SAN), network attached storage (NAS), and backup and recovery solutions
  —Appliances as required

Figure 2. Functional Specification Document for a Windows 2000 Migration



In addition, customers can validate and tune applications or an enterprise environment on Dell hardware at a Dell eCenter of Excellence testing facility.

The Final Phase: Implementation Planning and Deployment
Depending on the particular project, the Implementation Planning and Deployment phase may occur at the conclusion of, or concurrently with, the Technical Design phase of a DTC project. Implementation Planning and Deployment consists of four steps:
1. Organization and update of current-state documentation
2. Location assessment and deployment planning
3. Location build and deployment documentation
4. Production deployment

Note that not all customers use DTC to deploy their production solutions. DTC technical designs and, in many cases, functional specifications are more than adequate tools for the successful deployment of Windows 2000-based solutions.

Organization and Update of Current-State Documentation
As part of the Technical Design phase, DTC will review and update existing current-state documentation to prepare for the deployment of Active Directory infrastructure components, as well as core server packages and client deployments as required. The areas of the enterprise typically incorporated into current-state documentation for a Windows 2000 project include:
• WAN/LAN as it pertains to the solution
• The TCP/IP infrastructure
• The Windows NT domain structure, Novell NDS deployment, or other enterprise directory service
• The operational/support model
• The distribution of client hardware, operating system statistics, and users
• Applications and services as they relate to the solution

Current-state documentation will be used as a basis for migration, deployment, and integration planning as required.

Location Assessment and Deployment Planning
In multilocation deployments, DTC works with its customer to determine the deployment priority. Based on this priority, DTC schedules location deployments or migrations. Additionally, as part of the current-state update, DTC will determine which locations require a physical location assessment.

If DTC determines an on-location visit is needed to evaluate the migration readiness of each desktop, a DTC team will visit the location prior to deployment and build a migration plan specific to that location. A location visit will most likely be unnecessary, however, if DTC can develop an accurate representation of a remote location's hardware makeup, application and services configuration, LAN/WAN environment, and security model during the current-state update.

Location Build and Deployment Documentation
For a multilocation deployment, DTC will develop location build and deployment documentation. This build documentation will provide instructions on installation, configuration, and tuning of each desktop, server, and service, as well as outline steps to prepare a location for the migration of existing technology to the new Windows 2000-based infrastructure. Location build and deployment documentation is often a key component in the successful mass deployment of standard packages to multiple locations.

Production Deployment
In a typical multilocation Windows 2000 or Exchange 2000 Server project, production deployment is coordinated with a centrally based implementation team. Once the servers are installed according to the previously developed location build and deployment documentation, the remote implementation team applies specific security and administrative policies and ensures enterprise functionality.

In addition to the deployment of servers, production deployment will often include the deployment or upgrade of Windows 2000 Professional, Microsoft Office products, or Terminal Server or MetaFrame clients; the implementation of backup and recovery technology; and even the deployment of entirely new desktop hardware. To ensure the success of such complex deployments, DTC will enlist an experienced project manager to coordinate the following:
• Logistics of hardware and service delivery
• Status meetings and the delivery of end-user training and project communications
• Project tasks and team deliverables

Jim Plas ([email protected]) is a practice manager for Dell Technology Consulting. He manages the Microsoft Business Solutions Practice, which specializes in the architecture and deployment of such Microsoft technologies as Windows 2000 Active Directory, Exchange 2000, SQL Server, Site Server, high-availability Web sites, and Terminal Services solutions. Jim is the author of the SAMS publication Deploying Windows NT 4.0 in the Enterprise, a contributing author to many other SAMS publications, and both a contributing author and regular technical editor for Windows 2000 Magazine.


SQL Server Replication Revealed

In mission-critical environments, the primary concern for the organization is often the protection and availability of its data. Without reliable access to secure and relevant data, the smooth operation of the company comes to a painful, unprofitable halt. Data should not only be secure, but also be accessible after a system or catastrophic failure. Replication—provided in SQL Server 7.0—offers this accessibility.

By Rudy Lee Martinez

SQL Server 7.0 provides four methods of replication: snapshot, transactional, Immediate-Updating Subscribers (IUS), and merge.1 These four replication methods behave differently, and each is selected for specific reasons. Latency and site autonomy are important factors for the database administrator (DBA) to consider when deciding which replication method to deploy. Figure 1 depicts the replication methods in relation to latency and site autonomy.

1 According to Microsoft, IUS is an option of both snapshot and transactional replication rather than a distinct method. However, to help explain the implementation of IUS, this article considers it a fourth method.

Data Latency Dictates Pace of Replication
Latency is the time required for data updates on the source server to be reflected on the secondary server or servers. Some systems can tolerate high levels of latency, such as traveling sales recording software that may be synchronized with the central office on a nightly or weekly basis. At the other extreme, some financial transaction systems require central office updates to be available instantaneously at the regional and branch levels. The increasing demand for near-real-time data should continue to drive acceptable data latency levels lower and lower.

Information anytime, anywhere will be the clarion call of the successful business in the future. In addition to data latency, site autonomy is an important criterion when formulating a replication strategy.

Site Autonomy Correlates with Latency
Site autonomy is the degree of independence between the source server and any secondary servers. Some forms of replication require a tight coupling between source and secondary systems. For instance, IUS requires the Publisher and updating Subscriber to be online simultaneously to complete the two-phase commit protocol. The Publisher is the server that makes data available to other servers participating in the replication activity. Subscribers receive and apply the data updates provided by the Publisher through a Distributor.

Other replication methods can endure significant logical separation. With merge replication, a Subscriber can be entirely disconnected and miss one or more data updates, but receive the updates when it is back online. With snapshot replication, entire publications are sent over the wire at each replication cycle; therefore, the next cycle will update disconnected Subscribers after they have rejoined the replication process.





Note that data latency and site autonomy requirements move in unison: systems that require low latency levels also require low autonomy; likewise, systems that can endure high latency will also be more autonomous. Before exploring replication further, consider a related topic: distributed transactions.

Distributed Transactions Are Similar to Replication
Distributed transactions (DTs) are often confused with replication. Both approaches distribute data across multiple servers to achieve high availability of the data. Of all the methods for making data available at many sites, DTs provide the lowest data latency and require the least site autonomy.

The near-real-time financial transaction example mentioned earlier would very likely be accomplished with DTs—when it "absolutely, positively has to be there…in an instant."

An important distinction between DTs and replication is that DTs require the use of the BEGIN DISTRIBUTED TRANSACTION, COMMIT TRAN, and ROLLBACK TRAN SQL commands. DTs also require the Microsoft Distributed Transaction Coordinator (MSDTC), which ships with SQL Server 6.5 and later. MSDTC provides transaction objects for complete transaction management in a distributed environment. DTs, through the use of custom SQL statements and the MSDTC, offer near-real-time data updates when the highest level of concurrency is required.
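A minimal T-SQL sketch of such a distributed transaction follows. The server, database, table, and account names are hypothetical, and the second server must be configured as a remote server with MSDTC running on both machines.

    -- Debit locally and credit on a second server as one atomic unit
    BEGIN DISTRIBUTED TRANSACTION

    UPDATE Accounts
    SET Balance = Balance - 100
    WHERE AccountID = 42

    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRAN
        RETURN
    END

    UPDATE RemoteServer.Finance.dbo.Accounts
    SET Balance = Balance + 100
    WHERE AccountID = 42

    IF @@ERROR <> 0
    BEGIN
        ROLLBACK TRAN
        RETURN
    END

    -- MSDTC coordinates the two-phase commit across both servers
    COMMIT TRAN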

The IUS replication method also uses the MSDTC to perform simultaneous updates, although the fast update is applied to the publication database only. In IUS, updates to the Subscribers are applied via traditional replication methods. Furthermore, IUS replication incurs higher data latency and provides higher site autonomy than DTs (see Figure 1).

Transaction Log Updates Involve High Latency
DTs occupy one end of the latency/autonomy scale; scheduled transaction log updates occupy the other. Like DTs, the scheduled transaction log update method is not a replication method, but it is also used to apply updates across database servers. This procedure is similar to an electronic sneakernet that, on a scheduled basis, simply restores transaction logs from the source database to the databases on secondary servers to maintain a degree of concurrency. This method may be acceptable when transaction logs are reasonably small and high latency is acceptable.
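A bare-bones sketch of one such scheduled cycle follows; the database, share, and file names are hypothetical, and the secondary copy must have been initialized from a full backup.

    -- On the source server, dump the transaction log to a shared location
    BACKUP LOG Sales
    TO DISK = '\\SECONDARY1\Backups\Sales.trn'

    -- On each secondary server, apply the log; WITH STANDBY keeps the
    -- database readable between restores (undo file name is hypothetical)
    RESTORE LOG Sales
    FROM DISK = '\\SECONDARY1\Backups\Sales.trn'
    WITH STANDBY = 'C:\MSSQL7\BACKUP\Sales_undo.dat'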

The PDS Metaphor Elucidates Replication Methods
To understand SQL Server replication methods, consider the Publisher/Distributor/Subscriber (PDS) metaphor and the transaction data constructs that constitute the replication. The PDS metaphor describes the arrangement of publication, distribution, and subscription servers in the logical replication architecture. The logical arrangement, when implemented, is manifested in physical replication models.

The simplest, most common default replication model is central Publisher/central Distributor/multiple Subscribers. This model can be implemented with the Publisher and Distributor on the same server, or, to reduce processing and I/O load, the two functions can be deployed on separate servers. Other models also exist, such as multiple Publishers/multiple Distributors/central Subscriber.

All of the replication method examples contained herein use the default model. As the metaphor names make evident, the Publisher is responsible for producing publications that are distributed by agents on the Distributor to one or more Subscribers. Publications consist of one or more articles. Articles are the lowest-level data construct in replication. The DBA can customize an article to provide various cuts, or views, of the data. Tables used in an article can be vertically or horizontally partitioned to provide tailored replication data to different Subscribers.
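Although the wizards discussed later automate these steps, the same constructs can also be created with system stored procedures. The following hedged sketch uses hypothetical names and only minimal parameters; many more options are available.

    -- Run on the Publisher once a Distributor has been configured
    USE Northwind

    -- Mark the database as published for transactional replication
    EXEC sp_replicationdboption @dbname = 'Northwind',
        @optname = 'publish', @value = 'true'

    -- Create a continuously replicated (transactional) publication
    EXEC sp_addpublication @publication = 'NorthwindPub',
        @repl_freq = 'continuous', @status = 'active'

    -- Add one table to the publication as an article
    EXEC sp_addarticle @publication = 'NorthwindPub',
        @article = 'Employees_Article', @source_object = 'Employees'

    -- Push the publication to a Subscriber
    EXEC sp_addsubscription @publication = 'NorthwindPub',
        @subscriber = 'SUBSCRIBER1', @destination_db = 'Northwind',
        @subscription_type = 'push'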

Subscribers come in two forms: push and pull. Push Subscribers, the default and most common method, have the replication updates pushed to them by the Distributor. Push Subscribers are defined and managed on the publishing database. The two substantial benefits of push subscriptions are that they are more secure than pulls and that they enable centralized management. The DBA can create many push subscriptions at one time.


Figure 1. IUS Provides Higher Data Latency and Site Autonomy (methods plotted from low to high latency and site autonomy: distributed transactions, transactional IUS, transactional replication, snapshot IUS, snapshot replication, merge replication, scheduled transaction log updates)



Pull subscriptions, on the other hand, are initiated and managed at each Subscriber site, which makes them more flexible. This gain in flexibility is offset by reduced security, since control is relinquished to each subscription server. Now that the groundwork is in place, consider the four replication methods.

Snapshot Replication Updates Periodically with Bulk Transfers
Snapshot replication is the easiest form of replication to set up and maintain. It consists of the periodic bulk transfer of an entire publication from the publication server, through the distribution server, to one or more subscription servers. It incurs a high degree of latency because the publications are refreshed only periodically.

Snapshot replication also provides a high degree of site autonomy. If a particular Subscriber is disconnected and misses a replication cycle, no problem arises, because the entire publication will be applied at the next cycle when the Subscriber is reconnected. Since continuous monitoring is not required, snapshot replication requires little overhead except at the time of transfers, which can be scheduled during off-peak hours.

As shown in Figure 2, at the time of replication, the Snapshot Agent running on the Distributor reads the publication database and creates files in the snapshot folder, which is also on the Distributor. The Distribution Agent on the Distributor uses the publication data in the snapshot folder to update the Subscribers. Client 2 in Figure 2 can be around the corner—or around the world—and will see the new information on the Subscriber after the update. Note that the distribution database stores only job history and error information; unlike other forms of replication, it does not store publication schema and data. With snapshot replication, the Distribution Agent simply provides pass-through functionality.

Transactional Replication Updates Frequently with Small Changes
In transactional replication, history and errors—as well as publication schema and data—are stored in the distribution database on the distribution server. The transactional replication process begins with snapshot replication:
• The Snapshot Agent reads the publication database, populates the snapshot folder that is read by the Distribution Agent, and distributes the update to the Subscribers.
• When client 1 in Figure 3 updates the publication database, the changes are stored in the transaction log on the Publisher.
• The Log Reader Agent on the Distributor periodically reads the Publisher transaction log and applies the changes to the distribution database.
• The Distribution Agent reads the changes from the distribution database and distributes them to the Subscribers.

Client 2 will again see the effect of the new transactions applied to the subscription database.


Figure 2. Snapshot Replication (client 1 updates the publication database; the Snapshot Agent writes the transaction data to the snapshot folder on the distribution server, whose distribution database holds only job history and errors; the Distribution Agent applies the data to the subscription database, where client 2 sees the effect of the transactions)

Figure 3. Transactional Replication (the Snapshot Agent sends the initial data and schema through the snapshot folder; the Log Reader Agent moves subsequent transactions from the Publisher's transaction log into the distribution database; the Distribution Agent applies them to the subscription database, where client 2 sees the effect of the transactions)



The two significant differences between snapshot and transactional replication are the size and frequency of the updates. The snapshot method replicates sizable publications infrequently; in contrast, the transactional technique updates Subscribers frequently with small changes.

IUS Replication Propagates Discrete Updates to Others
Like transactional replication, IUS first applies snapshot replication to all Subscribers. IUS then propagates publication database changes to the Subscribers via the standard transactional method or snapshot method, depending on the chosen replication strategy. The distinguishing characteristic of IUS is that it allows updates made at a particular Subscriber to be propagated to all other Subscribers.

Note that in Figure 4, client 2 can send an update to the Subscriber. The MSDTC then uses a two-phase commit to ensure the same update is applied to both the Publisher and the Subscriber at the same time. The MSDTC copies the immediate update only to the Publisher and not to any other Subscribers; otherwise, the result would be a distributed transaction.

The timing and path of the update to the other Subscribers depend on the choice of either transactional or snapshot IUS. Specifically, updates with transactional IUS happen quickly and use the Log Reader Agent, whereas updates with snapshot IUS occur less frequently and use the Snapshot Agent.

Merge Replication Enables Subscriber Autonomy
Merge replication is similar to IUS replication: both allow Subscribers to change data on the Publisher, and the other Subscribers will reflect those changes. As with all other methods, merge replication begins with snapshot updates on the Subscribers. The major difference between merge and transactional replication is that with merge, each site is entirely autonomous: no cross-Subscriber dependency exists.

The Merge Agent, which runs on the Distributor for push subscriptions and on the Subscribers for pull replication, monitors and distributes all changes on the Publisher and Subscribers after the initial data population. Since data conflicts can occur, the Merge Agent uses a conflict table located in the publication database to resolve them. Conflicts are resolved via the default "first in wins" rule or a custom prioritization configured by the DBA. After a conflict has been resolved, the Merge Agent can perform Subscriber synchronization at any time, depending on Subscriber connectivity (see Figure 5).


Figure 4. IUS Replication (client 2 updates data at the Subscriber; the MSDTC applies the same update to the publication database via two-phase commit, and the snapshot or transactional agents propagate subsequent transactions to the other Subscribers)

Figure 5. Merge Replication (clients update data at both the Publisher and the Subscriber; the Merge Agent synchronizes the changes and resolves collisions through a conflict table in the publication database)


Plan Properly to Meet Subscriber Needs
As with all IT efforts, proper planning is the key to successful implementation. Factors to consider when planning a replication strategy include the type of data to be published, the subscribers to the data, and the location of the subscribers.

As mentioned earlier, an article can consist of partitions (vertical or horizontal) of a table, or it can be the entire table. Publications can also contain more than one article. This flexibility provides a high degree of customization that enables DBAs to meet the needs of the subscriber communities. The DBA should select and partition data to minimize transmission times, bandwidth utilization, and storage requirements. Optimization requires thorough analysis from the beginning.

The first step in the analysis phase is to understand and document the requirements of the subscribers:
• How often will they need updates?
• What degree of security is expected of the published data?
• Do the Subscribers require update ability?
• Are the Subscribers located on a LAN, or will WAN traversal be required?
• Are all subscription servers always online?
• What are the acceptable levels of data latency and site autonomy?

These are some of the questions that must be answered before the DBA can deploy a successful replication strategy.

SQL Server Eases Replication Setup
SQL Server has taken much of the guesswork out of replication setup. Functional wizards walk the DBA through most of the replication tasks, such as Distributor selection, publication/article creation, and agent scheduling. Figures 6 through 10 depict some of the necessary steps for replication definition.

The first step for the DBA is to identify the Distributor. Since the Publisher is dependent on the Distributor, the DBA must identify a distribution server before establishing publication and subscription servers. The DBA can choose to collocate the distribution and publication databases or to put them on separate servers (Figure 6).

In Figure 7, the DBA selects the publication method. Next, the DBA may choose to select IUS (Figure 8). Note that if the DBA selects merge in Figure 7, the "Allow Immediate-Updating Subscriptions" step is not presented. After the DBA has defined the Distributor and replication method, article and publication setup can occur.

In the example shown in Figure 9, the Employees and Products tables are selected to create their respective articles. Note that stored procedures can also be published in addition to data from tables. The DBA will combine these articles into the Northwind publication.


Figure 6. Choosing a Distributor

Figure 7. Choosing a Publication Type

Figure 8. Allow Immediate-Updating Subscriptions



In Figure 10, the DBA customizes the Employees_Article to disregard the Title, TitleOfCourtesy, and BirthDate columns; those columns will not be included in the article or in the publication. The DBA can also apply SELECT statements to the tables in the articles to filter the data by rows. This degree of flexibility enables the DBA to define articles and publications that address the varied requirements of many subscribers.
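The same filtering can be expressed through system stored procedures. This sketch mirrors the wizard example above; the names follow the earlier publication sketch, and the calls are illustrative, not prescriptive.

    -- Vertical partition: drop three columns from the article
    EXEC sp_articlecolumn @publication = 'NorthwindPub',
        @article = 'Employees_Article', @column = 'Title',
        @operation = 'drop'
    EXEC sp_articlecolumn @publication = 'NorthwindPub',
        @article = 'Employees_Article', @column = 'TitleOfCourtesy',
        @operation = 'drop'
    EXEC sp_articlecolumn @publication = 'NorthwindPub',
        @article = 'Employees_Article', @column = 'BirthDate',
        @operation = 'drop'

    -- Horizontal partition: a row filter that replicates only U.S. employees
    EXEC sp_articlefilter @publication = 'NorthwindPub',
        @article = 'Employees_Article',
        @filter_name = 'flt_Employees_USA',
        @filter_clause = 'Country = ''USA'''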

Use Replication Monitor to Supervise Replication
After the DBA has defined and scheduled replication, monitoring can be done via the Replication Monitor tool, which ships with SQL Server. This tool enables the DBA to view the status of replication agents, monitor replication progress, and troubleshoot potential problems with the Distributor.

After the server has been classified as a Distributor, Replication Monitor appears in Enterprise Manager on the console tree under the active server. The DBA can use it to view a list of Publishers, publications, and subscriptions that the Distributor supports. The tool supports monitoring of replication schedules, near-real-time "heartbeat" status, and the history of each replication agent.

In Figure 11, the DBA uses Replication Monitor to display the status and session details of a Snapshot Agent.

Replication Protects Against Data Loss
Replication, in one or more of its forms, can provide an organization with the necessary degree of protection from disastrous data loss. The reliable availability of a company's data makes the effort to implement and maintain a replication strategy worthwhile.

Replication also reduces the load placed on primary online transaction processing (OLTP) systems. If the DBA uses replication to maintain reporting databases in an online analytical processing (OLAP) environment, clients can access the OLAP servers for most or all of their reporting needs and decision support services.

Rudy Lee Martinez ([email protected]) is the program manager for the DellStar Enterprise project and also manages a team of software developers in Product Group IT. Rudy holds an M.S. in Computer Science from Wake Forest University and is a Microsoft Certified Systems Engineer (MCSE), Solutions Developer (MCSD), Database Administrator (MCDBA), and Trainer (MCT).


Figure 10. Filtering Table Columns

Figure 11. Session Details of a Snapshot Agent

Figure 9. Selecting Specific Articles


Choosing the Right Internet Traffic and Content Management Solution

Many products are available to help businesses enhance the performance, reliability, and economic expansion of their Internet site. These products all support a concept known as Internet Traffic and Content Management (iTCM). This article describes various iTCM solutions, including Dell's PowerApp.BIG-IP, to help you decide which one is right for your business.

By David Barclay

The rapid growth of the Internet has caused an increase in the deployment of various Web technologies. It also has contributed to the overall complexity of managing content, networks, and applications. As businesses look to leverage the expanding Internet market, they face an ever-growing gamut of protocols to support, devices to manage, and content and applications to deploy.

In addition, Web traffic through certain sites doubles every 100 days,1 and network computing hardware often cannot keep up with the increased demands. For organizations such as Web hosting companies, dot-coms, and Internet service providers to remain competitive, they need a reliable solution for handling this traffic.

1 Internet Research Group (formerly Collaborative Research). The 1999 IRG Internet Traffic and Content Management Report, page 8.

Various products are available to enhance the performance, 24×7 reliability, and economic expansion of public and private Internet sites. Solutions that offer these capabilities come in many varieties, but all serve to facilitate a concept known as Internet Traffic and Content Management (iTCM).

What distinguishes these products from one another? What makes one solution more viable than another? The current flood of messages from various vendors, analysts, and reviewers has done little to clarify the trade-offs; rather, it has only added to the confusion.

Growing Demand for Internet Business Applications
While business use of the Internet began conservatively, it has rapidly grown into a sophisticated array of e-commerce and content personalization applications for consumers and businesses alike. Figure 1 shows how the value of Internet business applications has increased in proportion to their complexity.




The bottom line is that businesses are extending their online focus to applications, networks, protocols, and content, all of which are becoming more complex each year. Combined with this shift is the need to provide fail-safe access to accurate content while increasing processing capacity and availability and simplifying management.

Internet Traffic and Content Management Defined
As Internet sites begin to handle more traffic and support more complex protocols and services, availability and fault tolerance become critical needs. Every transaction and user interaction must be 100 percent reliable to maintain optimal quality of service (QoS). To address these needs and prevent overload of one specific server, sites often replicate data across an array of servers.

As more servers are deployed, the site becomes more costly and more difficult to manage, and there is little assurance that one server will not become overloaded, provide incorrect responses, or fail outright. The site needs a more intelligent product that can manage incoming traffic—a function known as load balancing. With load balancing, traffic can be dynamically distributed across a group of servers running a common application—yet the group appears as one server to the network.

Load balancing distributes traffic more efficiently, offers greater economies of scale, and provides significantly greater fault tolerance. Internet traffic and content management products, however, not only encompass load-balancing capabilities; they also intelligently monitor and manage the health of servers and Internet content, and make decisions on where to route traffic to optimize site performance and availability. This intelligence ensures that users are connected to the most available server, providing excellent and predictable QoS and the right content.

Preventing Service Interruptions and Guaranteeing Availability
With today's Web applications, service interruptions can be costly and can occur in many forms. Server and software failures are the most common, because hardware, operating systems, and applications may simply stop responding. Content failures, error messages, and incorrect data can infuriate users. Finally, heavy traffic and network congestion or failure can easily limit site availability.

iTCM products, therefore, must be designed to prevent these interruptions and guarantee availability. A solution not geared to provide high availability does not maximize the return on investment for Internet and intranet connectivity. Therefore, as you evaluate products, you should look for a balance of the following:
• QoS-based availability
• Assured and continuous operation with zero downtime
• Simplified, consistent management across a range of protocols
• Robust technical support and ease of installation

Using a single solution to provide these critical elements ultimately can deliver tremendous cost savings, enhance the user experience, and provide significant long-term business value.

Internet Traffic and Content Management Solutions
iTCM products typically offer varying levels of expandability, availability, and speed. The following sections describe the various product offerings.

Software-Only Solutions
With software-only solutions, software is installed directly onto the servers in the array. This enables network managers to perform highly granular server management operations, such as analyzing CPU and memory utilization and managing agent-based content. In theory, software-only solutions provide cost savings and faster performance because traffic does not have to pass through an additional device. In addition, loading software onto all servers in the array eliminates a single point of failure, since the entire site will not go down should one server fail.

Some software-only products allow synchronization of data among servers in a cluster. This feature can be useful if the servers do not maintain identical content and instead perform complementary tasks, which require different servers to work in concert to complete a single content request.

Although software-only products eliminate a single point of failure and enable in-depth analysis and synchronization of data, there are trade-offs to recognize with this solution. Most notable is server management: since the software is loaded on all servers in the array, maintaining it can become task-heavy over time, because all servers will require updates.


Figure 1. The Growth of Internet Business Applications (value rises with complexity, from early-stage uses such as corporate home pages, HR information, market/product information, and informational search to complex solutions such as training, sales force automation, business-to-consumer full e-commerce, and business-to-business supply chain automation)



Another consideration for software-only solutions is OS dependency. Because the software is installed directly onto the servers, businesses are locked into supporting specific platforms. Security concerns also can arise, because this type of solution exposes server IP addresses directly to the user.

Switches
Switches can perform fast load balancing at layers 2 and 3 in hardware managed by a central processor, which executes background tasks such as routing, table, and network management. This type of solution enables fast balancing of static content, and many solutions offer high backplane-speed support. In addition, switches have the potential to connect to multiple high-speed Ethernet ports simultaneously, further optimizing speed. But one irony of this configuration is that while switches provide multi-gigabit throughput, most sites connect to the Internet using T3 (45 Mbps) or slower lines, rendering the additional capability moot.

This switch-based architecture, however, has limitations. For example, packets that require exception handling at the network level (layer 4) must be separately opened and examined to determine their destination port. This process uses the switch's central processor and may compromise its performance. In other words, executing per-frame processing on a centralized processor can limit the total frame throughput of the device.

Often missing from switch solutions are functions such as Secure Sockets Layer (SSL) session ID tracking, user authentication, and application health checking. Their absence further limits the intelligence and viability of switches for more sophisticated tasks, such as e-commerce.

Keep in mind that when evaluating backplane speed, packet balancing occurs only as fast as the uplink. Expandability comes only through a cascading network of very expensive, large switching chassis—and an additional layer of devices is required to achieve full redundancy. Furthermore, most switching solutions do not contain viable wide area network, high-availability load-balancing features for extended networks.

Routers and Caching Systems
iTCM products complement routers and caching systems. An iTCM product, for example, can offer additional expandability, availability, and security above basic routing and caching functionality. The product can also reverse-manage the load balancing between cache servers and routers, further improving performance. This was the case in the network design implemented by NASA a few years ago, as demand for information on the John Glenn "return to space" mission grew at a feverish pace. NASA employed a dynamic configuration—combining an appliance-based iTCM product with routers and cache servers—to meet user demand.

Appliance-Based Products
Turnkey load-balancing appliances are "network appliance" software/hardware products that offer full IP support and easily enhance traffic performance. These appliances are typically deployed in redundant pairs and placed between the server farm and the network—operating jointly as parallel and hot-spare iTCM devices, as shown in Figure 2. This redundancy offers fail-safe, cost-effective operation and significantly minimizes maintenance. Servers can be upgraded and managed without any downtime or effect on the network.

One key advantage of an appliance-based iTCM solution is that it provides OS independence, enabling the organization to implement any type of application or Web server into the mix. Also, the design approach of an iTCM appliance offers a stable balance between high functionality, speed, dependability, flexibility, and cost-effectiveness. Since an iTCM appliance is a software solution combined with a hardware platform, it can be upgraded easily.



Figure 2. Internet Traffic and Content Management Deployment (traffic flows from the Internet through a router and firewall to a redundant pair of PowerApp.BIG-IP 200 appliances in front of the server array)



For institutions that need continuous e-commerce andsecure connections as well as intelligent application inter-action, these products are valuable. Also, iTCM appliancesolutions are primarily subsets of, and not replacementsfor, applications whose functionality is highly distributedacross large clustering systems.

One example of an appliance-based iTCM solution isPowerApp.BIG-IP, a new appliance that Dell Computerwill make available in October 2000. PowerApp.BIG-IPutilizes software—BIG-IP—licensed by F5 Networks™,Inc., and a 2U server hardware platform developed byDell. The BIG-IP software offers a load-balancing solutionthat provides high availability and intelligent load bal-ancing of Web traffic and content.

Choosing the Right iTCM Solution Several criteria for selecting the right iTCM product for yourbusiness follow.

DependabilityThe ability of an iTCM product to eliminate a single point of failure determines its dependability. A common way toachieve dependability is by offering failover capability througha redundant pair of iTCM devices. PowerApp.BIG-IP, forexample, uses a method called session state mirroring toenable failover in less than one second—with uninter-rupted service to customers. For instance, if a customer is transferring a large file via File Transfer Protocol

and PowerApp.BIG-IP fails over from the active to thestandby controller, the file transfer will continue—uninterrupted. Session states are mirrored in RAM, so no end-user sessions are lost.

Optimal Quality of Service and High Availability

To best meet your networking demands, you should look at the scope and sophistication of load-balancing intelligence within each iTCM product. Load-balancing options should include numerous traffic distribution algorithms—such as round robin, round trip time, and packet rate—and QoS features that track and gain intelligence based on current conditions, thus improving performance with each request.

Specifically, an iTCM product should detect errors and reroute traffic automatically by actively querying content accuracy and application performance. By performing application-layer (layer 7) testing, the connected user is assured that the iTCM product has thoroughly checked for all the different processes involved in creating a dynamic page before routing service requests to the server. This testing eliminates the occurrence of error messages because of overloaded servers, software failures, or bad or missing content.

iTCM products should also provide intelligent persistence features, such as SSL session ID tracking, to ensure that users stay connected to a single server while completing a transaction. This feature is invaluable for handling transactions from user environments such as AOL, where numerous users can be assigned the same IP address. This multiple assignment of IP addresses can confuse an iTCM product and concentrate traffic on a single server in the server farm. Instead, your iTCM product should read specific session IDs from an SSL transaction, ensuring the user is uniquely identified and delivered accurate and timely content until the transaction is complete.

Another key requirement to ensure high availability is to offer functionality known as traffic prioritization. This function enables the network to offer varied access service levels based on traffic source, type, or destination—guaranteeing access. For example, rules can be established that always give priority to credit card transactions or to content from or to a specific domain. Traffic prioritization provides the most flexibility and further optimizes availability to a broad range of business applications.

Similarly, the product should be able to drill down and identify specific types of traffic based on HTTP header information. This capability gives an e-business greater control over a wider range of traffic, because the business can see more granular levels of traffic details at the application layer. For maximum flexibility and control of traffic, the product should be able to recognize and provide intelligent, high-availability load balancing to any HTTP header, including HTTP version, HTTP host field (also known as URL), and the HTTP method used by the request.

CRITERIA FOR CHOOSING AN iTCM SOLUTION

Dependability: automatic failover; session state mirroring

Quality of Service/Availability: session ID tracking; traffic prioritization; agent-free application checking; multiple load-balancing algorithms; launch executables (SSL, SQL checks); proactive checking; content verification; SSL acceleration

Security: Network Address Translation protection; packet filtering; thwarting Teardrop, Land, ping, and denial of service attacks; server addresses not revealed to the outside world

The product also should offer SSL acceleration, which is especially critical in e-commerce applications. SSL acceleration is a function used to off-load SSL processing from servers, enhancing their performance while improving response time for and traffic management of customer transactions. SSL acceleration improves the performance of e-commerce servers and provides security, speed, and traffic management during business-critical online transactions—from a single location—without the cost of installing additional hardware or software on each server.

Security

Since the iTCM product is typically placed in front of the server farm, it must provide enhanced security measures to protect against common attacks and to route traffic around hacked servers. It should also mask the well-known ports of the actual servers being load balanced to prevent unauthorized access to these ports.

Vendor Credibility and Support

When choosing a vendor for a load-balancing product, knowing the vendor's customers and the vendor's history of customer service are important factors. The vendor should provide proven examples that showcase e-commerce, high customer-traffic loads at varying times of the day, and distributed networks across multiple continents.

Technical support is another critical evaluation area. Vendors should have a strong reputation and investment in tech support, and offer on-site installation and training of their system. Seek customer references and look for specialized technical expertise in load balancing or iTCM, as opposed to broad networking support. Your diligence in understanding these areas will ensure that your installed system will be optimized for your application and traffic requirements.

It Is Your E-Business—So Choose Carefully

Continually enhancing your Web application's availability, reliability, and speed is critical to optimal Internet QoS. Today's users who want to purchase products or services on the Internet will quickly choose a competitor if they are faced with less than optimal performance, or missing or bad content on your Web site. These shortcomings can quickly result in lost business, lost brand equity, and lost market share. To help you choose the right load-balancing product for your Web infrastructure, consider the following questions:

Availability
- How does your company ensure 24×7 availability of hardware, software, and applications?
- What metrics are you using to quantify the cost of downtime or erroneous responses?
- How are you testing applications and links prior to deploying them?

Scalability
- How is your company measuring, and then forecasting, for network growth?
- How do you handle system overload problems?
- How do you accommodate the need for additional servers or other network capacity?
- Describe your company's growth rate.

Management
- What effect does taking a server off-line have on your clients?
- What amount of downtime do you experience when you add capacity?
- What are the single points of failure in your network?

Carefully assessing these questions as they relate to your business can simplify your task of selecting a load-balancing product to manage your traffic needs for today and tomorrow.

David Barclay ([email protected]) is a product marketing manager for Internet Server Products in Dell's Enterprise Systems group. Prior to joining Dell, David was a product marketing manager at Compaq® Computer Corporation in its Desktop and Portable PC divisions. David has a B.B.A. and an M.S. degree in Marketing from Texas A&M University.

SECURITY FEATURES OF POWERAPP.BIG-IP

PowerApp.BIG-IP offers several features to provide enhanced security:

- Network address translation (NAT). Uses port mapping to map well-known ports—such as 80, 443, 20, and 21—to any port number on the actual servers, providing greater security by making it difficult for intruders to identify what services are running on which port.

- Packet filtering. Used to limit or deny access to the server farm by monitoring the traffic source, destination, or port.

- Reaps idle connections—to prevent denial of service attacks. Can identify any services and ports that receive illegal access attempts by identifying the affected ports, frequency of attempts, and source IP address of the attacker.

INTERNET ENVIRONMENT

Linux Firewalls on Dell PowerApp Servers

Dell PowerApp servers ship with the Linux OS and can be configured for low-cost firewall and Network Address Translation operation. This article explains how to set up a basic functioning firewall in Red Hat Linux 6.1.

By Zafar Mahmood

A firewall is a system that protects the local network from the external network. It is the sentinel through which all network traffic must pass before it can enter or exit the local network. In its simplest form, a firewall is a filtering router that screens out unwanted traffic.

The Linux kernel includes a Linux firewall with accessible source code. Therefore, unlike most commercial firewalls, the Linux firewall is highly customizable and can be configured according to each customer's security requirements. This article focuses on a Linux firewall based on the kernel version 2.2.12, which is shipped with Red Hat® Linux 6.1.

The Linux firewall has special capabilities because its direct integration into the OS kernel results in better performance. In contrast, firewalls on other operating systems are layered on top of the OS core or, at best, linked into the network stack. As illustrated in Figure 1, the Linux kernel integrates the key firewall functions with its masquerading techniques.

Linux Firewall Setup and Testing

This article describes how to use open source information available in the marketplace to set up an operational firewall based on the Linux architecture of Red Hat Linux 6.1. The process consists of four steps:

- Plan network design and infrastructure layout
- Build and test the Linux kernel
- Set up the firewall
- Deploy and test the firewall

Plan Network Design and Infrastructure Layout

For network design and infrastructure planning, the world is divided into the external network and the internal network. The internal network is what the enterprise wishes to protect. The external network is the public Internet. The firewall connects both environments and forms the point of contact between the internal network and the external one (see Figure 2). Only the firewall is visible to the external world.

All outbound traffic from the internal network is masqueraded at the firewall; that is, the TCP/IP packet information of the internal network machines is replaced by the information of the firewall. If an answer to a sent-out packet is received, the firewall converts the answer packet back to the header contents expected by the originator with a technology called Network Address Translation (NAT). Figure 3 lists the TCP/IP configuration used to set up the Linux firewall example described in this article. Figure 4 illustrates the configuration pictorially.

Since the Linux firewall is part of the OS, special infrastructure requirements for the firewall are minimal. Any existing PowerEdge server or PowerApp appliance with Linux support can be used. One processor with 256 MB of RAM can handle the firewall workload, which is typically not CPU intensive. To improve availability and provide some level of data redundancy, anything from a low-end solution such as a mirrored drive to a high-end solution that uses a RAID controller such as the PERC-2/SC can be used.

To separate inbound and outbound traffic, a second network interface adapter for the firewall must be added to the default configuration. A dual-port network interface card (NIC) can be used instead of two NICs.
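As a rough sketch, the two interfaces could be brought up by hand with the addresses from Figure 3 (on this particular network, eth1 actually receives its address through DHCP, as described later, so the eth1 line is illustrative only; Red Hat's network scripts would normally make the settings persistent):

/sbin/ifconfig eth0 172.16.0.1 netmask 255.255.0.0 up        # internal interface
/sbin/ifconfig eth1 143.166.32.84 netmask 255.255.255.0 up   # external interface
/sbin/route add default gw 143.166.227.254                   # external gateway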

The first step in setting up the firewall infrastructure is OS preparation. This step can usually be avoided by ordering Dell PowerEdge servers and PowerApp appliances with the Linux OS preinstalled by Dell. Dell recommends the Red Hat 6.1 Enterprise Edition, which is already configured to run with the Dell PowerEdge and PowerApp server products. Consult www.redhat.com for detailed information or to download the latest version.

Figure 5 shows a typical layout of the file directory. At least 2,000 MB should be allocated for the /usr directory; other partitions need no more than 512 MB. These sizes vary depending on the additional applications and packages installed.

The basic installation should contain the following packages: X-Windows, networked workstation, development, and kernel development. The windowing interface package using X-Windows is not required, but configuring the Linux kernel for use with a firewall is easier if the package is installed. This article assumes a system setup with the X-Windows system running as the default.


Figure 1. Integration of Firewall Functions with Linux Kernel Masquerading Techniques (external DNS, mail, Web, FTP, and Telnet servers on the Internet are reached through the masquerading firewall's external network interface; internal clients, DNS, mail, and Web proxy servers sit behind the internal network interface)

Figure 2. Generic Firewall Architecture (the firewall filter sits between the internal network on eth0 and the external network on eth1)

Figure 3. TCP/IP Configuration for the Example Linux Firewall

Firewall
  Hostname: SAPFW
  Internal network: 172.16.0.1/16 (gateway: none)
  External network: 143.166.32.84/24 (gateway: 143.166.227.254)

Client: External Network
  Hostname: GUI01
  External network: 143.166.32.1/24 (gateway: 143.166.227.254)

Server: Internal Network
  Hostname: SAPSRV01
  Internal network: 172.16.1.100/16 (gateway: 172.16.0.1)


After installing the listed packages with the recommended network configuration, the system should have network connectivity on both networks. Figure 6 shows a summary of all network configurations obtained with the ifconfig command.

A Dynamic Host Configuration Protocol (DHCP) server assigns an IP address to the network adapter with the eth1 interface, which is connected to the external network. After setting up both networks on the firewall, use the ping command to test all interfaces for proper connectivity from the internal and external networks to the firewall's external and internal interfaces. All four ping commands should report sent and received packets, proving that both NICs are active and connected to the right networks (see Figure 7).

Build and Test Linux Kernel

The next step is to create the Linux kernel. To begin, open a terminal window, enter the Linux source directory, and call the setup program with:

cd /usr/src/linux
make xconfig

This action opens the kernel configuration with an X-Windows front end, as shown in Figure 8.

To adapt the related items for the firewall, follow the HOWTO-Guide 1.81 for the Linux firewall, available on the Internet at www.linuxdoc.org/ and in the /usr/doc directory of each Linux installation. As an alternative to using the X-Windows configuration, you can edit the related items directly in the /usr/src/linux/.config file. Note: Do not forget the dot at the beginning of the filename. Figure 9 shows the recommended configuration parameters for the Linux firewall.

After saving the new configuration to the .config file, you can start to compile a new kernel enabled for operation in a Linux firewall. First use make clean to remove old related binaries before starting the kernel compilation process. The next step is to create the new dependencies between the various modules with make dep.

Next create the modules containing the drivers, with make modules. Then copy the newly created modules to the right location under /lib/modules with make modules_install. Finally, compile the kernel itself with make bzImage bzlilo.
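Assuming the standard 2.2.x build targets, the whole sequence from a shell is:

cd /usr/src/linux
make clean             # remove old object files
make dep               # rebuild dependencies between modules
make modules           # build the driver modules
make modules_install   # copy the modules under /lib/modules
make bzImage bzlilo    # build the compressed kernel and run lilo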

These steps create two new files under the root directory, called vmlinuz and System.map. The first is the Linux kernel itself, and the second is the symbol map for that kernel, which system tools use to translate kernel addresses into names.

Copy the kernel to the /boot directory and rename it. Do not overwrite the existing file in /boot. Give the new kernel a name, such as vmlinuz-fw. Rename the existing /boot/System.map file, and then replace it with the new file.
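A minimal sketch of those steps (the vmlinuz-fw name follows the suggestion above; adjust paths to your layout):

cp /vmlinuz /boot/vmlinuz-fw              # install the new kernel under its own name
mv /boot/System.map /boot/System.map.old  # keep the old symbol map
cp /System.map /boot/System.map           # replace it with the new one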

Figure 4. Illustrative Network Architecture (external network clients 143.166.13.1 through 143.166.13.5 reach the firewall's external interface eth1 at 143.166.32.84; internal network servers 172.16.5.10, 172.16.5.12, 172.16.5.13, and 172.16.5.28 sit behind the internal interface eth0 at 172.16.0.1)

Figure 5. Recommended Linux File Directory Layout

Device      Mount Point   File System  Mount Options     Dump/fsck Order
/dev/sda1   /             ext2         defaults          1 1
/dev/sda6   /home         ext2         defaults          1 2
/dev/sda8   /tmp          ext2         defaults          1 2
/dev/sda5   /usr          ext2         defaults          1 2
/dev/sda9   /var          ext2         defaults          1 2
/dev/sda7   swap          swap         defaults          0 0
/dev/fd0    /mnt/floppy   ext2         noauto,owner      0 0
/dev/cdrom  /mnt/cdrom    iso9660      noauto,owner,ro   0 0
none        /proc         proc         defaults          0 0
none        /dev/pts      devpts       gid=5,mode=620    0 0


Next, reset the Linux loader for the appropriate boot image. Add a section with the following lines to /etc/lilo.conf, the file handling the configuration:

image=/boot/vmlinuz-fw
label=firewall
read-only
root=/dev/sda1

The root device may be different in the actual setup. Unlike the default Linux kernel, the new kernel does not need a RAM disk image because the PERC-2/SC or PERC-2/DC driver is directly compiled into the kernel.

Finally, activate the new configuration by calling lilo. The result may resemble the following:

[root@sapfw linux]# lilo
Added linux *
Added firewall
Added linux-up

Now test the new kernel by rebooting. At the lilo prompt, enter firewall to boot the new kernel. If you encounter problems, boot the default kernel and debug the new kernel configuration file. You may have to change additional settings or recompile the kernel.

Figure 6. Summary of Network Configurations Obtained with ifconfig

[root@solengfw /root]# ifconfig
eth0  Link encap:Ethernet  HWaddr 00:A0:C9:9D:BA:4F
      inet addr:172.16.0.1  Bcast:172.16.255.255  Mask:255.255.0.0
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:42057 errors:0 dropped:0 overruns:0 frame:0
      TX packets:23555 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:100
      Interrupt:10 Base address:0xecc0

eth1  Link encap:Ethernet  HWaddr 00:C0:4F:9E:DE:83
      inet addr:143.166.32.84  Bcast:143.166.32.255  Mask:255.255.255.0
      UP BROADCAST RUNNING  MTU:1500  Metric:1
      RX packets:1490144 errors:0 dropped:0 overruns:0 frame:0
      TX packets:35825 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:100
      Interrupt:11 Base address:0xec80

lo    Link encap:Local Loopback
      inet addr:127.0.0.1  Mask:255.0.0.0
      UP LOOPBACK RUNNING  MTU:3924  Metric:1
      RX packets:14 errors:0 dropped:0 overruns:0 frame:0
      TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:0

Figure 7. ping Commands for Testing NIC Connections

ping 143.166.32.84   (external interface eth1)
ping 172.16.0.1      (internal interface eth0)
ping 143.166.32.1    (external client)
ping 172.16.1.100    (internal server or client)

Figure 8. Menu for the X-Windows Front End


Figure 9. Configuration Parameters for Linux Firewall

- Development and/or incomplete code/drivers (CONFIG_EXPERIMENTAL) [Y/n/?] YES: Though not required for IP MASQ, this feature allows the kernel to create the MASQ modules and enable the option for port forwarding.

- Enable loadable module support (CONFIG_MODULES) [Y/n/?] YES: This feature allows you to load kernel IP MASQ modules.

- Networking support (CONFIG_NET) [Y/n/?] YES: This feature enables the network subsystem.

- Packet socket (CONFIG_PACKET) [Y/m/n/?] YES: This OPTIONAL, but recommended, feature will allow you to use TCPDUMP to debug any problems with IP MASQ.

- Kernel/User netlink socket (CONFIG_NETLINK) [Y/n/?] YES: This OPTIONAL feature will enable logging of firewall hits.

- Routing messages (CONFIG_RTNETLINK) [Y/n/?] NO: This feature has nothing to do with packet firewall logging.

- Network firewalls (CONFIG_FIREWALL) [Y/n/?] YES: This feature enables the IPCHAINS firewall tool.

- TCP/IP networking (CONFIG_INET) [Y/n/?] YES: This feature enables the TCP/IP protocol.

- IP: advanced router (CONFIG_IP_ADVANCED_ROUTER) [Y/n/?] NO: This feature is required only for CONFIG_IP_ROUTE_VERBOSE and fancy routing (independent of ipchains/masq).

- IP: verbose route monitoring (CONFIG_IP_ROUTE_VERBOSE) [Y/n/?] YES: This feature is helpful for using the routing code to drop and log IP spoofed packets (HIGHLY RECOMMENDED).

- IP: firewalling (CONFIG_IP_FIREWALL) [Y/n/?] YES: This option enables the firewalling feature.

- IP: firewall packet netlink device (CONFIG_IP_FIREWALL_NETLINK) [Y/n/?] YES: This OPTIONAL feature will enhance the logging of firewall hits.

- IP: always defragment (required for masquerading) (CONFIG_IP_ALWAYS_DEFRAG) [Y/n/?] YES: This feature is REQUIRED to ensure that you will be asked about enabling the IP Masquerade and/or Transparent Proxying features. This feature also optimizes IP MASQ connections.

- IP: masquerading (CONFIG_IP_MASQUERADE) [Y/n/?] YES: This feature enables IP MASQ to re-address specific internal-to-external TCP/IP packets.

- IP: Internet Control Message Protocol (ICMP) masquerading (CONFIG_IP_MASQUERADE_ICMP) [Y/n/?] YES: This feature enables support for masquerading ICMP ping packets (ICMP error codes will be MASQed regardless). It is important for troubleshooting connections.

- IP: masquerading special modules support (CONFIG_IP_MASQUERADE_MOD) [Y/n/?] YES: This OPTIONAL feature enables the OPTION that will later enable the TCP/IP port forwarding system to allow external computers to directly connect to specified internal MASQed machines.

- IP: ipautofw masq support (EXPERIMENTAL) (CONFIG_IP_MASQUERADE_IPAUTOFW) [N/y/m/?] NO: IPautofw is a legacy method of port forwarding. It is mainly a hack that is better handled by per-protocol modules. It is NOT recommended.

- IP: ipportfw masq support (EXPERIMENTAL) (CONFIG_IP_MASQUERADE_IPPORTFW) [Y/m/n/?] YES: This feature enables IPPORTFW. With this option, external computers on the Internet can communicate directly to specified internal MASQed machines. This feature is typically used to access internal SMTP, Telnet, and WWW servers. FTP port forwarding will need an additional patch. Additional information on port forwarding is available in the "Forwards" section of the HOWTO.

- IP: ip fwmark masq-forwarding support (EXPERIMENTAL) (CONFIG_IP_MASQUERADE_MFW) [Y/m/n/?] NO: This feature allows IP forwarding directly from IPCHAINS. As of release 2.2.12, this code is EXPERIMENTAL, and the recommended method is to use IPMASQADM and IPPORTFW.

- IP: optimize as router not host (CONFIG_IP_ROUTER) [Y/n/?] YES: This feature optimizes the kernel for the network subsystem, although we do not know whether it makes a significant performance difference.

- IP: GRE tunnels over IP (CONFIG_NET_IPGRE) [N/y/m/?] NO: This OPTIONAL feature enables PPTP and GRE tunnels through the IP MASQ box.

- IP: TCP syncookie support (not enabled by default) (CONFIG_SYN_COOKIES) [Y/n/?] YES: This feature is HIGHLY RECOMMENDED for basic network security.

- Network device support (CONFIG_NETDEVICES) [Y/n/?] YES: This feature enables the Linux Network sublayer.

- Dummy net driver support (CONFIG_DUMMY) [M/n/y/?] YES: This OPTIONAL feature can help with debugging problems.

- /proc file system support (CONFIG_PROC_FS) [Y/n/?] YES: This feature is required to enable the Linux network forwarding system.


Set Up Linux Firewall

Red Hat Linux 6.0 and later versions provide a firewall mechanism called IPFW. It uses IPCHAINS as a firewall administration program, rather than ipfwadm, which was used in the older versions of Linux.

As a firewall administration program, IPCHAINS creates the individual packet filter rules for the input, output, and forwarding chains that compose the firewall. Figure 10 describes the most important IPCHAINS rules.

To work with the operational Linux firewall, the system needs to know the rules to enable the router capabilities. To define and enable the rules automatically on system startup, create the file /etc/rc.d/rc.firewall with the steps listed in Figure 11. This file contains all the steps needed to start and configure the firewall and is executed every time the system boots. In the script, depmod and modprobe load the FTP masquerading module, the two echo lines enable IP forwarding and dynamic address support, ipchains -M -S sets the masquerading timeouts, and the final two rules set the default forward policy to DENY and then masquerade all traffic originating from the internal 172.16.0.0/16 network.

The last steps in setting up the firewall are to execute the script by adding /etc/rc.d/rc.firewall as the last line of the file rc.local and setting the access to rc.firewall as root only (bit mask 700). The initialization script will activate the firewall when the system reboots.
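A minimal sketch of those two steps:

echo "/etc/rc.d/rc.firewall" >> /etc/rc.d/rc.local   # run the script at the end of startup
chmod 700 /etc/rc.d/rc.firewall                      # readable and executable by root only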

The machines on the external network do not need information about the internal network, and transparency is the objective and function of the firewall. However, the internal network clients may need services from the external network and may want to access services behind the firewall. Therefore, the default gateway of the internal network clients should be the firewall's internal network TCP/IP address. Furthermore, the domain name server resolution should contain a name server in the end-user's network to provide access to machine names or domains outside the internal network without directly using the TCP/IP addresses.
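On a Linux client, for example, the default route would point at the firewall's internal address from Figure 3:

/sbin/route add default gw 172.16.0.1   # the firewall's internal interface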

Deploy and Test the Firewall

Once the firewall is operating, use some basic tests to verify functionality. Figure 12 lists the suggested sequence of steps for troubleshooting.

Tests 3 and 4 usually succeed. If they do not, verify that you have switched on ICMP masquerading in the kernel and enabled TCP/IP forwarding in the rc.firewall file. If all four tests succeed, you can assume that masquerading works.
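A quick way to confirm that forwarding is on (the value is set by the echo line in rc.firewall):

cat /proc/sys/net/ipv4/ip_forward   # prints 1 when TCP/IP forwarding is enabled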

You can also verify masquerading with a network monitor on an external network machine. All ICMP packets sent from the internal network should include only the TCP/IP address of the external network firewall NIC.
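With tcpdump (which the packet socket kernel option above makes usable for debugging), the same check can be run on the firewall itself:

tcpdump -i eth1 icmp   # only the firewall's external address should appear as the source of outbound pings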

The Linux Firewall and Dell PowerApp Servers

A Linux firewall is easy to use and configure on Dell PowerApp servers, which are networked devices explicitly designed to provide a dedicated service, such as Web hosting, e-mail, Internet caching, or firewall applications.


Dell PowerApp servers are turnkey devices and can easily be configured to function as low-cost Linux firewalls requiring minimal administration.

Additional Information

For additional information, consult:
- Ziegler, Robert L. Linux Firewalls. New Riders Publishing: 2000.
- Hunt, Craig. Linux Network Servers. Network Press: 1999.
- Siever, Ellen, et al. Linux in a Nutshell. O'Reilly: 1999.

Zafar Mahmood ([email protected]) is a solutions consultant for the Dell Enterprise Systems Group, Solutions Engineering. Zafar came to Dell from Oracle's Worldwide Solutions Support Group and has been involved in database performance optimization, database systems, and LAN implementation for more than five years. Zafar has a Masters degree in Electrical Engineering with a specialization in Computer Communications from the City University of New York.

Figure 10. IPCHAINS Packet Filtering Rules for the Linux Firewall

ipchains -P |INPUT|OUTPUT|FORWARD| sets a default rule regarding the input, output, or forwarding criteria of the firewall. For example:

- ipchains -P forward DENY defines as the default rule that no packets will be forwarded from the internal network to the outside world.
- ipchains -P input DENY defines as the default rule that no packets will be allowed to enter the internal network.
- ipchains -P output DENY defines as the default rule that no packets will be allowed to go out of the internal network.

ipchains -A |INPUT|OUTPUT|FORWARD| -i interface -p protocol -s source-address -d destination-address -j RULE -l appends a rule regarding the input, output, or forwarding criteria of the firewall to the default rules. For example:

- ipchains -A input -i eth1 -s 172.16.0.0/16 -j REJECT -l appends a rule that prevents packets from the outside world from entering the internal network if they arrive at the external interface eth1 containing a source address indicating that they come from the internal network (spoofing). The -l option at the end of the chain logs events relevant to this rule chain to the /var/log/messages file.
- ipchains -A input -i eth1 -s 0/0 67 -d 0/0 68 -p udp -j ACCEPT -l is an input rule that accepts IP address assignments containing source, destination, and port numbers of BOOTP server and client from a DHCP server intended for external interface eth1. The rule also logs each rule event to the /var/log/messages file.
- ipchains -A forward -s 172.16.0.0/16 -j MASQ -l defines the packet-forwarding criteria of the firewall. It states that all packets originating from the internal network should be masqueraded as if they were coming from the external clients. Since the Internet standards prohibit sending packets that have private network addresses on the external networks, IP masquerading is used to disguise those packets as packets with legitimate IP addresses for the external networks.

Figure 11. rc.firewall File

#!/bin/sh
/sbin/depmod -a
/sbin/modprobe ip_masq_ftp
echo "1" > /proc/sys/net/ipv4/ip_forward
echo "1" > /proc/sys/net/ipv4/ip_dynaddr
/sbin/ipchains -A input -i eth1 -s 0/0 67 -d 0/0 68 -p udp -j ACCEPT -l
/sbin/ipchains -M -S 7200 10 60
/sbin/ipchains -P forward DENY
/sbin/ipchains -A forward -s 172.16.0.0/16 -j MASQ -l

Figure 12. Testing Functionality of the Linux Firewall

1. On the firewall itself, ping both its external and internal TCP/IP addresses and then one address in each internal and external network. Positive results show that the network stacks on the firewall are working properly and the network is hooked up. If the external network address is distributed through DHCP, try /sbin/ifconfig to gather the local network addresses.

2. From an internal network client machine, ping its own IP address to test its local functionality and then ping the firewall's internal network IP address to test whether the network connection between the internal network client and the firewall is working properly.

3. From an internal network client machine, ping the firewall's external network IP address and then ping an external client machine's IP address to test the routing of the firewall.

4. Now try the opposite and ping an internal network machine from the external network. The result should be a "destination unreachable" message.

PRODUCT REVIEW

A Look at Eight-Way Server Scalability

The Dell PowerEdge 8450 gives a good bang for the buck. This is particularly true if your server does lots of computationally heavy database transactions, because the eight-way configuration delivers nearly double the performance at roughly double the cost of its four-way counterpart.

By John Bass

Are Intel-based servers that sport eight CPUs worth the money?

These souped-up boxes have been pitched by their makers as offering great gains in performance in the enterprise server market, but historically the scalability of these eight-way machines has been less than stellar. Older eight-way architectures saw only a 30 percent to 60 percent increase in performance, but costs were running about three times as high as users moved up from four-way machines.

To provide an assessment of the current state of eight-way server scalability in the Intel-based server arena, we took a close look at the Dell PowerEdge 8450 eight-way server running Windows NT 4.0 with Service Pack 6 and Windows 2000. We chose Dell because it was the first to deliver a server with support for the Profusion® architecture late last year.

Our test results show that the current Intel Profusion architecture with Pentium® Xeon® processors—as implemented in this Dell machine—scales very well when running Windows NT 4.0 and Windows 2000, depending on the server workload. Our results showed the eight-way configuration doubled the performance of its four-way counterpart in our Internet tests and registered 88 percent better performance in our SQL tests. The PowerEdge 8450 we tested costs approximately double that of Dell's four-way boxes.

These performance gains are realized mainly through improvements in the hardware. Our tests showed very little difference in the eight-processor support from Windows NT 4.0 to Windows 2000. Our testing used the operating system as a platform for the test as opposed to a variable in the scalability mix.

Eight-Way Evolution

Multiway processor architectures have been around for a while. Prior to the 1995 release of the Pentium Pro processor, which was better suited for symmetric multiprocessing (SMP) configurations than its predecessor the Pentium, Unisys®, Sequent®, and NCR® had proprietary SMP architectures that used Pentium processors. These architectures provided more performance from a single server but were expensive. The marketplace begged for a more affordable method for adding processors.

In 1997, vendors began taking a commodity approach to multiway processor architectures. NCR, Axil™, and Corollary® worked on an architecture to support SMP beyond four-way—the upper limit supported by the Pentium Pro at the time. NCR devised its Octascale® architecture and Axil worked on its Northbridge server technology, which pushed the PC SMP limit to eight processors. Corollary decided to wait for the Xeon processor (shipped September 1998) from Intel instead of developing its new Profusion architecture around the Pentium Pro.

Waiting on the Pentium Xeon proved to be a good choice for Corollary because Intel purchased it later in 1997. This was a defining moment in the future of eight-way processing. By 1998, Axil announced it would pare down its operation, and excitement about NCR's Octascale was fading.

Profusion development has not been without its problems. There was a glitch in the chipset in mid-1999 that caused systems to lock up. This problem has been fixed, but it, along with past scalability issues with eight-way systems, has cast a cloud of uncertainty over the scalability of the Profusion architecture. However, with servers based on this architecture hitting the market as a commodity item, we decided they warranted a closer look.

The objective of our testing was to find how well the Profusion chipset scales in terms of CPU performance in Windows NT and Windows 2000 environments. To stress the servers adequately, we sought the most CPU-intensive benchmark tests possible and worked with Quest Software™ to produce Benchmark Factory-based tests that stressed the processor from an Internet and a database point of view. Market research told us that most eight-way machines are primarily used as high-end Web and database servers.

We did not include file tests because our testing proved that beyond a four-processor configuration, the limited number of clients we had in our lab created the bottleneck, as opposed to either the server hardware or software.

Test Results

The Internet tests revealed good scalability of the server while running Windows NT and Windows 2000. We ran each test with four, five, six, seven, and eight processors—first with Windows NT and then with Windows 2000 as the underlying operating system. All processors in the Profusion architecture were 400 MHz CPUs.

We saw 31.95 transactions/sec with four processors and 63.31 transactions/sec with eight processors under Windows NT. This is a 98 percent increase in performance between four- and eight-way processor configurations. The performance increases from four, five, six, seven, and eight processors were virtually equal. This showed that the Profusion chipset in the Dell PowerEdge 8450 scales well under Windows NT in an Internet setting.
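The scaling percentages quoted throughout are simple ratios; for example, the Windows NT figure can be checked with the standard bc calculator:

echo "scale=4; (63.31 / 31.95 - 1) * 100" | bc   # prints 98.1500, i.e., a 98 percent gain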

Figure 1 shows the Windows NT Internet test results and Figure 2 shows the Windows 2000 test results. These results show that the Dell 8450 running either Windows NT or Windows 2000 scales well up to eight CPUs. Figure 3 shows that Windows NT and Windows 2000 were near the theoretical maximum scalability limits for performance. Points greater than theoretical maximum scalability are due to performance enhancements such as server-side caching.

We measured 34.93 transactions/sec with four processors and 73.41 transactions/sec with eight processors under Windows 2000, a 110 percent increase in performance between configurations. The increase of more than 100 percent is probably due to server-side application caching done by the operating system and Web server. There were no unexpected jumps or dips in performance as processors were added.

Note that the absolute performance values of Windows 2000 were 11 percent higher than Windows NT. The SQL Server tests reinforced the results of the Internet tests, but they showed a more realistic degree of scalability from four- to eight-way server processor configurations. The amount of performance increase between configurations is about 88 percent with Windows NT and Windows 2000.

Figure 1. Windows NT Internet Test Results (transactions per second versus load in virtual users, for four through eight CPUs). A steady performance increase as we added processors showed that the Dell 8450 running Windows NT 4.0 scales well up to eight CPUs.

Figure 2. Windows 2000 Internet Test Results (transactions per second versus load in virtual users, for four through eight CPUs). A steady performance increase as we added processors showed that the Dell 8450 running Windows 2000 scales well up to eight CPUs.

We measured 1.15 transactions/sec with the SQL Server test using four processors under Windows NT. For eight processors, we measured 2.16 transactions/sec. This is an 87.8 percent increase in performance. The performance jumps between four, five, six, and seven processor configurations were at or near theoretical values. The performance increase from seven to eight processors was less than the theoretical value, but still acceptable. Our test results showed that the Profusion chipset in the Dell PowerEdge 8450 scales well under Windows NT running SQL Server.

Under Windows 2000, we saw 1.17 transactions/sec with the SQL test with four processors. For eight processors, we measured 2.19 transactions/sec. This is an 87.2 percent performance increase. The performance increases between four, five, six, and seven processors were near the theoretical maximum. The increase from seven to eight processors was less than the theoretical maximum, but still acceptable. Therefore, our tests showed that the Profusion chipset in the Dell PowerEdge 8450 scales well under Windows 2000 running SQL Server.

Figure 4 shows the SQL Server tests with Windows NT. This increase in transactions per second as we added CPUs showed that the Dell 8450 scales well up to eight processors for database applications on Windows NT. Figure 5 shows the SQL Server tests with Windows 2000. Figure 6 shows the overall SQL Server test results for Windows NT and Windows 2000.

During the development of the tests, we found that the less computationally heavy the SQL transactions, the less the performance would scale from four to eight processors. This occurs because the processors must handle more network requests with a less computationally heavy SQL Server load. You must take this issue into account when estimating the benefit an eight-way server might provide your operation. The more computationally strenuous the SQL Server requests, the better the eight-way server will scale.

Figure 3. Overall Internet Test Results (percentage of performance increase over four processors versus number of processors, for Windows NT, Windows 2000, and the theoretical maximum). This graph shows that Windows NT and Windows 2000 were near the theoretical maximum scalability limits for performance. Points greater than theoretical maximum scalability are due to performance enhancements such as server-side caching.

Figure 4. Eight-Way Scalability with Windows NT and SQL Server (transactions per second versus number of virtual users, for four through eight CPUs). The increase in transactions per second as we added CPUs showed that the Dell 8450 scales well up to eight processors for database applications on Windows NT.

Figure 5. Eight-Way Scalability with Windows 2000 and SQL Server (transactions per second versus number of virtual users, for four through eight CPUs). The increase in transactions per second as we added CPUs showed that the Dell 8450 scales well up to eight processors for database applications on Windows 2000.


Conclusions

Our tests showed that the Dell eight-way server configuration scaled well from its four-way sibling in a heavy computational environment. Results will vary depending on the percentage of network and disk I/O in typical server workload patterns. As the disk and network I/O increases, the degree the server will scale will decrease.

This leads to the most important question: Are these Dell eight-way servers worth the money? If your server does lots of computationally heavy database transactions, we say yes, because the eight-way configuration delivers nearly double the performance at roughly double the cost of its four-way counterpart. However, if you need your eight-way machine to complete several hundred network transactions per second, the answer is not so obvious.

John Bass ([email protected]) is the technical director of Centennial Networking Labs (CNL) at North Carolina State University. CNL is a network testing lab that specializes in function and performance testing of networks and network equipment.

Reprinted with permission from Network World, July 3, 2000.

Figure 6. Overall SQL Test Results (percentage of performance increase over four processors versus number of processors, for Windows NT, Windows 2000, and the theoretical maximum). The Dell 8450 showed an 88 to 90 percent improvement in performance when we increased the number of CPUs from four to eight.

SMP SCALABILITY IMPROVEMENT FROM WINDOWS NT TO WINDOWS 2000

Windows NT 4.0 Enterprise Edition and Windows 2000 Advanced Server have eight-processor symmetric multiprocessing (SMP) support, but Microsoft advertises improvements in SMP performance from Windows NT to Windows 2000. Microsoft provides its users with limited information on these improvements other than to say that performance is boosted by the ability of Windows 2000 to map network-interface-card interrupts to a particular processor, and that the new operating system has a more efficient implementation of the Windows NT File System.

Aside from these relatively minor changes in function to improve SMP performance, there is a big change in how Windows 2000 manages memory greater than 4 GB. The new Enterprise Memory Architecture (EMA) improves the operating system's ability to support Intel's Physical Address Extensions (PAE) for 32-bit IA-32-based systems. PAE lets a server use more than 4 GB of RAM. The new Windows 2000 EMA feature should greatly improve on Windows NT performance in systems with large amounts of RAM.


HIGH PERFORMANCE COMPUTING

Design Choices for a Cost-Effective, High-Performance Beowulf Cluster

The article "High-Performance Computing with Beowulf Clusters," Issue 2, 2000, of Power Solutions provided an overview of Beowulf clusters built from commodity computer systems. This article, the first in a series, will focus on the design choices for building a cost-effective, high-performance Beowulf cluster, beginning with the compute nodes used to construct one.

By Jenwei Hsieh, Ph.D.

Beowulf is a concept of clustering commodity computers to form a parallel, virtual supercomputer. It is easy to build a unique Beowulf cluster from components that you consider most appropriate for your applications.

Figure 1 shows the architectural stack of a typical Beowulf cluster. As the figure illustrates, numerous design choices exist for building a Beowulf cluster. No Beowulf cluster is general enough to satisfy the needs of everyone. This article presents some considerations, derived from Dell's experience building several Beowulf clusters and collecting their benchmark results, for selecting compute nodes.

Compute node qualities to consider include processor type, size and speed of Level 2 (L2) cache, number of processors per node, speed of front-side bus (FSB), scalability of memory subsystem, and Peripheral Component Interconnect (PCI) bus speed. Some parallel applications are cache-friendly; that is, the problem can be easily accommodated by L2 cache. Compute nodes with large full-speed L2 cache can boost the performance of these applications. On the other hand, applications with random and wide-range memory access patterns will be more likely to benefit from faster processor speed, system bus, and memory subsystem, rather than large L2 cache.

Performance Characteristics of a Compute Node

Before you can select the compute nodes, you must understand their performance characteristics for compute-intensive applications—one of the typical parallel applications running on a Beowulf cluster. The Hierarchical INTegration (HINT) benchmark program, developed by Dr. John Gustafson and other researchers at the U.S. Department of Energy's Ames Laboratory, is a good tool to help understand the performance characteristics of a compute node.

In June 2000, one of the Dell PowerEdge clusters located at Cornell University was named to the Top500 supercomputer list. Comprised primarily of supercomputers and mainframes, the Top500 list ranks the cluster of 64 Dell PowerEdge 6350 servers located at the Cornell Theory Center (CTC) among the 500 highest performing, most complex supercomputers in the world. This is the first time a system from Dell has been included on the prestigious list.

HINT uses an approximation method to solve an integration problem. The objective of the integration is to obtain the highest quality answer in the least amount of time, for as large a range of times as possible. "Quality" is the reciprocal of the error, which combines precision loss and discretization error.

During the calculation process, the speed is defined as QUality Improvement Per Second (QUIPS), and is measured as a function of time or the memory size of the problem. HINT can be run with any precision of any data type: floating-point, integer, and so on. For this article, we used floating-point double precision (DOUBLE). The results are reported in a graphical representation and show floating-point performance, memory hierarchy, unit-stride memory performance, numerical accuracy, and non-unit-stride memory performance.

Figure 2 shows the performance characteristics of Dell PowerEdge 2400 (with 600 MHz Pentium III processors) and 6350 (with 450 MHz Pentium II Xeon processors) servers when running HINT with one processor and two processors. Toward the left side of the curve, the problem is small and it fits completely in the cache. Hence, the maximum QUIPS is achieved on this side of the curve.

As the problem size increases (moving right), the data requires more memory than the Level 1 (L1) cache and L2 cache can accommodate. Therefore, the boundary between cache and main memory and the corresponding performance dip in the HINT curve is visible. For example, the HINT curve of PowerEdge 2400 with a single processor drops at around 256 KB, which is the size of the L2 cache. The HINT curve of PowerEdge 6350 with dual processors drops at around 4 MB, which is the total amount of L2 cache for two processors.

Comparing the HINT curves of both platforms produces several interesting observations:

- PowerEdge 2400 has better peak performance when the L2 cache can accommodate the problems. This is because of its faster processor, system bus, and memory subsystem.

- PowerEdge 6350 has better performance for those problems larger than the L2 cache in PowerEdge 2400 can handle, but not larger than the L2 cache in PowerEdge 6350 can handle. The range of problem size increases with the number of processors because of the aggregation of L2 cache.

- Comparing the HINT curves of dual processors shows an interesting difference in peak and memory performance between PowerEdge 2400 and 6350. PowerEdge 2400 has 34 percent better peak performance than PowerEdge 6350, but only 19 percent better memory performance, even with its faster FSB speed and memory technology: 133 MHz synchronous dynamic RAM (SDRAM) versus 50 ns EDO dynamic RAM (DRAM). The narrower difference in memory performance indicates that PowerEdge 6350 has a more scalable memory subsystem. Its chipset and the associated memory subsystem are designed to support up to four processors simultaneously.

Performance Characteristics of a Beowulf Cluster

The parallel version of HINT also can be used to study the performance characteristics of a Beowulf cluster, especially those built completely from the same type of compute nodes. Figure 3 shows the HINT curves of a Beowulf cluster built from eight PowerEdge 6350 servers and two versions of SGI™ Origin2000™-series systems.

Figure 1. Architectural Stack of a Typical Beowulf-Class Cluster

Applications:   highly parallel applications
Middleware:     MPI/Pro, MPICH, PVM
OS:             Windows NT/2000, Linux
Protocol:       TCP/IP, VIA, GM
Interconnect:   Fast Ethernet, Giganet, Myrinet
Compute nodes:  PowerEdge servers (such as the PowerEdge 2450 or 6350) or Precision Workstations

Figure 2. HINT Curves of Dell PowerEdge 2400 and 6350 Servers (QUIPS versus memory used on a log scale, from 8 KB to 128 MB, for HINT with DOUBLE run on one and two CPUs of each server)

The Beowulf cluster has 32 Pentium II Xeon processors running at 450 MHz. Each SGI Origin2000 has 32 processors. The faster version uses 250 MHz MIPS R10000 processors, and the slower version uses 200 MHz R10000 processors. The HINT curves show that using the same number of processors, the Dell Beowulf cluster has better peak performance than the slower version of SGI Origin2000, and slightly lower peak performance than the faster version of SGI Origin2000. The HINT curves of these two SGI Origin2000s have wider coverage than the Dell cluster because of their larger L2 cache (4 MB per MIPS processor versus 2 MB per Pentium II Xeon processor).

The HINT curves of Figure 3 also show the memory subsystem performance of these three systems. When the problem size is larger than 200 MB, the Dell Beowulf cluster has better performance. It takes advantage of the independent memory subsystems from eight PowerEdge 6350 servers.

SMP versus Uniprocessor: Performance/Cost Ratio Comparison

Once we obtained a "hint" of a compute node's performance characteristics, the next question was how many processors should we use in each compute node. To answer this question, we employed the well-known Numerical Aerospace Simulation (NAS) Parallel Benchmark (NPB) suite developed by NASA's Ames Research Center.

The NPB suite consists of eight programs derived from computational fluid dynamics (CFD) codes. Each of the eight programs—five kernels and three simulated CFD applications—represents some particular aspect of highly parallel computation for aerophysics applications. The five kernels—EP, FT, MG, CG, and IS—mimic the computational core of different numerical methods used by CFD applications. The simulated CFD applications—LU, SP, and BT—reproduce much of the data movement and computations found in full CFD codes.

Figure 4 shows the performance of the LU program on the previously mentioned Dell Beowulf cluster with eight PowerEdge 6350 servers, each with four 450 MHz Pentium II Xeon processors. The x-axis represents the number of processors used for the benchmarks, in the format of "number of servers" by "number of processors per server." Total performance is represented as bars in the unit of millions of floating-point operations per second (Mflops/sec). The per-processor performance, derived from dividing the total performance by the number of processors, is represented as a curve with the unit of Mflops/sec per processor.

Using the same number of processors, we tried different combinations of the number of servers and the number of processors. For example, for the total of eight processors, we tested three combinations (2 by 4, 4 by 2, and 8 by 1—the three data points on the left-hand side).

The results show that from a per-processor performance point of view, we attain better performance with one processor per server than two processors per server, and two processors per server is also better than four processors per server for compute-intensive applications. Therefore, if your goal is best performance and budget is not a concern, your Beowulf cluster should use uniprocessor compute nodes.

However, budget is a concern for most Beowulf users. The idea of using a Beowulf cluster is to take advantage of its cost-effectiveness. Consider the performance of the NPB from another perspective: performance/cost ratio comparison.

Figure 3. HINT on Dell Beowulf Cluster and SGI Origin2000s (QUIPS versus memory used on a log scale, up to 1 GB, for the Dell Beowulf cluster and the 200 MHz and 250 MHz SGI Origin2000 systems)

Figure 4. Performance of the LU Program (LU Class B: total performance in Mflops/sec shown as bars and per-processor performance shown as a curve, for the 2×4, 4×2, 8×1, 4×4, 8×2, and 8×4 nodes-by-processors configurations)

Figure 5 shows the performance/cost ratio comparison with the NPB programs. The performance/cost ratio of each NPB program is defined as the total performance of one NPB program divided by the total list price of all hardware and software components. The performance/cost ratio of using uniprocessor compute nodes is used as the base. We further divide the ratio of using two- and four-way servers by the base, for comparison.
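As a worked example with hypothetical numbers (not the tested configurations): suppose uniprocessor nodes deliver 1,000 Mflops/sec for a $40,000 total list price, and 2-way nodes deliver 1,800 Mflops/sec for $60,000. The normalized ratio would then be:

echo "scale=4; (1800 / 60000) / (1000 / 40000)" | bc   # prints 1.2000: 20 percent more performance per dollar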

In Figure 5, if the bar of either two- or four-way servers is equal to 1.00, they have the same performance/cost ratio. It does not matter what compute nodes you use for a Beowulf cluster, since you achieve the same performance with the same cost.

On the other hand, if the bars are greater than 1.00, it means that you can achieve better performance with two- or four-way servers with the same budget. With the exception of IS and CG, NPB programs demonstrate a better performance/cost ratio on symmetric multiprocessing (SMP) nodes. That is because you can save the cost for hardware equipment and cluster interconnect with SMP nodes, even though they cannot deliver linear performance scalability compared to uniprocessor nodes.

Impact of L2 Cache and Memory Subsystem

Finally, let us study the impact of L2 cache and memory subsystem on the performance and scalability of a Beowulf cluster. In Figure 2, we used HINT to understand the performance characteristics of two types of compute nodes: PowerEdge 2400 and 6350 servers. We again applied NPB programs to further study the impact of L2 cache and memory subsystem.

The eight NPB benchmark programs have different degrees of cache-friendliness. Figure 6 shows the performance of IS and LU on two Dell Beowulf clusters. One cluster consists of eight PowerEdge 2400s with sixteen 600 MHz Pentium III processors with 256 KB L2 cache; the other cluster has eight PowerEdge 6350s with sixteen 450 MHz Pentium II Xeon processors with 2 MB L2 cache. We selected IS and LU because they are two extreme examples of cache-friendliness; the LU program is the most cache-friendly NPB program, and the IS program is the least cache-friendly.

Figure 6 uses the same representation approach as Figure 4, comparing the total performance and per-processor performance of two Dell Beowulf clusters built from two generations of Intel architecture-based SMP servers. For IS, the cluster of PowerEdge 2400s performs better than the cluster of PowerEdge 6350s. However, for cache-friendly applications such as LU, the cluster of PowerEdge 6350s performs better. The PowerEdge 6350 cluster also shows better scalability when moving from an 8-by-1 configuration to an 8-by-2 configuration, as the per-processor performance curves in Figure 6 show.
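The LU-versus-IS contrast comes down to how well a program's memory accesses reuse the cache. The toy sketch below only gestures at the effect (interpreter overhead dominates in Python, and timings are machine dependent): sequential traversal reuses cache lines, while a random visiting order defeats them.

```python
import random
import time

N = 2_000_000
data = list(range(N))
orders = {
    "sequential": range(N),                # cache-friendly, like LU
    "random": random.sample(range(N), N),  # cache-hostile, like IS
}

for name, order in orders.items():
    start = time.perf_counter()
    total = 0
    for i in order:  # same work, different access pattern
        total += data[i]
    print(f"{name}: {time.perf_counter() - start:.2f} s")
```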

Selecting Compute Nodes for a High-Performance Beowulf Cluster
The first step in building a cost-effective, high-performance Beowulf cluster is to select the proper compute node based on the needs of your applications. This article discussed the performance results from several Dell Beowulf clusters to demonstrate the impact of compute node architecture on cluster performance.

From a performance/cost ratio perspective, Intel-based SMP servers are cost-effective compute nodes. Depending on the memory access pattern of your applications, you should consider the size of the L2 cache and the scalability of the memory subsystem in your Beowulf cluster.

Jenwei Hsieh, Ph.D. ([email protected]) is a member of the Internet Infrastructure Technologies team at Dell. Jenwei is responsible for developing high-speed interconnects as one of the building blocks for the Internet infrastructure and Beowulf clusters. He has published more than 30 technical papers in the areas of multimedia computing and communications, high-speed networking, serial storage interfaces, and distributed network computing. Jenwei has a Ph.D. in Computer Science from the University of Minnesota and a B.E. from Tamkang University in Taiwan.


[Figure 5. Performance/Cost Ratio Comparison with the NPB Programs — ratio (values greater than 1 indicate better value) for uniprocessor, 2-way SMP, and 4-way SMP nodes across the NAS Parallel Benchmark programs LU, SP, IS, FT, BT, MG, and CG]

[Figure 6. Performance of the NPB IS and LU Programs (Class B) on Two Dell Beowulf Clusters — total Mflop/second and per-processor Mflop/second for the 8×1 and 8×2 configurations (number of nodes × number of processors). PE2400T and PE6350T denote total performance; pe2400p and pe6350p denote per-processor performance]



ENTERPRISE MANAGEMENT

Windows 2000 Desktop Deployment at Dell

An enterprise-wide migration to a newer version of an operating system is challenging in any business environment. Dell Computer is upgrading more than 30,000 worldwide Windows clients to the new Windows 2000 environment. This article describes Dell's methodology and offers key insights into the challenges of a mass-production migration environment.

By Max Thoene

With its new power management and plug-and-play features, the Windows 2000 Professional OS enables businesses to lower the total cost of ownership for their desktop environments; Active Directory is already a generally accepted model for managing an information infrastructure. As a result, the momentum in favor of Windows 2000 has been increasing steadily, and nearly every major Windows NT-based enterprise is planning to move to Windows 2000 as soon as possible.

To save companies time and money in this process and help them create better managed desktop infrastructures, Dell has drawn on its strategic relationship with Microsoft and experience with its own internal migration to develop the Dell Windows 2000 Desktop Deployment Package. This product migrates systems running Windows 95, Windows 98, and Windows NT 4.0 to Windows 2000. Developed as an automated, one-click process, the Windows 2000 Desktop Deployment Package is repeatable, sustainable, and scalable to meet the migration needs of businesses of any size, and it can be updated to handle future versions of the OS. The documentation, project plan, and schedule included with the Dell Windows 2000 Desktop Deployment Package can be used as a template for any regional Windows 2000 deployment.

This article describes the Dell migration experience, explains how the Dell Windows 2000 Desktop Deployment Package works, and reviews factors companies should consider before migrating to Windows 2000.

The Dell Migration Experience
Because of its strategic relationship with Microsoft, Dell was a critical partner in the development and testing of the Windows 2000 OS. Dell's internal migration was conducted in two distinct phases. The first phase, completed between September 1998 and February 1999, included a joint development program, a rapid deployment program, and an internal adoption program. This phase focused on the infrastructure side of Windows 2000. It included production of the Domain Name Service (DNS) structure and design of the Active Directory (AD) and Organizational Unit (OU) structure, as well as testing of the production infrastructure. Phase I concluded with the infrastructure deployment project to migrate Dell production server platforms to Windows 2000 Server and Advanced Server.


Phase II, which began in March 2000, focused on the production deployment of Windows 2000 Professional to the desktop and laptop computers of Dell employees. Dell's goals for the deployment were to keep migration costs below the industry average and achieve 25 percent client penetration in the first three months of the deployment period.

Success Factors
In planning the production rollout of the Windows 2000 client to its internal business segments, Dell quickly discovered that the success of the project would depend on the team's ability to:
• Define the scope of the project (that is, to identify, qualify, select, and schedule the existing clients to be migrated from the current OS to Windows 2000)
• Select appropriate deployment methods and apply them efficiently and effectively
• Limit user downtime during the migration process and improve on industry averages for migration times
• Select and integrate internal and external tool sets for efficient use during the migration process

Integrating Customized Solutions
Dell learned many lessons in its deployment project. To give Dell customers the benefit of this experience, Dell decided to develop a repeatable, sustainable, and standardized turnkey solution for migrating from Windows 95, Windows 98, and Windows NT 4.0 to Windows 2000: the Windows 2000 Desktop Deployment Package. To meet mission-critical user and business requirements for the migration, Dell determined that the Package must:
• Provide a user-friendly, Web-enabled process aided by tools, wizards, and online guides
• Be usable during a normal business schedule or after hours to reduce impact on critical business activities
• Require a smaller amount of downtime than other migration alternatives
• Ensure data security and preservation through a storage area network (SAN)

Dell took advantage of the Dell enterprise server environment to develop an end-to-end migration process driven through and controlled by end-user interaction with a secure Web site. The Dell enterprise server environment includes database and application servers, backup tape libraries, and a SAN data environment.

Operating in this environment, Dell was able to design an automated migration process requiring minimal user input. The challenges identified early in the Dell migration project were resolved through proof-of-concept and pilot testing.

Dell's migration strategy used both in-house Dell solutions and vendor applications. Figure 1 summarizes the functions handled by these applications.

With these Dell and third-party solutions, Dell was able to achieve its overall goals for the migration by largely automating the process. Migration time was two to three hours, a 50 percent reduction compared to the four to six hours required for other migration solutions. The average transition cost per system was less than one-half of the original internal forecast. Performing migration activities at night rather than during normal business hours further reduced the impact on end users. Dell completed the first phase of the client deployment two weeks ahead of schedule, also as a result of automation.

The Dell and third-party solutions listed in Figure 1 had the following effects on the success factors defined above:
• Define the scope of the project: SQL queries from human resources, Remedy, applications certification, and premigration databases quickly identify, qualify, select, and schedule migrations, and also handle communications with targeted system owners.
• Select appropriate deployment methods and apply them effectively: Dell's effort to automate the migration process resulted in two methods that use the same tools—Night Migrations (the en masse scenario) and Self-Install (the one-off scenario). In Night Migrations, technicians migrate a large number of co-located systems outside normal working hours. The Self-Install method allows Dell users with some technical background to migrate their own systems to Windows 2000 with little or no outside assistance.
• Limit user downtime during the migration process and improve on industry averages for migration times: With automation, migration time was two to three hours, compared with the four to six hours required by other migration solutions. Night migration further reduces the impact on end users of migrating during regular business hours.





• Select and integrate internal and external tool sets for efficient use: To automate migration, Dell integrated third-party applications with internally developed applications and Dell hardware and knowledge platforms to enhance the overall process.

Risks and Lessons Learned
In migrating its clients to Windows 2000, Dell identified numerous project risks and learned a variety of lessons applicable to other organizations considering a migration, as outlined below.

Risks. Dell documented and mitigated a variety of risks during the migration effort. Four risks seem to be the most prevalent and persistent threats to successful project completion:
1. Application certification. Success depends on ensuring that all third-party and internally developed business-critical applications are identified, tested, and certified as Windows 2000 compliant prior to migrating systems in any business segment.
2. Business partner involvement. Business segment support, involvement, and participation are required to effectively target, qualify, schedule, and migrate systems within a particular business segment.
3. Hardware compatibility. Only systems that meet minimum system component requirements for Windows 2000 compatibility should be migrated. (Up to 30 percent to 50 percent of potential targets may not meet this criterion.) Methods for making systems Windows 2000 compliant and criteria for deciding which systems to upgrade should be determined in advance.
4. Desktop settings. Determining the security context of end users (Administrator, Power User, or User) is necessary to avoid creating security problems in migrating their systems.

Lessons Learned. The lessons learned from Dell's internal deployment include the following best practices:
1. Use a formal program/project management approach. The core project team should be an experienced group of information technology (IT) project or program managers.
2. Involve business segment leadership early and often. The leaders of each business segment should appoint knowledgeable business liaisons for migration who are responsible for defining clear functional requirements and promoting communication between the migration team and the business segments regarding all aspects of project progress.
3. Solicit executive sponsorship. Executive support for project success should be clearly communicated throughout the project life cycle.
4. Encourage cross-functional teaming. Cross-functional teams including representatives from all IT groups (engineering, support, applications development, database management, and the core migration team) as well as other corporate functions (business segments, corporate communications) serve as catalysts for project success.
5. Maintain a strong Web site presence. The Web site is a critical tool for all aspects of the project. It gives users access to information, answers to frequently asked questions, reports on project progress, and end-user documentation and feedback. It also provides tools for migration technicians and self-installers and a central location for reporting and displaying project progress metrics.


DELL AND THIRD-PARTY APPLICATIONS IN THE DELL WINDOWS 2000 DESKTOP DEPLOYMENT PACKAGE

Third-party applications
• Microsoft Systems Management Server (SMS) for hardware compatibility reporting and enterprise software application loading
• Microsoft User State Migration Tool (USMT) for data preservation
• BrioQuery Designer reporting package from Brio as a tool for generating reports from multiple databases
• People Table database from Remedy for system and employee data reporting

Internally developed Dell applications
• ActiveX scripts to perform premigration application certification screening and initiate the migration processes
• Wise Installer scripts to automate BIOS flashes for systems requiring an update prior to migrating to Windows 2000 and for third-party application calls
• Auto-e-mailer to issue migration notifications and instructions to the target users
• Intranet Web portal to post information, end-user education, end-to-end documentation, and premigration, migration, and metrics tool sets
• Applications Certification database to track the status of identified applications and determine their Windows 2000 compliance
• SQL queries to join various databases (premigration, Remedy, human resources, application certification) for scheduling reports and reporting on enterprise status

Figure 1. Dell and Third-Party Applications Used in the Dell Windows 2000 Desktop Deployment Package
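As one illustration of how the SQL queries in the box might tie these databases together, the sketch below joins hypothetical premigration and application-certification tables to qualify migration targets. The schema, column names, and thresholds are invented for illustration; they are not Dell's actual databases.

```python
import sqlite3

# Build two toy tables standing in for the premigration and
# application-certification databases described above.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE premigration (asset_tag TEXT, owner TEXT, ram_mb INT, cpu_mhz INT);
CREATE TABLE app_cert (owner TEXT, app TEXT, win2000_certified INT);
INSERT INTO premigration VALUES ('DT1001', 'amy', 256, 450),
                                ('DT1002', 'raj',  64, 166);
INSERT INTO app_cert VALUES ('amy', 'CRM client', 1), ('raj', 'Legacy ERP', 0);
""")

# Qualify targets: meets minimum hardware and runs only certified apps.
rows = con.execute("""
SELECT p.asset_tag, p.owner
FROM premigration p
WHERE p.ram_mb >= 128 AND p.cpu_mhz >= 200
  AND NOT EXISTS (SELECT 1 FROM app_cert a
                  WHERE a.owner = p.owner AND a.win2000_certified = 0)
""").fetchall()
print(rows)  # [('DT1001', 'amy')]
```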


TEN-STEP MIGRATION PROCESS

1. Target systems for migration based on business requirements and coordinate the migration schedule through a business liaison.
2. Send background information and instructions for premigration activities to targeted users by automated e-mail, and post them on the migration Web site. Users should be informed about expectations and requirements.
3. Check the user's system for hardware and software compatibility. Verify that the system is running the latest BIOS version, and flash the BIOS if necessary.
4. Notify the user of hardware and software incompatibilities, if any.
5. Back up user data, drive mappings, printer mappings, personalized settings, and network login information. Create a complete hard-drive image of each system for redundant backup.
6. Install Windows 2000, create the computer account, and connect to the Windows 2000 domain.
7. Restore user data and desktop settings, such as drive mappings and printer mappings.
8. Install SMS and company-specified applications through the network.
9. Install user-specified third-party applications.
10. Verify that the migration is complete and send a survey to solicit user verification of the migration and feedback on the process.

6. Choose a powerful storage solution. The storage solution is a critical tool to ensure the preservation of end-user data, personal settings, mapped drives, printer settings, shortcuts, favorites, and wallpaper settings. The storage solution must scale to the number of systems to be migrated while maintaining sufficient space to store data and meet business expectations for online and off-line storage if data retrieval becomes necessary.
7. Handle scheduling carefully. Constant attention to logistics is needed to connect the qualified targets and geographic locations while meeting the needs of business segments and migration technicians. Scheduling will require the efforts of one or more full-time staff members, depending on the number of targets.
8. Plan a productive application certification effort. A proactive and continuous effort must be made to identify, catalog, and test both third-party and internally developed applications to certify them as Windows 2000 compliant. Business segments should be the final certification authorities.
9. Use flexible deployment methods. The migration team must be able to change deployment methods to address business needs related to time windows and time-of-day requirements for upgrades.

The Recommended Migration Process
Dell distilled the results of its own migration experience into a set of guidelines for production migration to Windows 2000. Businesses that follow these guidelines and use the


[Figure 2. Windows 2000 Migration Process — flowchart of the 10-step process (steps 1.0 through 10.0): target users for rollout; communicate expectations and requirements to users and send them to the Web site; verify that the system passes validation, with failure reasons messaged to the user so deficiencies can be fixed or the hardware refreshed; initiate data backup; load the Windows 2000 image; initiate data restore; install Windows 2000 applications, SMS, and DellSoft applications; and have the user verify completion and fill out the survey]


Dell Windows 2000 Desktop Deployment Package will save time and money in their migration efforts.

Dell uses a Web-enabled, user-interactive, 10-step process to migrate each client. Figure 2 illustrates the migration process, and the box describes each step. All 10 steps can be completed through the Web using Dell's Windows 2000 Desktop Deployment Package.

The migration team makes business decisions, such as scheduling the migration, and communicates with targeted users. To reduce downtime, migration can be scheduled for periods when targeted users expect to be out of their offices, possibly during off-business hours. The ability to migrate many users at once simplifies the task of preparing employees for the migration and helping them adjust to the new OS.

Details of the Windows 2000 Desktop Deployment Package
Dell's Windows 2000 Desktop Deployment Package is a total Web-based solution using portal technology. Figure 3 shows the portal for the Package.

The Package provides an integrated suite of Web-based tools built on an advanced infrastructure that is transparent to the user. The user community for the Package comprises both basic users (who require assistance in migrating) and advanced users (who can install through a Web site, guided by wizards and documentation).

Infrastructure
The Dell Windows 2000 Desktop Deployment Package uses a highly available, high-performance, three-tier architecture, as illustrated in Figure 4. This architecture results in centralized management; increased bandwidth, data integrity, and storage space; and increased reliability compared to Windows NT 4.0. The architecture is scalable to any size business and can be easily expanded.

Client Hardware Requirements
The minimum hardware requirements for a successful upgrade are listed below:
• 200+ MHz Pentium processor
• 128+ MB RAM
• 4+ GB hard drive
• Floppy drive or CD-ROM drive for installation

Systems that do not meet the minimum requirements should be upgraded or replaced before migration.
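A premigration check against these minimums is mechanical; the sketch below returns the reasons a system fails validation, which maps to the failure notification in step 4 of the migration process. The inventory fields are hypothetical stand-ins for what Dell gathered through SMS hardware inventory.

```python
# Minimum requirements from the list above.
MINIMUMS = {"cpu_mhz": 200, "ram_mb": 128, "disk_gb": 4}

def deficiencies(inventory):
    """Return human-readable reasons a system fails Windows 2000 validation."""
    problems = [f"{key}: {inventory.get(key, 0)} is below the required {minimum}"
                for key, minimum in MINIMUMS.items()
                if inventory.get(key, 0) < minimum]
    if not inventory.get("has_boot_media_drive", False):  # floppy or CD-ROM
        problems.append("no floppy or CD-ROM drive for installation")
    return problems

print(deficiencies({"cpu_mhz": 166, "ram_mb": 128, "disk_gb": 6,
                    "has_boot_media_drive": True}))
# ['cpu_mhz: 166 is below the required 200']
```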

Dell SAN Architecture
The Dell enterprise server environment includes database and application servers, backup tape libraries, and a SAN data environment. The Dell SAN architecture:


Figure 3. Windows 2000 Desktop Deployment Package Web Page

[Figure 4. Windows 2000 Migration Architecture — three tiers connected over the intranet and Gigabit Ethernet: client/end-user Web interaction (client IE browser); Web-based applications (Dell Web server on PowerEdge 6350 servers); and a back-end storage solution for online and off-line storage (PowerVault 51F Fibre Channel switches, PowerVault 35F Fibre Channel bridges, PowerVault 650F and 630F processors and storage, and PowerVault 130T DLT libraries)]


• Scales easily—it can be duplicated for different regions, and migration scripts can be load balanced
• Includes two terabytes of total storage for user data and system images
• Includes one-quarter terabyte of storage for quick restores from tape

Issues to Consider Before Migrating
Migrating to Windows 2000 on a large scale requires preparation beyond the capacity of many companies' IT departments. Without the proper assessment, planning, and execution, the intended benefits may be delayed or diminished. Furthermore, if technical resources become strained, the disruptions to the company's critical business can add unforeseen costs to the project.

Preparation can reduce the risks associated with migrating users to a new OS. Companies can use the tools provided with Dell's Windows 2000 Desktop Deployment Package to identify needs and manage known risk factors before migrating users to the new operating system, and they can leverage Dell's knowledge to optimize the migration. The Web-based tools provided with the Dell Windows 2000 Desktop Deployment Package can help businesses manage the known risks listed in the box.

Cost-Effective Automated Deployment
Dell Computer is deploying Microsoft's new Windows 2000 Professional OS to its employees' personal computers more quickly and less expensively than anticipated. To facilitate the deployment, Dell developed an automated process that includes Web-based tools enabling employees to automatically test their PC configurations for hardware and applications compatibility with the new OS. This process helped Dell achieve its initial deployment objectives two weeks ahead of its own aggressive schedule, with minimal inconvenience to employees. Installation times were less than one-half of Dell's original estimates, and the average transition cost per system was significantly lower than Dell's internal forecast.

This experience at Dell demonstrates that companies can rapidly and cost-effectively deploy Windows 2000 Professional in a complex environment. The tremendous benefits associated with Windows 2000 Professional, including improved reliability, stability, and performance across the enterprise, provide a powerful incentive for Dell customers to make this transition as quickly as possible. Dell expects the industry transition to Windows 2000 to accelerate during the second half of this year.

To assist its customers in the transition, Dell is offering the Dell Windows 2000 Desktop Deployment Package. This integrated, user-friendly suite of Web-based tools provides a repeatable, sustainable, and scalable way to meet the migration needs of businesses of any size, and it can be updated to handle future versions of the OS. The documentation, project plan, and schedule provided with the Package can be used as a template for any Windows 2000 deployment. For additional information on the Windows 2000 Desktop Deployment Package, contact Dell Technology Consulting.

Max Thoene ([email protected]) is a program manager in Dell Technology Consulting at Dell Computer Corporation. He was program manager, product manager, and logistics manager in IT Operations for the Windows 2000 client project at Dell. Max has a B.B.A. in Marketing/Management from the University of Texas at Austin and M.S. degrees in Systems Management and Information and Telecommunications Systems from Capitol College in Laurel, Maryland.


KNOWN MIGRATION RISKS

Certification of Applications
• Track the status of third-party and Dell-developed applications with the Application Certification Database
• Repackage installation programs by using Microsoft SMS
• Scan for standard software applications and enforce software licensing requirements

Desktop Settings
• Establish a standard desktop environment
• Retain current user desktop settings and personalization

Hardware Compatibility
• Scan for minimum hardware requirements and match against the hardware compatibility list
• Generate reports for asset management to plan for hardware upgrades

Business Involvement
• Monitor migration progress
• Provide Web-based training and documentation





ENTERPRISE MANAGEMENT

Start Here…for Fast, Reliable Operating System Installation

For many PowerEdge server administrators, the Server Assistant CD is already a familiar utility that eases the often cumbersome process of installing an operating system (OS). This article describes the features and improvements in the latest Server Assistant CD that further enhance and streamline the OS installation process.

By Geoff Meyer

The Dell OpenManage Server Assistant CD (see Figure 1) provides utilities and functions for getting your PowerEdge server up and running. Historically, the purpose of the Server Assistant CD has been to deliver the most current Dell-optimized drivers for PowerEdge servers. In addition, the most up-to-date Dell-optimized drivers can always be found at www.dell.com/support. Although the Dell-optimized drivers are available from both of these sources, they are not always available on the operating system (OS) CD provided by the OS vendor: Microsoft, Novell, or Red Hat.

For optimal operation of Dell PowerEdge servers, Dell recommends using the device drivers delivered on the Server Assistant CD instead of the native drivers on the CD supplied by the OS vendor. However, Dell PowerEdge customers will face a cumbersome installation process if they try to install the Server Assistant device drivers with the installation tools provided by the OS vendor, because the various OS vendors' installation tools are geared toward installing the drivers on the OS CD. By using the Server Assistant CD to facilitate the OS installation, PowerEdge customers can install the OS in significantly less time than would be required to install it using the manual setup programs provided by the OS vendors. Administrators who are either working with a system in which the OS was not pre-installed by Dell or embarking on an OS reinstallation should Start Here by using the OpenManage Server Assistant 6.0 CD.

Figure 1. The Dell OpenManage Server Assistant CD

The Latest Server Assistant Release
Server Assistant 5.x and previous versions were based on an embedded Microsoft Windows 95 operating environment. OpenManage Server Assistant Version 6.0 has been re-architected to use an embedded Microsoft Windows NT Embedded (NTE) operating environment. The move from a PC-based Windows 95 environment to a server-oriented Windows NTE environment provided the foundation for making numerous improvements to Server Assistant 6.0. The more server-oriented


NTE OS offered increased tools and additional capabilities for Server Assistant without the need for a significant facelift of the Server Assistant user interface. Server Assistant 6.0 has not only introduced feature enhancements that eliminated installation errors, but also improved the performance of OS installations by almost 30 percent in several server and OS configurations.

One way of looking at the upgrade is to imagine Server Assistant as having undergone an engine overhaul. The changes between releases 5.0 and 6.0 can be likened to the way major automobile manufacturers customize cars for police departments. While the police cars (6.0) are superficially similar—except for the lights on top—to their consumer models (5.0), the modifications hidden under the hood give the police cars extra horsepower and an edge in high-speed chases (and OS installations).

Simplified Server Setup
The primary feature of Server Assistant, the Server Setup option, provides two benefits. First, it minimizes user errors during the OS installation process, shown in Figure 2. Through a combination of hardware detection, user input, and built-in validation, Server Setup virtually eliminates installation errors for Windows NT, Windows 2000, Linux, and NetWare. Second, it shortens the amount of time required to install, or reinstall, the OS. The majority of the feature improvements in Server Assistant 6.0 focus on streamlining the server setup process.

Reduced OS Installation Time
With the availability of more tools and drivers, Server Assistant 6.0 can perform all of its functions from within the Windows environment, without having to reboot the server. This is the most significant enhancement to the improved performance of Server Assistant 6.0.

In previous versions of Server Assistant, installing the Windows NT 4.0 OS required between 33 and 60 minutes. Server Assistant 6.0, on the other hand, requires only approximately 20 minutes. Even for the most seasoned administrators, this reduced installation time makes Server Assistant 6.0 a compelling alternative to the manual, F6-required installation process for Windows NT and Windows 2000. Figure 3 compares the performance of different Dell OpenManage Server Assistant versions.

Automated Hardware Detection and Driver Installation
Server Assistant has always provided the most current Dell-optimized drivers for a RAID controller, network interface card (NIC), or other peripheral. But previous releases have been unable to detect the peripherals that were actually installed on the server. Consequently, Server Assistant releases prior to 6.0 relied on some user intervention to determine the appropriate driver required by these peripherals. Therefore, the administrator received a prompt during the OS setup process for the actual video, NIC, and other drivers. This left the setup process susceptible to inadvertent errors introduced by the administrator. Of course, the PowerEdge administrator


[Figure 2. Server Assistant 6.0 Server Setup Process — flowchart: the server boots to the Server Assistant CD; the NTE OS and dsa.exe load from the CD into memory; the server initiates NTE and executes the memory-resident dsa.exe; Server Assistant detects the hardware configuration and identifies the required drivers; the user configures the boot partition via Express RAID setup; the user specifies the server OS installation parameters, which Server Assistant saves to the OS install parameter file; Server Assistant formats the hard drive, writes the utility partition, copies the appropriate Dell-optimized drivers and the OS installation parameter file to the hard drive, and initiates the OS setup routine for Windows NT, Windows 2000, Linux, or NetWare; the setup routine performs the OS setup process; and Server Assistant installs the Service Pack (Windows NT only)]



was then required to know specific details of the server configuration, such as NIC vendor and model numbers.

Server Assistant 6.0 automates this entire process. It also eliminates the need for the user to know the hardware configuration and removes the interrupting prompt during the OS setup. The ability of Server Assistant to detect and install the appropriate drivers eliminates guesswork, further improves the OS installation process, and, most importantly, eliminates potential user-introduced errors.

Simplified Setup with Windows-based RAID Console
Server Assistant 6.0 simplifies the setup of the OS boot partition on RAID-based systems. Previous versions of Server Assistant on PowerEdge servers required the use of either the BIOS or a DOS-based configuration utility to preconfigure the RAID containers for the specific RAID card configured in that server. As a result, the user experience during server setup and OS installation could vary significantly depending on which RAID card was installed in the system.

Because Server Assistant 6.0 is now based on the Windows NT OS, the Dell OpenManage Array Manager utility, shown in Figure 4, has been integrated into the operating environment. From the System Tools menu within the Server Assistant user interface, the administrator can now create, modify, and delete virtual disks prior to installing the OS on the PowerEdge server.

Simplified Configuration with RAID Setup Wizard
With the Dell OpenManage Array Manager integrated into the Server Assistant environment, the next step was to simplify the configuration of a RAID boot partition for the masses, especially for administrators configuring servers equipped with RAID On Mother Board (an embedded RAID controller on the server's system board). The RAID setup wizard, shown in Figure 5, provides a single screen with three preconfigured options (with online descriptions) that standardizes and simplifies the RAID boot partition configuration.

Customizable Boot Partition Sizes
Previous versions of Server Assistant enabled the administrator to create a boot partition size greater than 2 GB. Users had no granularity options except to extend the partition to the maximum size allowed by the OS and the hard drive. Therefore, Server Assistant automatically assigned the full partition of the specified disk as the boot partition.

Server Assistant 6.0, however, can detect and then calculate the valid range for the boot partition. This allows the administrator to select any size within that range (see Figure 6). The default size is now 4 GB for boot partitions.
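The range calculation amounts to clamping the administrator's choice between a floor and the capacity of the specified disk. A minimal sketch, assuming a 2 GB floor (the figure cited above for earlier releases) and the new 4 GB default; Server Assistant's internal logic is not public, so this is illustrative only.

```python
def boot_partition_choices(disk_gb):
    """Return (min, default, max) boot partition sizes in GB for one disk."""
    minimum = 2                # assumed floor, per the 2 GB figure above
    maximum = disk_gb          # up to the full specified disk
    default = min(4, maximum)  # 4 GB default, clamped to the disk size
    return minimum, default, maximum

print(boot_partition_choices(9))  # (2, 4, 9)
```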


[Figure 3. Performance Comparisons for Different Dell OpenManage Server Assistant Versions — bar chart of OS installation times (h:mm:ss) for Windows NT 4.0, Windows 2000, Red Hat Linux, and NetWare 5.x, each installed either manually or with Server Assistant 5.4 Express Setup versus Server Assistant 6.0 Server Setup; recorded times include 0:33:23, 0:21:08, 0:44:34, 0:48:43, 0:24:26, 0:27:53, 0:37:15, and 0:31:16]

Figure 4. The Dell OpenManage Array Manager Utility

Figure 5. RAID Setup Wizard


Faster OS Setup with Time Zone/System Clock Wizards
The OS setup program typically sets the system clock and time zone for a server. Because this capability is provided within the Server Assistant process, users respond to all of the server setup and OS installation prompts specific to their configuration during the Server Assistant interview. Virtually all of the user interaction that normally occurs during the OS setup is now suppressed because Server Assistant has scripted all the required information.

Service Packs on the CD
In the continuing effort to streamline the OS installation process, the Server Assistant CD now includes Windows NT 4.0 Service Packs. Service Packs are now common for Windows NT 4.0, and their availability on the Server Assistant CD saves the administrator the step of downloading them from the Microsoft support site. The CD includes English versions of Service Packs 5 and 6.

Supported Operating Systems
Server Assistant supports installation of the following operating systems on PowerEdge servers:
• Windows NT 4.0 Server
• Windows NT 4.0, Enterprise Edition
• Windows 2000 Server
• Windows 2000 Advanced Server
• Novell NetWare 4.2
• Novell NetWare 5.x
• Red Hat Linux 6.2 SBE2

The Future of Server Assistant
Dell OpenManage Server Assistant continues to improve the process of installing PowerEdge servers. The latest release—6.0—delivers added RAID configuration, support for configurable boot partitions, and reduced time to install the OS.

The next release of Server Assistant is already on the drawing board. Upcoming releases of Server Assistant will feature replication and unattended installation capabilities. Support for replication and unattended installation eases the OS installation burden when installing 1 to n servers of the same type and configuration. Using these capabilities, the administrator would first complete the Server Assistant interview process. Then, rather than invoke the setup and OS install process, Server Assistant would save the setup parameters for systems 2 through n.
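In other words, one interview could fan out to many identical servers. The sketch below illustrates the idea with a hypothetical JSON parameter file; the real Server Assistant parameter-file format is not documented here, so the field names are invented.

```python
import json

def save_setup_parameters(path, params):
    """Persist one interview's answers so later installs can reuse them."""
    with open(path, "w") as f:
        json.dump(params, f, indent=2)

def replicate(params, hostnames):
    """Clone the answers for servers 2 through n, varying only the hostname."""
    return [dict(params, hostname=h) for h in hostnames]

answers = {"os": "Windows 2000 Server", "boot_partition_gb": 4,
           "raid_option": "RAID-1", "time_zone": "US/Central"}
for cfg in replicate(answers, [f"web{i:02d}" for i in range(1, 4)]):
    save_setup_parameters(f"{cfg['hostname']}.json", cfg)
```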

Additionally, Server Assistant will extend the server setup process to include Dell OpenManage application installation. The server setup interview will also include the option to install Dell OpenManage agents and software applications.

Future releases will continue improvements in the installation process, providing Dell customers with one of the best installation experiences in the industry.

Geoff Meyer ([email protected]) is a software development manager in the Systems Management Development group at Dell Computer Corporation. In addition to the Server Assistant program, Geoff manages the software development efforts of the Version Assistant and OpenManage Software Installation programs.


Figure 6. Customizing the Boot Partition Size with Server Assistant 6.0

Red Hat Linux—Factory Installed on PowerEdge Servers

Dell offers Red Hat Linux 6.2 as one of three strategic operating system environments (in addition to Windows NT/Windows 2000 and NetWare) for PowerEdge servers and PowerApp appliances. It is factory installed on PowerEdge servers.

Linux provides a cost-effective, optimal operating environment for the front end of an Internet infrastructure using PowerEdge servers and/or PowerApp server appliances. All network interface cards (NICs), SCSI RAID controllers, and SCSI storage offered by Dell support Red Hat Linux.

For additional information, visit www.dell.com/linux, www.redhat.com, or www.linux.org.



ENTERPRISE MANAGEMENT

Windows NT Performance Monitor—A Practical Approach

The Dell Application Solution Center uses several workload and benchmarking tools to stress, monitor, analyze, and tune solutions during customer and independent software vendor engagements. This article describes how to get the most out of one of their more popular choices, Windows NT Performance Monitor (Perfmon).

By Paul Del Vecchio

Windows NT Performance Monitor (Perfmon) may be the single most powerful—yet most underutilized feature of the Windows NT operating system. Perfmon is widely used by the performance benchmarking community, which relies on the tool daily to identify system and application-level performance issues. However, most end users either ignore the tool completely or fail to make full use of its tremendous potential. IT professionals often fail to fully realize the value Perfmon can add by enabling them to perform advanced performance analysis on their Windows NT infrastructures.

Note: This article describes the operation of Perfmon in a Windows NT context. According to Microsoft, "Perfmon.exe in Windows 2000 looks and operates like Performance Monitor in Windows NT."

Getting Started with Perfmon
To launch Perfmon, simply type "start perfmon" at the command prompt. You can also invoke Perfmon by navigating to Start Menu –> Programs –> Administrative Tools –> Performance Monitor or Start Menu –> Run –> Perfmon.

After you have launched Perfmon, you must select a target system, or computer, to monitor. This system can be the localhost from which you launched Perfmon or another Windows NT system on the local network. Because of the overhead associated with running Perfmon, it is recommended that Perfmon run on a remote computer while scrutinizing production servers for performance issues across a network. To select a system, click on the "plus sign" icon in the Perfmon toolbar. This action invokes a network browser showing localhost as the default monitoring target.

After you have selected the system you wish to monitor, choose an object, or subsystem, to monitor. These include such components as system, memory, network interfaces, or disk I/O subsystems. Next, choose the counters you wish to monitor. A counter is a quantitatively measurable performance indicator for a subsystem, such as the percent CPU time, pages in and out per second for


virtual memory, or packets sent and received per second for the network interface. Figure 1 shows the Perfmon dialog box for selecting hosts, objects, and counters.

After selecting the target system, its objects, and counters to monitor, you can run Perfmon in any of three modes:
• Chart mode—graphically displays information on the counters of a selected object and allows you to monitor the system in real time
• Log mode—collects all counter data from selected objects, allowing you to analyze the logged data later
• Report mode—displays raw data from counters that can be easily imported into spreadsheets such as Microsoft Excel for the creation of graphs and charts

Getting the Most Out of Perfmon
A common mistake among Perfmon users is failing to fully utilize the tool. Of the three modes, most users run Perfmon in the default mode (chart) and never explore what the other two modes do. Charting in real-time mode poses two significant limitations: first, data is not captured, but merely viewed for a short time before it simply falls off the display; and second, detailed analysis of on-screen data in real time is difficult, if not impossible. The Log and Report modes overcome these limitations by allowing the user to capture the data rather than simply viewing it for a limited time interval.

Chart Mode Provides a Quick Glance
In chart mode, Perfmon displays information on a selected object's counters in either a horizontal graph (default) or a vertical histogram. The default graph view is usually adequate; however, the histogram may be useful for comparing the values of many instances, such as the thread state of many threads belonging to a single process. The horizontal graph fails to present this type of comparison in a way that is easily digestible and understandable to the user.

In chart mode with the vertical histogram view, you can adjust the sampling rate of Perfmon by either increasing or decreasing the rate at which the tool samples and displays data. The default rate is one second; that is to say, samples are taken and displayed at one-second intervals. The vertical histogram offers only real-time information from the last sample taken. Once the counter changes value, the previous data is gone and cannot be retrieved: Perfmon does record an average, but over a limited period of time.

In chart mode with the horizontal graph view, you can monitor the levels of specific counters in real time and also receive history information. The amount of history available is extremely limited, however, because the display size is relatively small.

In either case, chart mode is not intended to capture data. Its main purpose is to give the user a quick glance at a few metrics over a short period; it is not intended to provide in-depth monitoring capability over time.

Log Mode Captures Information
In chart mode, you can monitor the system in real time. Even if you decrease the sample rate from the default of one second, you are not capturing information. To properly analyze system performance, you must log the data from the counters for analysis later.

To log counter data, follow these steps:
1. Bring up the log dialog box by selecting the "disk" icon on the Perfmon toolbar.
2. Select the "plus sign" icon on the toolbar to again display the browser, which allows you to select both the host and the objects you wish to capture.
3. From the Options menu in the Perfmon menu bar, select "Log."
4. In the log dialog box, name your log file and specify the path you wish to store it in. Specify a sample rate, then save the log file.
5. From the "plus sign" icon on the toolbar, select the objects you wish to monitor. All of the counters associated with a particular object will be captured and available for analysis later.
6. Select "Start Log" in the log dialog box to start logging system performance. Once the log has been started, you will notice the status of the log change from "closed" to "collecting." This means Perfmon has begun collecting object and counter information and is saving the data to your log file. You can return to the chart mode while your log is collecting information, to view some counters of interest to you in real time. The disk icon appears in the lower right-hand corner of the chart along with the log's current size, to remind the user that a logging session is currently underway and that the log will continue to grow in size until the user stops the collection process.


Figure 1. Perfmon Host, Object, and Counter Selection Dialog Box


7. Select the log dialog box and click "Stop Log" to halt the collection process. Your log file is now ready for immediate analysis in the chart mode. To view its contents, simply select "Chart" from the drop-down menu on the toolbar and select "Data From." The choices are "current activity" (real time) or "log file," in which case a file browser dialog will appear, enabling the user to browse the system for the newly created Perfmon log.

To obtain the greatest benefit, you should log several objects on a particular system, depending on what application is running and what subsystems are involved in moving and processing data. For example, Web server applications tend to stress processor and network resources, but not memory or disk resources. Database applications tend to stress processor, memory, and disk resources, but rarely network resources. Messaging applications tend to stress all of the above.

You should begin in log mode by monitoring more objects rather than fewer, at least until you have determined that particular subsystems are not at risk for potential performance issues.

After capturing the right data by logging the appropriate objects on a particular system, you can move to the analysis stage. To help narrow your area of focus, you should analyze the counters of a single object. Do not bring up counters across multiple objects for analysis until you have properly analyzed each object separately and documented all your findings (data).

Because Perfmon was written with hardware and software subsystems in mind, it is logically organized to lend itself to views and presentations sorted by subsystem type. For example, the System object enables you to monitor counters such as % Processor Time, % Privileged Time, System Calls/sec, Interrupts/sec, and so on. These counters are all related to hardware and software system-level resources. The Memory object allows you to scrutinize counters such as the paging file reads or writes per second, memory utilization, and cache faults per second.

When logging, choosing the right objects to capture is not your sole consideration. It is important to choose a sample rate that best suits your needs. The log file size will also depend upon the sample interval, as well as the number


ADDING HARDWARE WAS NOT ENOUGH: A PERSONAL EXPERIENCE WITH PERFORMANCE MONITORING

One particular software vendor engaged the ASC in an effort to optimize its messaging application written for Windows NT 4.0. The application was a messaging server, compliant with Simple Mail Transfer Protocol (SMTP), Post Office Protocol 3 (POP3), and Internet Message Access Protocol 4 (IMAP4), designed to send and receive e-mail within a corporate IT infrastructure and over the Internet.

While testing the application, we ran a messaging benchmark against the server to generate stresses typical of messaging server workloads. We used Mailstone 2.0, one of the premier messaging benchmarks at that time. After collecting Perfmon data via logging mode, we scrutinized the logs with the chart mode.

After reviewing the logs, we noted a potential disk I/O performance problem: specifically, the transfer latencies and disk queue length counters (sec/transfer and average disk queue length) under the logical disk object. By examining these counters, we determined the disk subsystem was saturated. As requests from the application made their way through the OS to the disk I/O subsystem (SCSI device driver), the time to process a request by the subsystem steadily increased past acceptable limits.

After determining that a disk I/O bottleneck existed, our first instinct was to add more spindles (physical disk drives) to the RAID (Redundant Array of Independent Disks) array that contained the messaging stores. We believed this strategy would help to reduce the growing transfer latencies during our tests. The RAID array was already properly tuned to deal with the small, random nature of the I/O requests sent by the messaging application. Regardless of the robustness of the I/O subsystem we put in place, the system continued to suffer from severe disk transfer latency issues. We decided to look further into the I/O subsystem of the application architecture rather than simply adding more hardware.

We used a file system tool called NT Filemon, developed by a group known then as NT Internals, now SysInternals Group. Filemon is a lower level tool than Perfmon (it provides more detailed information) and enables the user to measure which processes interact with the Windows NT file system (NTFS), at what rate, and for what duration. We decided to measure the file system hits that resulted from sending a single mail message. This seemed like a good first test.

Instead of capturing the expected four to five hits on the file system per mail message, to our surprise we found that our mail message interacted with NTFS more than 30 times. The developers worked quickly to correct the issue in their application's I/O architecture. They modified the code to use main memory, in the form of caches, for work that had previously used NTFS to temporarily store message header, data, and error control information. The real solution lay in modifying software, not hardware, and enabled us to break through the disk I/O bottleneck.


of objects you monitor. If you are monitoring a production server for performance issues over a normal workday (8 to 10 hours), your sample interval should be somewhat longer than if you were running a 10-minute benchmarking test.

A fairly frequent sample interval, such as 5, 10, or 15 seconds, captures specific events and produces very granular and detailed information. It also generates a lot of data, and therefore can lead to massive log files when you sample systems over long periods of time. Since disk space is sometimes a scarce resource, you might want to take this into consideration.

If you log with sample intervals that are too long, on the other hand, you may miss crucial system events or transitions. For benchmarking scenarios, in which tests rarely last longer than 30 to 60 minutes, a log file sample interval between 5 and 15 seconds is ideal. For longer tests, or for monitoring production servers over several hours, you should bump the interval to a number between 15 and 30 seconds, and in some cases to one minute or greater. There is no substitute for experience, so experiment often to determine which interval is best for your particular needs.
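A quick back-of-the-envelope estimate helps when weighing interval against log size. The per-sample record size below is an illustrative assumption (the true size depends on how many objects and counters you log), but the arithmetic is the point.

```python
BYTES_PER_SAMPLE = 2048  # assumed record size; grows with objects/counters

def log_size_mb(duration_hours, interval_seconds):
    """Estimate Perfmon log size for a monitoring session."""
    samples = duration_hours * 3600 / interval_seconds
    return samples * BYTES_PER_SAMPLE / 2**20

print(f"{log_size_mb(0.5, 5):.1f} MB")  # 30-minute benchmark at 5 s
print(f"{log_size_mb(10, 5):.1f} MB")   # 10-hour workday at 5 s
print(f"{log_size_mb(10, 30):.1f} MB")  # 10-hour workday at 30 s
```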

To end the logging process, return to the log dialog box and select "Stop." The log file then closes and is ready for analysis in chart mode. Figure 2 shows the log dialog box for selecting the sample rate, log path, and filename.

Reporting Mode
The reporting mode simply shows raw data as taken from Perfmon. This enables the user to import the data into a spreadsheet program for analysis, creation of charts or graphs, and so forth. Its purpose is primarily to enable the formatting of the data collected, not the collection of it.

When using the reporting function, the raw data from object counters is displayed in a table-like format. The reporting mode lends itself to documentation better than charting but makes visual comparison difficult and relationships harder to recognize.
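Once report-mode data has been exported as delimited text, the spreadsheet-style analysis can also be scripted. A minimal sketch, assuming a hypothetical CSV export with one column per counter (the filename and column names are placeholders):

```python
import csv
from statistics import mean

with open("perfmon_export.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Summarize a few counters of interest: average and peak over the session.
for counter in ("% CPU Time", "Pages/sec", "Average Disk Queue Length"):
    values = [float(r[counter]) for r in rows if r.get(counter)]
    if values:
        print(f"{counter}: avg {mean(values):.1f}, peak {max(values):.1f}")
```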

Combine All Perfmon Modes for Greater Benefit
You should use all three modes together to attain the full benefit of Perfmon. The log mode is the most useful for capturing data for detailed analysis in the chart mode. The report mode is most useful for viewing and formatting raw data. The input of the chart can be toggled between the current system activity and data from a log file captured earlier.

Proceeding to Analysis
After properly logging with Perfmon, you can begin analysis in the chart mode. The best method is to select each object individually, analyze it, write down all the displayed data, and determine if a performance problem exists. Figure 3 shows some typical Perfmon objects and associated counters that Dell ASC engineers often monitor in the Dell Application Solution Center.

You can begin the analysis stage by pulling up Perfmon in the chart mode and choosing the "Data From" option. This action selects your log file as the input, or source. Next, begin analysis by adding counters from a single object; you


Figure 2. Perfmon Log Dialog Box for Selecting Sample Rate, Log Path, and Filename

TESTING AND TUNING AT THE DELL APPLICATION SOLUTION CENTER

The Dell Application Solution Center (ASC) is a laboratory environment designed to help customers and software vendors test and tune their applications and solutions before they go live. Dell ASC engineers are specifically trained in the performance-tuning disciplines and adhere to a strict process methodology.

The heart of the Dell ASC methodology is a process known as the "top-down approach," which focuses on scrutinizing the entire solution stack from the top down. The approach starts at the system level (CPU, memory, I/O), works down through the application level (application code and logic), and concludes at the micro-architecture level by testing for Intel processor instruction-level performance issues (Level 1 and Level 2 cache; branch prediction; single-instruction, multiple-data stream processing; and so on).

A Dell ASC lab engagement involves the installation and configuration of a customer solution or vendor application on Dell server and storage hardware, followed by stress testing to evaluate performance. Stress testing involves one of several workload or benchmarking tools to generate stresses representative of real-world client/server activity.

For more information about the Dell ASC, contact a Dell representative or visit www.dell.com.



may want to start with the System object. If you choose System, begin adding the associated counters mentioned in Figure 3, such as % CPU Time, % Privileged Time, and so on. Figure 4 shows 12 popular counters and their recommended ranges for normal system operation.
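Comparing logged averages against recommended ranges is then mechanical. The thresholds below are illustrative stand-ins for the recommended ranges in Figure 4, not the article's actual numbers.

```python
# Assumed thresholds; tune to your own baseline measurements.
THRESHOLDS = {
    "% CPU Time": 80.0,                # sustained CPU saturation
    "Pages/sec": 100.0,                # heavy paging activity
    "Average Disk Queue Length": 2.0,  # requests queuing at the disk
}

def flag_counters(averages):
    """Return the counters whose logged average exceeds its threshold."""
    return {name: value for name, value in averages.items()
            if value > THRESHOLDS.get(name, float("inf"))}

print(flag_counters({"% CPU Time": 92.3, "Pages/sec": 40.0,
                     "Average Disk Queue Length": 3.5}))
# {'% CPU Time': 92.3, 'Average Disk Queue Length': 3.5}
```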

Resolve Performance Issues
The performance monitor logs may show that one or more of your target's subsystems appear to be excessively utilized or that some component has reached its saturation point. Your first instinct could be to provide additional hardware resources at each bottleneck, but this strategy is not always the correct approach.

If you have now identified a resource contention issue on your system, you should properly isolate the root cause of that contention. First, you must understand what the hardware needs to do and why. The operating system simply facilitates and carries out requests from applications.

Applications can request that a system (and its hardware) perform work in a variety of ways, and there are good and bad ways to perform work or move data on any operating system. Solid, source-controlled network operating systems rarely exhibit true scalability problems. Whether they do or do not is largely immaterial, because the OS is static, beyond your control, and finite for a particular release or service release. The true, combative scalability issues exist within the software applications written for a particular network operating system (NOS). Therefore, you should be at least as familiar with the applications you are running, their behaviors and architectures, as with the operating system and hardware you are running them on.

Using Perfmon as a First Defense

Although hardware subsystems often require adjustment, tuning, or modification to defeat resource contention issues, you must examine all the factors carefully before tackling a particular performance foe. In addition, you should store all of your performance monitor logs for future reference, as both hardware and software technology change at a rapid pace. This will help you to identify problems in the future by comparing the performance information of a properly tuned system with one you believe to be performance impaired.

The key to a healthy system is to use Windows NT Perfmon often to measure and monitor its performance for potential resource contention issues. Let Perfmon be your first line of defense. Use lower-level tools (such as NT Filemon) when necessary to increase the granularity and detail of your data, to deepen your understanding of the root causes of these problems, and to correlate data. Never allow one piece of data from a single source to convince you that you have found a potential performance problem or resolution!

POPULAR PERFMON OBJECTS AND COUNTERS

Object                    Counters
System                    % CPU Time
                          % Privileged Time
                          Context Switches/sec
                          File Control Operations/sec
                          File Data Operations/sec
                          System Calls/sec
                          Interrupts/sec
Memory                    Pages/sec
                          Page Faults/sec
                          Demand Zero Faults/sec
                          Transition Faults/sec
                          Cache Faults/sec
                          Pages In/sec
                          Pages Out/sec
Logical Disk              Average Disk Queue Length
                          Average Disk sec/Transfer
                          Disk Bytes/sec
                          Disk Reads/sec
                          Disk Writes/sec
                          Disk Transfers/sec
                          Average Disk Bytes/Transfer
Network Interface         Bytes Total/sec
                          Output Queue Length
                          Packets/sec
                          Packets Sent/sec
                          Packets Received/sec
Web Service               Current Anonymous Users
                          Get Requests/sec
                          CGI Requests/sec
Active Server Pages       Requests/sec
                          Requests Rejected
                          Request Execution Time
                          Request Wait Time
                          Request Queue Length
SQL Server General Stats  Cache Hit Ratio
                          I/O Lazy Writes/sec
                          I/O Transactions/sec
                          Max Tempdb Space Used
                          RA (read ahead) Pages Found in Cache
SQL Server Locks          Page Locks—Exclusive
                          Lock Waits/sec
                          Max Users Blocked
                          Total Blocking Locks
                          Lock Escalations/sec
                          Lock Wait Time

Note: This list is not exhaustive and does not define individual counters. For a complete list of all counters and their technical definitions, click the "Explain" button in the counter selection dialog box within Perfmon.

Figure 3. Popular Perfmon Objects and Counters


You should also consider the Microsoft Software Development Kit (SDK) and the Windows NT and Windows 2000 Resource Kits, which offer a veritable wealth of low-level tools to help you identify hardware and software performance issues.

Additional Resources

The following sources provide additional information about performance monitoring:

• Zen and the Art of Performance Monitoring, Windows NT 3.51 Resource Kit, Volume 4: http://msdn.microsoft.com/library/default.asp?URL=/library/winresource/dnwinnt/S7883.HTM

• Microsoft Platform Software Development Kit (SDK) home page: http://msdn.microsoft.com/library/default.asp?URL=/library/psdk/portals/mainport.htm

• Windows NT 4.0 and Windows 2000 Resource Kits for MSDN subscribers

• SysInternals: http://www.sysinternals.com

• Winternals: http://www.winternals.com

Paul Del Vecchio ([email protected]) is a senior performance analyst for the System Software Engineering Lab at Intel Corporation. He has spent the past six years helping customers, ISVs, and OEMs achieve optimal performance of their solutions that run on Intel architecture.


COMMON COUNTERS AND RECOMMENDED RANGES

% CPU Time (System)
    0-90% (> 90% indicates a potential processor bottleneck; may also indicate a thread contention problem; investigate Context Switches/sec and System Calls/sec for potential thread issues)

% Privileged Time (System)
    0-40% (> 40% indicates excessive system activity; correlate with System Calls/sec)

Context Switches/sec (System)
    0-10,000 (> 10,000 may indicate too many threads contending for resources; correlate with System Calls/sec and the threads counter in Windows NT Task Manager to identify the process responsible)

File Control Operations/sec (System)
    Ratio dependent (the combined rate of file system operations that are neither reads nor writes [file control/manipulation only, non-data related]; the inverse of File Data Operations/sec)

File Data Operations/sec (System)
    Ratio dependent (the combined rate of all read/write operations for all logical drives; the inverse of File Control Operations/sec)

System Calls/sec (System)
    0-20,000 (> 20,000 indicates potentially excessive Windows NT system activity; correlate with Context Switches/sec and the threads counter in Windows NT Task Manager to identify the process responsible)

Interrupts/sec (System)
    0-5,000 (> 5,000 indicates possible excessive hardware interrupts; justification depends on device activity)

Pages/sec (Memory)
    0-200 (> 200 warrants investigation into the memory subsystem; distinguish reads [pages in] from writes [pages out]; check for proper paging file and resident disk configuration; may indicate application memory allocation problems or heap management issues)

Average Disk Queue Length (Logical Disk)
    0-2 (> 2 indicates a potential disk I/O bottleneck because the I/O subsystem request queue is growing; correlate with Average Disk sec/Transfer)

Average Disk sec/Transfer (Logical Disk)
    0-0.020 (> 0.020 seconds indicates excessive request transfer latency and a potential disk I/O bottleneck; distinguish reads/sec from writes/sec; correlate with Average Disk Queue Length)

Bytes Total/sec (Network Interface)
    Depends upon interface type (10BaseT, 100BaseT). A potential network I/O bottleneck exists when throughput approaches the theoretical maximum for the interface type. (For example, 100BaseT theoretical maximum = 100 × 1,000,000 bits/sec = 100 Mbits/sec, divided by 8 = 12.5 MBytes/sec.)

Packets/sec (Network Interface)
    Depends upon interface type (10BaseT, 100BaseT)

Note: These general guidelines are meant only to help detect a potential performance issue. Further investigation will confirm or deny whether a performance issue really exists.

Figure 4. Common Counters and Recommended Ranges
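If you routinely collect such logs, the ranges in Figure 4 can be applied mechanically as a first-pass screen. The sketch below encodes a few of them in Python; the thresholds come straight from the table, while the dictionary of observed averages is hypothetical.

```python
# Upper bounds taken from Figure 4.
THRESHOLDS = {
    "% CPU Time (System)": 90,
    "% Privileged Time (System)": 40,
    "Context Switches/sec (System)": 10_000,
    "System Calls/sec (System)": 20_000,
    "Interrupts/sec (System)": 5_000,
    "Pages/sec (Memory)": 200,
    "Average Disk Queue Length (Logical Disk)": 2,
    "Average Disk sec/Transfer (Logical Disk)": 0.020,
}

def screen(observed: dict[str, float]) -> list[str]:
    """Return counters whose observed values exceed Figure 4's
    recommended ranges -- candidates for further investigation,
    not proof of a bottleneck."""
    return [name for name, value in observed.items()
            if name in THRESHOLDS and value > THRESHOLDS[name]]

# Hypothetical averages pulled from a week of Perfmon logs.
observed = {
    "% CPU Time (System)": 94.2,
    "Pages/sec (Memory)": 35.0,
    "Average Disk Queue Length (Logical Disk)": 3.1,
}
for counter in screen(observed):
    print(f"investigate: {counter}")
```

Remember the caveat from the article: a flagged counter is a starting point for correlation with other counters, never a diagnosis by itself.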



KNOWLEDGE MANAGEMENT

Outlook Web Access: Bringing Exchange 2000 Server to the Masses

With its improved feature set, scalable architecture, and high-availability options, Microsoft Outlook Web Access (OWA) has been reborn in Exchange 2000 Server. This article is intended to help you design an OWA solution by explaining the technologies that provide its scalability and availability: the Exchange 2000 Server front-end/back-end topology, Windows 2000 Network Load Balancing, Microsoft Cluster Services, and storage area networks.

By David Sayer

Microsoft Exchange 2000 Server arrives this fall, replete with technologies that will extend the messaging and collaboration features of its predecessor, Exchange 5.5. While organizations requiring increased uptime will welcome its availability features, such as active/active clustering and multiple databases per server, Exchange 2000 Server will also be scalable enough to serve populations larger than America's biggest corporations.

Organizations intending to host high-capacity Internet messaging solutions will likely consider Exchange 2000 Server and its re-engineered Outlook Web Access (OWA). No longer an Exchange Server add-on, OWA is native to every Exchange 2000 Server and capable of supporting significantly larger user loads per server while leveraging the high-availability features of Windows 2000.

Exchange 2000 Server OWA (OWA 2000) solutions will be especially attractive to certain types of organizations:

• Educational institutions with large, transient populations
• Application Service Providers (ASPs) requiring a scalable and highly available platform that supports multiple organizations on shared servers
• Organizations building extranets or portals for customers and business partners
• Organizations with distributed mailbox servers that would like to centralize messaging services

The revamped information store (messaging database) of Exchange 2000 Server has contributed to the increased scalability of OWA 2000. The information store is now a Web Store that provides access to any mailbox object through a URL. Messaging operations no longer require a MAPI session and can be performed using HTTP-DAV methods. HTTP-DAV, also known as WebDAV, incorporates extensions to HTTP that provide collaborative file authoring capabilities. Within OWA itself, Active Server Pages have been replaced by Web Forms that leverage Dynamic HTML and Extensible Markup Language (XML), which reduces server-side processing and client-to-server communication.
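Because the Web Store exposes every mailbox object at a URL, a client can in principle retrieve items with plain HTTP-DAV calls rather than MAPI. The following Python sketch illustrates that idea using the requests library; the server name, mailbox path, and credentials are all hypothetical, and a production client would use the property names that Exchange actually publishes.

```python
import requests
from requests.auth import HTTPBasicAuth

# Hypothetical OWA 2000 front end and mailbox path.
BASE = "http://mail.example.com/exchange/jsmith"

# WebDAV PROPFIND lists the items in a folder; Depth: 1 means
# "the folder itself plus its immediate children."
body = """<?xml version="1.0"?>
<propfind xmlns="DAV:">
  <prop>
    <displayname/>
    <getlastmodified/>
  </prop>
</propfind>"""

resp = requests.request(
    "PROPFIND",
    f"{BASE}/Inbox",
    data=body,
    headers={"Depth": "1", "Content-Type": "text/xml"},
    auth=HTTPBasicAuth("DOMAIN\\jsmith", "password"),
)

# A 207 Multi-Status response carries one XML block per item.
print(resp.status_code)
print(resp.text[:500])
```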

Enhancements Improve User Experience

Microsoft has concentrated on delivering a richer client experience in OWA 2000, especially for those using



Internet Explorer 5. Drag-and-drop operations, rich text editing, right-click context menus, and a preview pane are a few of the "fat-client" features now available in OWA 2000. Embedded objects and multimedia messaging are also supported. Before you toss your Outlook Deployment Kit, however, note that these Outlook features are not included in OWA 2000:

• Task and journal entries
• Spelling checker
• Background name resolution
• Offline use

OWA 2000 Raises Active User Load Ceiling

Exchange Server 5.5 OWA is not recommended for active user loads above 500, and 800 users is the acknowledged ceiling. The Exchange 2000 Server development team has raised this ceiling and aims to have servers support as many WebDAV/OWA 2000 users as normal MAPI/Outlook users.

Judging from tests performed in Dell labs using Release Candidate code, the development team appears to have succeeded. Preliminary results indicate that OWA 2000 will support peak concurrency rates of 3,000 users or more per server. As always, load capacity will vary depending on usage profiles, server specifications, and solution design, but a three- to five-fold increase in client capacity over OWA 5.5 is expected.

The Web Store is not the only architectural change in Exchange 2000 Server. All core services—directory, protocol, and storage—have been re-implemented. The Exchange directory services have been transferred to Active Directory. IIS 5.0, which now runs on every Exchange 2000 Server, handles all messaging protocols. The addition of storage groups allows a single server's data to be distributed across multiple stores, minimizing data corruption risk and increasing service levels. These changes were not essential to increase Exchange Server performance, but rather to increase the scalability and availability of the application architecture.

Front-End/Back-End Topology Provides a Scalable Architecture

A chief scalability enhancement in Exchange 2000 Server is its support of a front-end/back-end topology. A front-end Exchange 2000 Server hosts no stores and is dedicated to communicating with messaging clients, whether they speak HTTP, Post Office Protocol 3 (POP3), or Internet Message Access Protocol 4 (IMAP4). A front-end server is only a protocol handler; back-end servers containing the public and private Web Stores ultimately process all client requests.

Figure 1 illustrates a simple front-end/back-end environment. The messaging client communicates only with the front-end server, which proxies requests to the back-end servers. The Active Directory server (domain controller), while not part of the Exchange 2000 Server front-end/back-end topology, plays a central role in enabling this process. The front-end server cannot authenticate the client request without consulting Active Directory. The front end must also perform a Lightweight Directory Access Protocol (LDAP) query against the domain controller to obtain a user's home server. In a dual-authentication scenario, back-end servers will also communicate with the domain controllers before processing a request.
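To make that home-server lookup concrete, the following Python sketch (using the third-party ldap3 library) performs the kind of LDAP query a front end issues against a domain controller. The server name, base DN, and credentials are hypothetical; msExchHomeServerName is the Active Directory attribute Exchange uses to record a mailbox's home server.

```python
from ldap3 import Server, Connection, ALL

# Hypothetical domain controller and service credentials.
server = Server("dc1.example.com", get_info=ALL)
conn = Connection(server, user="EXAMPLE\\svc-owa",
                  password="secret", auto_bind=True)

# Find the user's home mailbox server, as a front end would
# before proxying the request to the correct back end.
conn.search(
    search_base="dc=example,dc=com",
    search_filter="(&(objectClass=user)(sAMAccountName=jsmith))",
    attributes=["msExchHomeServerName"],
)

for entry in conn.entries:
    print(entry.msExchHomeServerName)
```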

Promoting an Exchange 2000 Server to a front end simply involves checking the box next to "This is a front end server" under the server's General properties tab. This change causes the Exchange/IIS server's DAVEX.DLL to be replaced with EXPROX.DLL, the high-speed proxy agent that forwards all client requests to the appropriate back-end servers.

In this role, front-end servers are suitable for placement in a "DMZ" perimeter network, because the servers contain no user or message databases. To provide login and session security, the front-end server can perform Secure Sockets Layer (SSL) encryption and decryption with a server certificate. Note, however, that SSL overhead carries a CPU performance penalty of at least 15 percent.

Front-end/back-end architectures can scale elegantly by providing users with a single namespace to access all back-end servers. This namespace gives the illusion of a large, monolithic system. More importantly, it minimizes user confusion over server names and enables a single URL for OWA.


[Figure: A messaging client connects to an Exchange 2000 front-end server, the "protocol handler" for HTTP, POP3, and IMAP4. The front end proxies requests to four Exchange 2000 back-end servers hosting the public and private Web Stores, while a Windows 2000 domain controller (Active Directory) handles user authentication and user-to-server mapping.]

Note: The 1:4 front-end to back-end server ratio shown in this diagram may be used as a Microsoft design guideline for Exchange 2000 Server deployments.

Figure 1. A Simple Front-End/Back-End Topology


Network Load Balancing Enhances Front-End Scalability

The Windows 2000 Network Load Balancing (NLB) service complements the front-end/back-end topology by providing both scalability and application availability. NLB is used to scale the front end horizontally by creating a cluster of servers that distribute and balance client requests. An NLB cluster (not to be confused with Microsoft Cluster Services) can include as many as 32 servers that act as a single Exchange 2000 Server front end. Should one or more of those servers fail, client traffic is redirected to the remaining hosts, thereby maintaining application availability.
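Conceptually, NLB behaves as a distributed filter: every cluster member sees every inbound packet, each applies the same hash to the client address, and only the resulting "owner" accepts the connection. The Python sketch below illustrates that idea only; the hash choice and host names are invented stand-ins, not Microsoft's actual algorithm.

```python
import hashlib

HOSTS = ["fe1", "fe2", "fe3", "fe4"]  # front-end cluster members

def owner(client_ip: str, alive: list[str]) -> str:
    """Pick the one host that accepts this client's traffic.

    Every host runs the same computation on every incoming
    packet; the non-owners silently drop it. When a host dies,
    removing it from `alive` remaps only that host's clients.
    """
    digest = hashlib.md5(client_ip.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big")
    return alive[bucket % len(alive)]

print(owner("203.0.113.57", HOSTS))       # normal operation
print(owner("203.0.113.57", HOSTS[1:]))   # after fe1 fails
```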

Of course, front-end application availability means nothing to users if they cannot also access mailbox data on the back-end server. To offer the highest availability of data, Microsoft Cluster Services (MSCS) can be used to provide storage group failover if a back-end server fails. A storage group is one or more stores sharing a transaction log set.

MSCS Enhances Back-End Availability

MSCS should be implemented on back-end servers when mission-critical Exchange 2000 solutions demand high availability of data. MSCS supports Exchange 2000 Server in a two-node, active/active configuration (with four-node clustering available in Windows 2000 Datacenter Server). This support makes clustering more cost-effective than active/passive clustering in Exchange 5.5 because all servers are being utilized. Each cluster partner actively supports users and is ready to take over the other's stores if a failure occurs. Although this means that a cluster can potentially serve twice as many users, user loads per server should be held to 50 percent of capacity to enable one server to support all users during failover.

SANs Provide Storage Flexibility

Using a storage area network (SAN) for back-end Exchange 2000 Servers provides superior storage flexibility (as well as performance and redundancy) by accommodating growth and consolidation. SANs also support high-availability configurations such as MSCS. With the introduction of storage groups and multiple databases per server, Exchange 2000 Server may require flexible storage designs best supported by SANs. Using a SAN platform for the back-end Exchange 2000 Server makes sense in most enterprise or hosted environments, particularly for organizations experiencing irregular growth patterns and ever-changing user populations.

Figure 2 illustrates an OWA 2000 environment that uses all of the technologies for availability and scalability. Organizations with a SAN may choose to install domain controllers in the SAN. The domain controllers in Figure 2 are left standing alone to emphasize their functional role.

Consider Server Design for Best Performance

As Exchange 2000 Server roles become more specialized, so do their hardware specifications and design principles. For example, front-end servers do not require additional disk volumes for stores and transaction logs.


[Figure: (1) PowerEdge 2450 front-end servers in an NLB cluster; (2) Windows 2000 domain controllers (Active Directory); (3) four PowerEdge 6450 back-end servers in MSCS clusters; (4) a PowerVault SAN comprising one PowerVault 650F and three PowerVault 630F arrays; (5) two PowerVault 51F Fibre Channel switches; (6) a PowerVault 35F Fibre Channel bridge connecting a PowerVault 130T tape library.]

Assemble the Pieces for an Available, Scalable Environment

The labeled components for the OWA 2000 solution include:

1. Two front-end servers in a Windows 2000 NLB cluster, which can be scaled up to 32 servers. With NLB, a client request is broadcast to all front-end servers, and a filtering algorithm causes all but one server to drop the request.
2. Two Windows 2000 domain controllers provide Active Directory redundancy.
3. MSCS is used to create two separate two-node Exchange 2000 Server clusters. Back-end servers 1 and 2 are live partners—as are servers 3 and 4—with each pair acting as one or more virtual servers. If one partner fails, all storage groups (containing one to five Web Stores each) are transferred to the surviving partner.
4. A PowerVault 650F provides the foundation for the SAN's Fibre Channel loop. Up to 11 PowerVault 630F arrays may be added to each PowerVault 650F to grow the solution as needed.
5. Two PowerVault 51F Fibre Channel switches create a fault-tolerant SAN switching fabric.
6. A PowerVault 35F Fibre Channel-to-SCSI bridge enables efficient SAN-based backup and recovery.

Figure 2. A Mission-Critical Exchange 2000 Server OWA Environment


A separate SCSI channel and volume for the pagefile is recommended. Providing hefty CPU and RAM resources is essential for front-end performance and will increase user concurrency rates. Front-end servers in an NLB cluster will not require many availability features because a failure will not affect a client's ability to use the solution. The PowerEdge 2450—shown in Figure 2—is an excellent front-end server choice because it offers dual Pentium III processors, integrated dual-channel SCSI, and redundant power in a 2U rack-optimized design.

Unlike Exchange 5.5, Exchange 2000 Server returns a performance benefit when equipped with more than two processors. Many organizations will choose to deploy four- and eight-way back-end servers, which will drive the number of mailboxes per server higher. Such a trend requires the back-end servers to have ample storage capacity—preferably SAN-based storage.

Plan for Effective Use of Disk Resources

Disk resources are used differently in Exchange 2000 Server because of the flexibility of storage groups. Additional stores can be created on a server in one of two ways: A store may be added to an existing storage group, or a new storage group can be created to contain the new store. The difference between the two approaches involves the transaction logs. Each storage group has one set of transaction logs for all of its databases (up to five in Release Candidate 2), so creating another storage group will require another set of transaction logs. This will not necessarily require an additional disk volume, since multiple sets of transaction logs can share the same logical drive. The same applies for stores, as illustrated in Figure 3.

For best performance, continue to separate transaction logs (on a RAID 1 or RAID 10 volume) from databases (RAID 5 or RAID 10), as with Exchange Server 5.5. If using RAID 10 (mirrored stripe sets) seems wasteful with its 50 percent storage efficiency, consider its excellent read/write performance and the fact that multiple sets of transaction logs can share the volume.
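To see what that 50 percent figure (and RAID 5's parity overhead) means for volume planning, here is a small, hypothetical sizing helper in Python; the disk counts and drive sizes are illustrative only.

```python
def usable_gb(raid_level: str, disks: int, disk_gb: float) -> float:
    """Rough usable capacity for the RAID levels discussed here.

    RAID 1/10 mirror every byte (50% efficiency); RAID 5
    spends one disk's worth of capacity on parity.
    """
    if raid_level in ("1", "10"):
        return disks * disk_gb / 2
    if raid_level == "5":
        return (disks - 1) * disk_gb
    raise ValueError(f"unhandled RAID level: {raid_level}")

# Hypothetical layout: logs on a 2-disk RAID 1 mirror,
# databases on a 5-disk RAID 5 set of 18 GB drives.
print(usable_gb("1", 2, 18))   # 18.0 GB for transaction logs
print(usable_gb("5", 5, 18))   # 72.0 GB for stores
```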

Since the Active Directory database uses an Exchange Server-like Extensible Storage Engine (ESE) database, apply these same disk layout principles to domain controllers. If you do not expect the Active Directory database to grow large enough to justify the three disks needed for a RAID 5 volume, then use a two-disk RAID 1 mirror set. Microsoft's Active Directory Sizer tool can help to estimate the size of the database file (NTDS.DIT) for a given Active Directory implementation. You can download this tool from www.microsoft.com/windows2000/downloads/deployment/sizer/default.asp.

Dell also has created the PowerMatch for Microsoft Exchange 2000 Server sizing tool, which generates detailed Dell server and storage configurations based on user, storage, and growth requirements. It even specifies hardware for OWA 2000 configurations that use the front-end/back-end topology. This tool may be downloaded from Dell's Exchange Web site at www.dell.com/exchange.

Leverage OWA 2000 for a Hosting-Ready Solution

Exchange 2000 Server greatly improves upon Exchange 5.5 as an Internet messaging platform, thanks to its Web Store, flexible application architecture, and the Windows 2000 availability features—NLB and MSCS. You can leverage these technologies in OWA 2000 to produce hosting-ready messaging solutions for use in such environments as universities, corporate extranets, and ASP data centers.

David Sayer ([email protected]) is a senior consultant with Dell Technology Consulting and specializes in Exchange Server. Prior to joining Dell, he provided messaging and migration services to companies such as GE, Xerox, and United Technologies. He currently designs and deploys Windows 2000 and messaging solutions for Dell's Enterprise and ASP customers. David is also a member of Microsoft's Exchange 2000 Task Force and is a Microsoft Certified Systems Engineer (MCSE).


[Figure: Volume E: (RAID 1) holds the transaction logs for both Storage Group 1 and Storage Group 2; volume F: (RAID 5) holds Storage Group 1 Private Stores 1 and 2 and Storage Group 2 Private Store 1.]

Note: To conserve server resources, do not create new storage groups until existing ones are full, as each storage group will require its own instance of the STORE.EXE process.

Figure 3. Storage Groups Can Share Disk Resources
