b406 information systems distributed systems - babu … · 2008-01-17 · b406 information systems...

46
B406 Information Systems Distributed Systems See also Whitten et al. (2004), chapter 13

Upload: haque

Post on 27-Jun-2018

215 views

Category:

Documents


2 download

TRANSCRIPT

B406 Information SystemsDistributed Systems

See also Whitten et al. (2004), chapter 13

Distributed Systems

This Week

Distributed vs. centralized systemsClient / Server and network architecturesDistributed databases

Distributed Systems

Early Computers

- In the early days, expensive and monolithic mainframe computers

- The internet / networks have been developed through military drivers- Distributed control over

machinery in case of nuclear attacks

Distributed Systems

Control RoomsCentralizedcontrols

Complexinterfaces

Very complicate

Distributed Systems

A Today’s office ?!

Then came desktop workplaces(with Graphical User Interfaces)- In working environment usually

come with central file storage or at least central backup

- Central databases (for employees, customers, products, finance, …)

- Still dominant today- Often not used consistently

employers get problems with data inconsistencies or loose data

Distributed Systems

Internet

The internet virtuallyconnects allcomputers- More and more

mobile devices,Smartphones, PDA’s, …

Distributed Systems

Today’s Mainframes

• Centralized services- Data storage- Data indexing

(google & co.)- Communication

(email, chat, sykpe)- Rich Internet

Applications (RIA)

Distributed Systems

Mobile Computing

- Aim is to workanywhere, anytime

- Home based work- During travel,

in hotel, …

Distributed Systems

Centralized Systems

• Centralized system comprises a central, multi-use computer

• Hosts DATA, PROCESS & INTERFACE components of the information system

• Users interact with the host machine via terminals (or PCs pretending)

• Virtually all processing performed on the host – terminals can be weak (i.e. inexpensive)

• Sounds OK, so why distribute the information system?

Distributed Systems

Why the Trend Toward Distributed Systems?

• Modern businesses are already decentralized (distributed).• Distributed computing moves information and services

closer to the customers and users who need them.• Distributed computing consolidates the power of personal

computers across the enterprise.• Distributed computing solutions are in general more user-

friendly because they use the PC as the user interface processor.

• Personal computers and network servers are less expensive than mainframe computers – Though total cost of ownership is at least as expensive

Distributed Systems

Distribution Cons

• Lot of network traffic involved, which can compromise performance (without potentially expensive upgrade)

• Security and integrity more easily compromised– Easier to 'get into' isolated sources of information (unless one goes

for the 'jackpot' of central system)– Harder to ensure that the system is kept up-to-date

• Nevertheless, the practical advantages of distributed systems maintain their popularity

Many centralized systems are transformed into distributed systems.

Distributed Systems

The Big Picture

3 architectures5 layers

[Whitten et al., p.511]

Distributed Systems

Computing Layers

• Presentation layer — the user interface; presentation of inputs and outputs to the user.

• Presentation logic layer — processing that must be done to generate the presentation, such as editing input data or formatting output data.

• Application logic layer — the logic and processing to support business application (rules, policies, and procedures)

• Data manipulation layer — all the commands and logic to store and retrieve data to and from the database

• Data layer—the actual business data (databases)

Distributed Systems

Distributed Architecture Types

• Three main architecture types

– File Server

– Client / Server (2-tier)

– Client / Server (3-tier)

– Client / Server (N-tier)

– Network (N-tier)

Distributed Systems

File Server Architecture

Local area network (LAN) – a set of client computers (PCs) connected over a relatively short distance to one or more servers.

File server system – a LAN in which a server hosts the data of an information system. – All other layers are implemented on the client computers– Frequently excessive network traffic to transport data between

servers and clients– Client must be fairly robust (“fat”) because it does most of the work– Database integrity can be compromised

(file locked during edit by a single user)

Distributed Systems

File Server Architecture

Distributed Systems

Client/Server Architecture

Client/server system – a distributed computing solution in which the presentation, presentation logic, application logic, data manipulation, and data layers are distributed between client PCs and one or more servers.

“Currently the dominant model for distributed computing at current”

Distributed Systems

Client/Server ArchitectureClients

Thin client – a personal computer that does not have to be very powerful because it only presents the user interface to the user.– Largely for interaction with processing layers– Increasingly PDA’s, Handhelds or Smart phones

Fat client – a typically powerful personal computer– capable of independent application processes– Also notebook computer or workstation

Distributed Systems

Client/Server Architecture Servers

• DATA, PROCESS & INTERFACE layers distributed across client PCs and one, or more, servers (typically more powerful / capable than file server)

Distributed Systems

Client / Server ArchitectureDifferent types of server• Database server – a server that hosts one or more

databases. (Executing all data manipulation commands at the server)

• Transaction server – a server that hosts services which ensure that all database updates for a transaction succeed or fail as a whole. (Crucial in banking contexts – cash machines; online shopping …)

• Application server – a server that hosts application logic and services for an information system. (between interface & database)

Distributed Systems

Client / Server ArchitectureDifferent types of server• Messaging or groupware server – a server that hosts

services for e-mail, calendaring, and other work group functionality.

• Web server – a server that hosts Internet or intranet websites. (sends data (XML) and documents (HTML) to clients)

Distributed Systems

Distribution Types

• Various layers = various types of distribution

• Distributed Presentation

• Distributed Data (2-tier)

• Distributed Data & Application (3(+)-tier)

Distributed Systems

Client/Server ArchitectureDistributed Presentation• Distributed Presentation

– Presentation & presentation logic layers reside on client– Rest of the system could be an existing legacy

database• Good, because

– Rapid implementation– Fast, familiar interface to existing database

• Bad because– System functionality can’t be significantly improved– Doesn’t maximise client PC potential

Distributed Systems

Client / Server - Distributed Presentation

Distributed Systems

Client/Server Architecture (2-tier)Distributed DataDistributed data – a client/server system in which

the data and data manipulation layers are placed on the server, and other layers are placed on the client.– Sometimes called two-tiered client/server computing– Difference to file server systems is where the data

manipulation commands are executed– Much less network traffic than file server systems

because only the database requests and the results of those requests are transported across the network.

– Database integrity is easier to maintain.

Distributed Systems

Distributed Systems

Client/Server Architecture (2-tier)Distributed Data

• Good, because- Less traffic (only requests & required records are

transported)- Integrity is easier to maintain – only records in use need

be locked• Bad, because

- Requires ‘fat’ clients- Application logic must be duplicated and maintained

across all clients

Distributed Systems

Problem with 2-tier client/server systems

• With larger number of clients, performance problems occur (mainly related to application logic)

• In transaction processing systems (or online application processing (OLAP)), transactions must be managed centrally to ensure processing as single unit– Example: 2 people want to withdraw money from a

bank account at the same time…

Distributed Systems

Client / Server Architecture (3-tier)Distributed Data and Application

• Designed to avoid inefficiency of maintaining application logic on client PCs

• Multi-user transaction processes require careful monitoring to maintain integrity

• Data and data manipulation layers reside on their own server(s)

• Application logic resides on its own server• Presentation logic and presentation reside on the

client PC

Distributed Systems

Client / Server Architecture (3-tier)

• Good, because– Only application server need be maintained to ensure

integrity (beware stored procedures embedded in database)

– Simplified, stable client configuration & maintenance– Greater flexibility

• Bad, because– Complexity of design / how best to partition the system?– Available prototypes?– Again, potential withdrawn from client PCs

Distributed Systems

3-tier system

Data

Business logic / process /

application

Interface / presentation

Distributed Systems

Architecture & Layers

• 3-tier system can be decomposed into 'layers'PRESENTATION

LOGIC LAYER(GENERATE

PRESENTATION)

USERINTERFACE

DATA FORMAT

& EDIT

DATAANALYSIS

DATABASESTORE &RETRIEVE

PRESENTATION LAYER

(PROCESSING REQUIRED TO SUPPORT APPLICATION)

APPLICATIONLOGIC LAYER

DATA MANIPULATIONLAYER

DATA LAYER

Distributed Systems

Distributed Data & Application

• User• Client• Application

Server• Database

Server

Distributed Systems

Internet- and Intranet-based Architectures(N-tier)

Network computing system – a multi-tiered solution in which the presentation and presentation logic layers are implemented in client-side Web browsers using content downloaded from a Web server– presentation logic layer connects to the application logic

layer that runs on the application server, which connects to the database servers on the backside of the system

Distributed Systems

Network Architecture for Information Systems

The greatest potential of this approach is its applicability to redesign of traditional information systems to run on an intranet

Intranet – a secure network that uses Internet technology to integrate desktop, work group, and enterprise computing into a cohesive framework.

Distributed Systems

Network Architecture (N-tier)

• Presentation and presentation logic reside in web browsers

• Web server provides browser content and user authentication (security)

• Application and data layers can be as before (or whatever configuration best performs the task)

• Intranet = secure, internal network• Internet = should be familiar!!

Distributed Systems

Network Architecture (N-tier)

• Good, because– Unified, easy-to-interpret presentation interface– Enormous flexibility– Java, etc. allows platform independence– Unified content structure (HTML)

• Bad, because– Technologies involved are undergoing rapid

development– Any others? Security? Pace of working life?

Distributed Systems

Network Computing System: Internet/Intranet

• User• Client• Web Server• Application Server• Database Server

Distributed Systems

One more ArchitecturePeer-to-peer networks (P2P)• network that relies on the computing power and

bandwidth of the participants in the networkrather than concentrating it in a relatively few servers

• P2P networks are typically used for connecting nodes via largely ad hoc connections

• Such networks are useful for many purposes– Sharing content files (see file sharing) containing audio,

video, data or anything in digital format is very common– and realtime data, such as telephony traffic, is also

passed using P2P technologyhttp://en.wikipedia.org/wiki/Peer-to-peer

Distributed Systems

Peer-to-peer networks (P2P)

• A pure peer-to-peer network does not have the notion of clients or servers

• only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network This model of network arrangement differs from the client-server model where communication is usually to and from a central server

Distributed Systems

Advantages of peer-to-peer networks

• Efficiency - all clients provide resources, including bandwidth, storage space, and computing power

• Scalability - as nodes arrive and demand on the system increases, the total capacity of the system also increases

• Robustness - in case of failures, data is replicated over multiple peers

• If true P2P, there is no single point of failure in the system

Distributed Systems

Relational Databases

• Already met relational databases• Direct analogy to E-R modelling• Consist of tables (entities) & relationships• Table rows = records, columns = fields• Can be distributed

as with systems

Distributed Systems

A database example

Distributed Systems

Distributed Relational Databases

• A distributed relational database management system (distributed RDBMS) allows sophisticated access to and maintenance of an RDB that is distributed across a network

• Provides backup, recovery and security• Sometimes called client/server database

management system• In a distributed RDBMS, the underlying database

engine that processes all database commands executes on the database server (less data traffic)

Distributed Systems

Distributed Relational DatabasesTwo types of distributed data

• Data partitioning– Little duplication between servers– Vertical partitioning: Different columns

assigned to different database servers– Horizontal partitioning: Different rows in a

table allocated to different database servers

Distributed Systems

Distributed Relational DatabasesTwo types of distributed data

• Data replication– Duplicates some or all tables (rows and columns)

on more than one database servers– Can duplicate entire tables or subsets on different

servers• Google indexing (popular search queries vs. rare queries)• Amazon (popular products vs. rare products)

– Management system with replication technology controls access to and management of each database server and also schedules updates between database servers with duplicated data