Whitepaper CS16EE Enterprise Architecture Revised on 01.01.2016


Table of Contents

1 Management summary
2 Enterprise Solution - Basics
3 Best-of-Breed in data persistence
4 The Strategy Core
4.1 Data Management with strategy
4.1.1 Data consistency while writing
4.1.2 High performance and cost savings through specialised storages
4.1.3 PIM, MAM, MRM and Editorial Data Model
4.2 Data integration
4.2.1 API-based
4.2.2 Import Staging-based
4.2.3 Export Staging-based
4.3 Rendering and Publishing
4.4 Rules Engine
4.5 Delivery Service
5 Business layer – content, context, target
6 Customer business layer
7 Infrastructure
7.1 System Components and Architecture
7.2 Monitoring and recovery
7.2.1 Component management
7.2.2 System deployment and maintenance
7.3 Disaster management
7.3.1 Diagnosis
7.3.2 Recovery
8 High performance and high availability
8.1 Horizontal scaling
8.2 High availability
8.3 Deployment options
8.3.1 Single Data Centre Setup with local consumers and data sources
8.3.2 Central Data Repository, local import / export staging
8.3.3 Multi-Data Centre setup

© ContentSphere International AG Whitepaper CS16EE Enterprise Architecture 3

1 Management summary

An Enterprise Product Information Management (PIM), Content Engine and Publishing Hub solution needs to cater to Enterprise needs. For a full-fledged enterprise marketing solution, architectural requirements are driven by its demands for:

• High availability
• Scalability
• Distribution
• Data safety
• Data integration
• Real-time content delivery
• Cloud-aware architecture
• SaaS-ready solution
• Seamlessly scalable environment

Enterprises show their readiness in their ability to respond to user demands like:

• Localization based on time zones, countries, languages, regions.
• Adaptation to situational context such as weather, time of the day, season, device used, hyperlocations and many more.
• Adaptation to the needs of the audience, based on profile criteria.
• Distributed content and media production teams worldwide, without loss of service quality with regard to speed, availability or accessibility.
• Distributed production teams needing access to large documents across multiple locations.
• 24/7 services with various workload peaks.
• Collaborative usage.
• Easy adaptation to user processes and special workflows.
• Seamless integration with backend systems as well as channel systems such as digital marketing applications and marketplaces.
• Productive architecture on premises, as private cloud and as SaaS.

With its new architecture, the CONTENTSERV CS16 Enterprise Edition (CS16EE) brings a completely new and revolutionary approach to handling marketing resources and making them available for publication digitally and on paper, real time or offline.

The multi-tier system architecture brings best of breed data persistence together with an innovative new way to cope with marketing content.

This whitepaper describes the advantages of the new CS16EE architecture from scratch. In a bottom-up approach we will show the advantages of the new architecture in detail.

2 Enterprise Solution - Basics

One of the first premises in designing the CS16EE Enterprise solution was to provide a powerful content data model with inheritance, versions and variants, a flexible schema, dependencies and tagging, without degrading performance even for data on a huge scale. Great efforts were made to create a system with real-time ability. A complex semantic, tolerant, faceted search with auto-completion over marketing content has to be answered quickly enough for the system's users. The same holds for content delivered to online shops or websites, especially for online real-time targeting.

3 Best-of-Breed in data persistence

For the CS16EE persistence layer it was obvious to choose from best-of-breed DB Systems, with special attention to specific content types such as:

• General data
• Relations between objects
• Search indexes
• Media data
• Data I/O
• Reporting.

Since the architectural decision was to provide best performance by using high performance databases for special purposes, the list of data storages in CS16EE’s persistence layer looks like this:

| DB | Usage | Speciality |
|---|---|---|
| Oracle RAC | General data storage | Highly available and high performance general purpose database with transactional safety down to the second. |
| Neo4j / Orient DB | Storing object relations | A graph database is an online database management system with Create, Read, Update and Delete (CRUD) operations working on a graph data model. Unlike other databases, relationships take first priority in graph databases. This means there is no need to infer data connections using things like foreign keys. |
| Elasticsearch | Search indexes | With Elasticsearch, all data is immediately available for search. It builds distributed capabilities on top of Apache Lucene to provide the most powerful full text search capabilities. The search engine supports multilingual search, geo-location, contextual did-you-mean suggestions, auto-complete, and result snippets. Elasticsearch supports faceting and percolating, which can be useful for notifying if new documents match registered queries. |
| Swift | Asset store | Swift is a highly available, distributed, eventually consistent object store. Swift is used to store lots of data efficiently, safely, and cheaply. Swift can replicate large documents across multiple locations for asset/document creation, editing or distribution. |
| Cassandra DB | Import/Export tables | Cassandra's data model offers the convenience of a large number of columns with the performance of log-structured updates, strong support for denormalisation and materialised views, and powerful built-in caching. Cassandra DB provides grid computing with unlimited scalability. |
| Pentaho | Reporting | Optimization of cubes for super-fast reporting in OLAP. Aggregation of different external data sources to incorporate other systems into reporting. |

One of the core features of the CS16EE is to ensure data integrity in high performance and distributed high availability environments. Read more on data integrity in the next chapter. You will find more information on distributed high availability environments in chapter “7 Infrastructure”.

4 The Strategy Core

The Strategy Core contains all Basic Services of the CS16EE Solution. But why strategy?

Strategy stands for a specific approach in handling different types of basic services. Each service is extensible via plugins. A plugin adds a service, which provides specific functionality for the overall software solution. The Strategy Core is a way to dynamically deal with new services.

The Systems Core acts as an API for every complex business functionality. Besides data storage and retrieval, it provides services for data integration, output rendering and the rules engine.

The system’s core functions are:

• Storing and retrieving data
• Versioning and building variations of objects
• Output rendering based on templates or in a flow mode
• Integration of external data
• Workflow and rules-engine mechanisms.
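
The plugin idea described above can be illustrated with a minimal sketch. The names `StrategyCore`, `register` and `call` are hypothetical and are not taken from the CS16EE API; the sketch only shows how a core might dispatch to services that plugins add at runtime.

```python
from typing import Callable, Dict


class StrategyCore:
    """Hypothetical core that dispatches to services added by plugins."""

    def __init__(self) -> None:
        self._services: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, service: Callable[..., object]) -> None:
        # A plugin adds exactly one named service to the core.
        if name in self._services:
            raise ValueError(f"service already registered: {name}")
        self._services[name] = service

    def call(self, name: str, *args, **kwargs) -> object:
        # The lookup happens at call time, so newly added plugins are usable at once.
        try:
            service = self._services[name]
        except KeyError:
            raise LookupError(f"no plugin provides service: {name}") from None
        return service(*args, **kwargs)


core = StrategyCore()
# Assumed example plugin: a trivial template renderer.
core.register("render.text", lambda template, data: template.format(**data))
rendered = core.call("render.text", "Product: {name}", {"name": "Chair"})
```

Keeping the registry behind two small methods means that callers never depend on a concrete service implementation, which is the property the Strategy Core description emphasises.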

4.1 Data Management with strategy

The best way to store a huge amount of data in a transactionally safe way is a relational database. Enterprise-scale RDBMS systems like Oracle RAC are well-established components of enterprise IT infrastructures; this is especially true because transactional safety in disaster recovery can be down to the second by writing a transaction log. For special marketing requirements, as well as for calculating a best-fitting product placement in real time, there are specialised databases with much higher answering rates and better scaling performance at a lower cost.

Marketing requirements that need special data-handling are:

• Real-time targeting in online channels.
• Real-time composition and rendering of customer-specific documents (PDF, Word, PowerPoint,…).
• Inheritance from and data transfer of object attributes.
• Versioning and content variations for context-specific delivery on different channels.
• Change management / update triggers for publications.
• Ad-hoc reporting for placement analysis, return on content, affinity, analysis…

The choice of high performance through specialised databases came with a problem concerning data consistency. If data is distributed between different DBs, each DB becomes a single point of failure.

During tests with user scenarios it became obvious that over 90% of the DB accesses are read accesses. This led to a further important architectural decision: there has to be a single point of truth from which the system can be re-established in a disaster scenario. This task is best fulfilled by an RDBMS.

4.1.1 Data consistency while writing

To keep consistency of the specialised DB Systems as well as to keep a single point of truth, “strategy-based data handling” was built.

In the case of a write operation, the RDBMS (Oracle) is the first data store to be provided with data. Binary data such as images and movies is always kept in a Swift-based Asset Store.

In the next step, all secondary databases are filled with information. The writing strategy is to keep in each secondary DB only the information that best fits that DB's purpose.

Object relations are kept within the Graph DB, search-relevant information and search indexes are kept in the Search DB, real-time delivery is powered by the column-based Export Store, and asset data is maintained in the Asset Store.
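
The write order described above can be sketched as follows. This is a simplified illustration using in-memory stand-ins, not the CS16EE implementation; `Store`, `write_object` and the projection functions are hypothetical names, and error handling for failing secondary stores is left out.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class Store:
    """Hypothetical in-memory stand-in for a real data store."""

    def __init__(self, name: str, project: Callable[[Record], Record]) -> None:
        self.name = name
        self.project = project  # selects the part of a record this store keeps
        self.rows: Dict[str, Record] = {}

    def write(self, key: str, record: Record) -> None:
        self.rows[key] = self.project(record)


def write_object(key: str, record: Record, primary: Store, secondaries: List[Store]) -> None:
    # Step 1: the primary store (single point of truth) is written first.
    primary.write(key, record)
    # Step 2: each secondary store receives only the projection it needs.
    for store in secondaries:
        store.write(key, record)


primary = Store("rdbms", lambda r: dict(r))
graph = Store("graph", lambda r: {"relations": r.get("relations", [])})
search = Store("search", lambda r: {"text": r.get("name", "")})

write_object("p1", {"name": "Chair", "relations": ["p2"], "price": 99}, primary, [graph, search])
```

After the call, the primary stand-in holds the full record, while the graph and search stand-ins hold only the relations and the searchable text respectively.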

On the one hand, all information is kept in the RDBMS for maximum transactional safety. On the other hand, the specialised secondary data stores are a pre-requisite for the application to work smoothly. As such, a recovery strategy for the secondary stores is required for ensuring high availability.

In the case of data recovery, it is possible to recover the secondary data stores out of the relational primary store, but for faster recovery times it is also feasible to recover the secondary data stores from (file-based) backups with a subsequent delta load from the RDBMS master store.
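
A backup restore followed by a delta load might look like the sketch below. It assumes that every row carries a monotonically increasing version number and that the backup records the highest version it contains; both are assumptions made for illustration, and `recover_secondary` is a hypothetical name.

```python
from typing import Dict, Tuple

# key -> (version, value); the version is a hypothetical change counter.
Snapshot = Dict[str, Tuple[int, str]]


def recover_secondary(backup: Snapshot, backup_version: int, primary: Snapshot) -> Snapshot:
    # Start from the (possibly stale) backup of the secondary store.
    restored = dict(backup)
    # Delta load: apply every primary row changed after the backup was taken.
    for key, (version, value) in primary.items():
        if version > backup_version:
            restored[key] = (version, value)
    # Drop rows that no longer exist in the primary store.
    for key in list(restored):
        if key not in primary:
            del restored[key]
    return restored


primary = {"a": (1, "old"), "b": (5, "changed"), "c": (6, "new")}
backup = {"a": (1, "old"), "b": (2, "stale"), "x": (3, "deleted later")}
restored = recover_secondary(backup, backup_version=4, primary=primary)
```

Only rows newer than the backup are copied, which is why this path can be faster than rebuilding the secondary store entirely from the primary store.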

4.1.2 High performance and cost savings through specialised storages

The reading strategy is based on a “best performance decision”. Each action of the core API has its best-fitting data store. If this data store is not able to answer the current request, the RDBMS steps in as the single point of truth.
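
A minimal sketch of such a read path with fallback is shown here. `ReadStore` and `read` are hypothetical names, and treating both an unavailable store and a missing entry as reasons to fall back is an assumption, not a documented CS16EE behaviour.

```python
from typing import Dict, Optional


class ReadStore:
    """Hypothetical read-only view of a data store."""

    def __init__(self, rows: Dict[str, str], available: bool = True) -> None:
        self.rows = rows
        self.available = available

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise ConnectionError("store unavailable")
        return self.rows.get(key)


def read(key: str, preferred: ReadStore, primary: ReadStore) -> Optional[str]:
    # Try the store that fits this action best.
    try:
        value = preferred.get(key)
        if value is not None:
            return value
    except ConnectionError:
        pass  # fall through to the primary store
    # The RDBMS answers whenever the specialised store cannot.
    return primary.get(key)


rdbms = ReadStore({"p1": "Chair", "p2": "Table"})
search_up = ReadStore({"p1": "Chair"})
search_down = ReadStore({}, available=False)
```

For example, `read("p2", search_down, rdbms)` is answered by the primary stand-in because the preferred store is unavailable.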

4.1.3 PIM, MAM, MRM and Editorial Data Model

The Systems Core provides a powerful data model that incorporates PIM and MAM functionalities as well as MRM features. It combines a class-oriented object store with attribution for various content types with a strong tagging mechanism. Exact class rules are possible as well as freely defined datasets for direct editing.

The data model therefore combines both worlds seamlessly. The structured world of product and sales information finds its representation together with an individual world of marketing content.

4.2 Data integration

4.2.1 API-based

Data integration is one of the key requirements for marketing content management. The CS16EE suite provides simple and safe mechanisms to integrate data sources based on web services and a message bus.

To provide easy ways of data integration, a REST service interface is available for plugin extensions. For specialized integrations, a general plugin API is provided.

For an ideal performance and scaling, the API-based integrations typically read from export staging areas and write to import staging areas. This ensures that erroneous write operations can be handled with manual intervention, while error-free records are directly imported.

4.2.2 Import Staging-based

CS16EE features a consistent approach for all forms of data onboarding:

• Suppliers and vendors can upload and edit data in a supplier / vendor portal.
• Backoffice applications such as ERP, CRM or PLM can connect to the staging area and provide data via API or Message Bus.
• Remote locations or external partners such as agencies, content aggregators or freelancers can upload media assets and content via staging areas.

It depends on the maturity of the business relationship or interface, whether provided data is validated automatically and imported or whether data is checked, cleansed and merged manually before import.

One of the challenges with multiple data sources providing a large amount of data is performance:

• Checking provided data against constraints and quality rules can put a high load on the servers.
• Resolving relations to existing data and ensuring that no duplicates are created for media or text assets requires complex similarity analytics, which increases this load further.

In order to ensure almost unlimited scalability, the import process is organized across one or many local staging areas that have full access to the data model and to reference data, which is replicated from the global data repository. This ensures that:

• All data can be fully validated, without creating load on the core system.
• All relationships can be resolved and targets of relationships can be de-duplicated before import.
• Quality rules can be applied to check & validate data before importing to the core.
• In the case of redundant data sets, e.g. when two different systems or suppliers both provide attributes for the same object, a manual merge or a rule-based survivorship can take place before importing.
• The load process from staging to the core system can take place without the need for data validations, and therefore with very fast write performance.
• The only conflicts left to resolve after import are possible concurrent writes, which are resolved in a guided merge process.
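
The validation and survivorship steps listed above can be sketched in a few lines. The quality rules, the `SOURCE_PRIORITY` table and the function names are assumptions made for this example; real rules would come from the data model and the configured quality checks.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]

# Hypothetical survivorship rule: a lower number means a more trusted source.
SOURCE_PRIORITY = {"erp": 0, "supplier": 1}


def validate(record: Record) -> List[str]:
    # Example quality rules, invented for this sketch.
    errors = []
    if not record.get("id"):
        errors.append("missing id")
    if not record.get("name"):
        errors.append("missing name")
    return errors


def stage(records: List[Record]) -> Tuple[Dict[str, Record], List[Tuple[Record, List[str]]]]:
    accepted: Dict[str, Record] = {}
    rejected: List[Tuple[Record, List[str]]] = []
    # Process trusted sources last so that their attributes win the merge.
    ordered = sorted(records, key=lambda r: SOURCE_PRIORITY.get(r.get("source"), 99), reverse=True)
    for record in ordered:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))  # left for manual cleansing
            continue
        merged = accepted.setdefault(record["id"], {})
        merged.update({k: v for k, v in record.items() if k != "source"})
    return accepted, rejected


accepted, rejected = stage([
    {"source": "erp", "id": "p1", "name": "Chair"},
    {"source": "supplier", "id": "p1", "name": "chair (supplier)", "color": "red"},
    {"source": "supplier", "id": "", "name": "No id"},
])
```

In this run the ERP name survives, the supplier-only attribute `color` is kept, and the record without an id is set aside for manual handling, so the core system only ever receives validated, merged records.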

4.2.3 Export Staging-based

Providing API-based read access to the core system by channel systems is typically not possible:

• The number of read operations is far too high, considering the high load generated by Web Management or E-Commerce solutions.
• The systems are typically hosted in a different trust zone (DMZ vs. trusted zone).
• The provisioning of data requires filtering, mapping and transformation, as logic and context would be different.

In order to provide maximum performance upon delivery with minimum load on the core system, all read accesses are routed to one or more export staging areas:

• A stage 1 filter makes sure that only relevant data is syndicated to the export staging area (e.g. filter by region / country / language / channel).
• For different requests, a stage 2 filter can provide data optimized for context with additional filters based on scoring and matching.
• The data can be delivered in different data formats (JSON, XML, Excel, CSV) or in layout form (HTML, PDF).
• Data can be pulled or pushed (trigger- or schedule-based) via different options (MQ, API / REST, FTP,…).

This architecture ensures that read operations do not put any load on the global data repository. The only required operation is the synchronization via message bus to post all updates to all subscribing staging areas.
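
The two filter stages can be illustrated with a small sketch. The field names (`country`, `language`, `tags`) and the scoring by number of shared tags are assumptions chosen for the example; the whitepaper does not specify how scoring and matching are computed.

```python
from typing import Any, Dict, List

Item = Dict[str, Any]


def stage1_filter(items: List[Item], country: str, language: str) -> List[Item]:
    # Stage 1: syndicate only the data relevant to this staging area.
    return [i for i in items if i["country"] == country and i["language"] == language]


def stage2_filter(items: List[Item], context_tags: set, limit: int) -> List[Item]:
    # Stage 2: score each item by the number of tags shared with the request context.
    scored = [(len(context_tags & set(i["tags"])), i) for i in items]
    matching = [pair for pair in scored if pair[0] > 0]
    matching.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in matching[:limit]]


catalog = [
    {"id": "p1", "country": "DE", "language": "de", "tags": ["rain", "outdoor"]},
    {"id": "p2", "country": "DE", "language": "de", "tags": ["summer"]},
    {"id": "p3", "country": "FR", "language": "fr", "tags": ["rain"]},
    {"id": "p4", "country": "DE", "language": "de", "tags": ["rain"]},
]
staged = stage1_filter(catalog, country="DE", language="de")
result = stage2_filter(staged, context_tags={"rain", "outdoor"}, limit=2)
```

Stage 1 runs once per synchronization, while stage 2 runs per request against the already reduced staging data, so the per-request work never touches the global repository.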

4.3 Rendering and Publishing

The CS16EE system provides built-in rendering and publishing abilities for online and offline channels.

With a built-in InDesign integration based on a native InDesign plugin, the CS16EE offers a complete print publishing engine. With native MS Office Support the system closes the gap between print, digital presentation and personal presentation. The PowerPoint rendering engine supplies predesigned PowerPoint slides.

As a content engine, the CS16EE not only supplies static digital, office and paper output, but also acts as a real-time engine for online systems. It delivers product information to your online shops, provides targeting information for your WebCMS and pushes real-time information to your mobile apps. Online channels are served either with pre-rendered HTML / PDF or via API with JSON or XML.

For real-time document generation, the PDF generator comes in handy. If further formats are necessary, the Rendering and Publishing components are extensible through plugins.

4.4 Rules Engine

The Core Rules Engine is much more than its name suggests. It is the glue between the data input through integration, the data storage and retrieval and finally the output through rendering.

The Rules Engine adds a complex rights and roles system based on data objects as well as on functions to the system. Through the Rules Engine it is possible to transform data, describe workflows and use planning and schedules for specific tasks.

4.5 Delivery Service

Every software component using the strategy core to build up further functionality uses the REST-based Delivery Service to access those functions. This REST service is in use when the CS16EE’s own Business Layer is working with its data or is generating output through InDesign. It is also in use when third-party solutions are integrated with the CS16EE.

Together with the Strategy Core, the CS16EE Delivery Service builds a strong platform upon which you can build your enterprise marketing technology stack.

5 Business layer – content, context, target

The CS16EE offers you all the functionality you need to supply your marketing with the best content in every situation and every context. On the one hand, the CS16EE offers you a complete set of functions to use for integration with tools for

• Delivery in channels – like social media hubs.
• Online optimization.
• Personalised 1:1 communication.
• Target group clustered communication.
• …

In order to bring content to all your marketing channels, to reuse your content, to apply content in every context and to target your customers, the CS16EE supplies you with a rich set of modules for content engineering, campaign planning, content delivery planning, in-time delivery and many more.

In general, functions like Marketing Content Management, a content engine, Marketing Resource Management, Multichannel Publishing, and Marketing Intelligence are built into the system.

6 Customer business layer

With a special layer for configuration and customization, the CS16EE enables you to bring your business case into the system. Without writing a single line of code you are able to set up your own business processes, as they are applied within your marketing department.

7 Infrastructure

Great efforts were put into making the CS16EE highly scalable and distributable software.

The result is ready for cloud services as well as for operation in distributed data centres. All components are SaaS-ready and can be run on cloud platforms like AWS.

Regardless of how the system is implemented, it is recommended to run the system's components on different nodes, either physical hardware nodes or virtual nodes.

7.1 System Components and Architecture

The CS16EE application is composed of the following software components:

• Java Application Server – like Tomcat EE, JBoss or WebSphere
• Primary Data Store – Oracle DB or Cassandra
• Secondary Graph Store – Neo4j or Orient DB
• Secondary Search Store – Elasticsearch
• Asset Storage – Swift
• Secondary Import Export – Cassandra
• Secondary Reporting – Pentaho
• Print Generation – InDesign Server

A logical system architecture can therefore be presented like this:

The best installation scheme for performance and data integrity was selected for each of the systems. For best availability of the web and application servers, distribute user accesses through a load balancer. Oracle DB can be used either with a failover cluster or as Oracle Real Application Clusters (RAC). Neo4j supports a master-slave setup for failover and performance-optimised service, whereas Elasticsearch supports system clustering. Swift is widely used as a platform for Content Delivery Networks and offers clustering functionality. For the Cassandra DB and Pentaho we recommend at least a two-node setup.

InDesign Server is used for high quality output generation. Since InDesign Server doesn’t offer stable high performance features, the CS16EE brings its own job dispatcher granting a reliable output service.
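
The whitepaper does not describe how the job dispatcher works internally. As a purely illustrative sketch of one way a dispatcher could make output generation more reliable, the following code retries a job on the next rendering instance when one fails. The function `dispatch`, the retry limit and the round-robin choice are all assumptions.

```python
from typing import Callable, List, Optional

Worker = Callable[[str], str]


def dispatch(job: str, workers: List[Worker], max_attempts: int = 3) -> Optional[str]:
    # Without any rendering instance there is nothing to dispatch to.
    if not workers:
        return None
    # Try workers round-robin until one renders the job or attempts run out.
    for attempt in range(max_attempts):
        worker = workers[attempt % len(workers)]
        try:
            return worker(job)
        except RuntimeError:
            continue  # this instance failed; try the next one
    return None


def crashing(job: str) -> str:
    # Stand-in for a rendering instance that fails.
    raise RuntimeError("renderer crashed")


def stable(job: str) -> str:
    # Stand-in for a rendering instance that succeeds.
    return f"{job}.pdf"


output = dispatch("catalogue", [crashing, stable])
```

Here the first instance fails and the second one delivers the result, so the caller sees a successful job despite the crash.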

7.2 Monitoring and recovery

The primary objective of monitoring the application’s components is to ensure that the application itself is always available. The application is comprised of several base systems such as Neo4j, Elasticsearch, Swift, Cassandra, Pentaho, etc.

To ensure such a state of availability, it is necessary to detect if or when there is a problem with the application or its components, and to resolve it quickly.

The core tasks involved here are:

• Stopping a failing or failed node and then starting it.
• Stopping a failing or failed component instance, and then starting it.

The use of a monitoring tool helps greatly in detecting such occurrences, thereby freeing up time for the customer to resolve these problems quickly. The monitoring of CS16EE is handled by the monitoring tool Nagios® Core™, as depicted in the diagram below:

Nagios Core is an Open Source system and network monitoring application. It serves as the basic event scheduler, event processor, and alert manager for elements that are monitored – alerting you when things go bad and when they get better. It features several APIs that are used to extend its capabilities to perform additional tasks, is implemented as a daemon written in C for performance reasons, and is designed to run natively on Linux/*nix systems.

Each Node or Server within the systems architecture will be monitored for availability. In addition, every software component will be directly monitored by a Nagios Agent. With this setup, nodes as well as system components are under direct surveillance.

7.2.1 Component management

While Nagios is solely responsible for monitoring the application components, the management of the components themselves (i.e. starting, stopping and restarting them) is handled by Puppet.

Puppet is a configuration management system that allows you to automate every step of the software delivery process, from provisioning of physical and virtual machines to orchestration and reporting; from early-stage code development through testing, production release and updates.

7.2.2 System deployment and maintenance

CONTENTSERV uses Docker to ship each component in the environment. Each Node will typically have a Puppet / Nagios agent and one Docker container. For example, an application server node will contain one Docker container which in turn will contain the application server software with all its dependencies. It will also contain a Nagios agent to monitor the application server Docker container. If the application server goes offline, the Nagios agent will inform the Nagios Master, which in turn will call the Puppet Agent to relaunch the Docker container.
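
The recovery chain described above (the agent detects the outage, the master is informed, the Puppet agent relaunches the container) can be illustrated with a small simulation. This is only a minimal sketch in Python and not actual Nagios, Puppet or Docker code: all class and function names (Container, PuppetAgent, NagiosMaster, NagiosAgent, simulate_outage) are hypothetical stand-ins for the real tools.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical stand-in for the Docker container on a node."""
    name: str
    running: bool = True


@dataclass
class PuppetAgent:
    """Hypothetical stand-in for the Puppet agent managing the container."""
    container: Container
    relaunches: int = 0

    def relaunch(self) -> None:
        # Bring the container back up and count the recovery action.
        self.container.running = True
        self.relaunches += 1


@dataclass
class NagiosMaster:
    """Hypothetical stand-in for the Nagios Master: records alerts, triggers recovery."""
    puppet_agents: dict = field(default_factory=dict)
    alerts: list = field(default_factory=list)

    def notify_down(self, node: str) -> None:
        self.alerts.append(node)
        self.puppet_agents[node].relaunch()


@dataclass
class NagiosAgent:
    """Hypothetical stand-in for the Nagios agent watching one container."""
    node: str
    container: Container
    master: NagiosMaster

    def check(self) -> bool:
        """Return True if the container is healthy; otherwise alert the master."""
        if self.container.running:
            return True
        self.master.notify_down(self.node)
        return False


def simulate_outage():
    """Take the application server offline once and run a single check."""
    container = Container("app-server")
    master = NagiosMaster()
    master.puppet_agents["node-1"] = PuppetAgent(container)
    agent = NagiosAgent("node-1", container, master)

    container.running = False   # the application server goes offline
    healthy = agent.check()     # agent detects it and informs the master
    return healthy, container.running, master.alerts
```

In this sketch the check itself reports the component as unhealthy, while the container is already running again by the time the call returns, which mirrors the detect, alert and relaunch sequence in the text.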

Additionally, CS16EE provides an HTML5-based Administration Dashboard, which integrates with both the Nagios Master and the Puppet Master. The dashboard displays the status of all applications and provides a mechanism for manual control of every component in the application, i.e. all start-up and shut-down activities.

7.3 Disaster management

7.3.1 Diagnosis

The status of each component server will be made available via the Administration Dashboard. All component servers maintain trace and error logs daily. These can be made available (via the Administration Dashboard) to further analyse any component failures.

7.3.2 Recovery

The Administration Dashboard provides the interface for the start-up and shut-down of all component servers. For an application failure without data loss, it is generally sufficient to restart the affected component.

In the case of data loss, reliable recovery is ensured by a backup of the primary data store, the Oracle DB. To make system recovery faster, the secondary data stores are backed up as well. Any time gap between the various backups is closed by a delta reconstruction.
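
The whitepaper does not describe how the delta reconstruction works internally. One plausible reading, sketched below in Python, is that changes recorded in the primary store after the time of the secondary backup are replayed onto the restored secondary store. The Change record, the integer timestamps and the function reconstruct_delta are all assumptions made for illustration only.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical change record taken from the primary data store."""
    timestamp: int   # assumed logical time of the write
    key: str
    value: str


def reconstruct_delta(secondary_backup: dict, backup_time: int,
                      primary_changes: list) -> dict:
    """Replay primary-store changes that are newer than the secondary backup.

    The backup itself is left untouched; a new, updated mapping is returned.
    """
    restored = dict(secondary_backup)
    for change in sorted(primary_changes, key=lambda c: c.timestamp):
        if change.timestamp > backup_time:
            restored[change.key] = change.value
    return restored
```

For example, a secondary backup taken at time 10 that is restored alongside primary changes at times 5, 12 and 15 would only receive the changes from times 12 and 15, since the earlier one is already contained in the backup.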

8 High performance and high availability

For a high-performance software solution, each component must scale horizontally without losing reliability or data safety. In addition, a highly available environment requires the solution to be distributed across more than one data centre.

The CS16EE Application supports both simultaneously.

8.1 Horizontal scaling

The term “horizontal scaling” refers to distributing each system component across a variable number of nodes, where each node adds performance to the component. Scaling through additional nodes is almost linear and therefore provides enough performance for high usage and high data loads.

The next figure shows a typical setup in which Tomcat is distributed across five nodes. Oracle is based on the Oracle RAC solution, the secondary data stores are implemented on three nodes each, Swift runs on three nodes to separate the object and storage tiers, and printing and reporting run on two nodes each.

In a virtualised environment we recommend using at least three servers to set up the nodes for Tomcat, the secondary stores and reporting. Two additional servers are recommended for the primary stores. Since InDesign needs special (Apple) hardware, two Apple servers are necessary here.

8.2 High availability

With the setup described in the scenario above, a certain level of availability is achieved through multiple nodes on different hardware. Failover is ensured for corrupted nodes and for hardware defects on single servers. Together with the built-in monitoring, a high degree of availability is achieved within one data centre.

The simplest next step towards higher availability is to replicate CS16EE across two data centres, with one data centre acting as a hot standby.

8.3 Deployment options

To ensure high performance for a system used in various locations, for example on different continents, it makes sense to use a distributed setup with regional orientation.

This minimizes the network latency that always becomes an issue with long-range access, for example:

• Typical latencies within Europe and within North America are in the range of 10-35 milliseconds.
• A transatlantic connection will be in the range of 60-80 milliseconds.
• The longest distances, such as U.K. to Australia and New Zealand, will be in the range of 300-400 milliseconds.
• For China, latency is even higher, as the effect of geographic distance is compounded by strict firewalls and regulated entry points, with typical values well beyond 500 milliseconds.
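
To show why these figures matter, the following Python sketch multiplies the quoted latency ranges by the number of requests a user interaction issues one after another. The value of 20 sequential requests used in the example and the function name added_delay_ms are assumptions for illustration, and the quoted latencies are treated as the delay per request.

```python
# Latency ranges in milliseconds, taken from the list above.
LATENCY_MS = {
    "within Europe / North America": (10, 35),
    "transatlantic": (60, 80),
    "U.K. to Australia / New Zealand": (300, 400),
}


def added_delay_ms(sequential_requests: int, latency_range: tuple) -> tuple:
    """Return the (best, worst) network delay for requests issued one after another."""
    low, high = latency_range
    return (sequential_requests * low, sequential_requests * high)


if __name__ == "__main__":
    for route, latency in LATENCY_MS.items():
        best, worst = added_delay_ms(20, latency)
        print(f"{route}: {best}-{worst} ms for 20 sequential requests")
```

Under these assumptions, an interaction that is barely noticeable within Europe (200 to 700 milliseconds of network delay) accumulates 6 to 8 seconds between the U.K. and Australia, which is the motivation for regional staging areas.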

In addition to the inevitable latency issue, bandwidth will become another bottleneck. This is why setting up local export staging areas becomes imperative for any international high-performance delivery. Connecting to local data sources might work much better via a local import staging area. Highly distributed teams may even need to work on local replicas of the global data repository in order to work efficiently and comfortably.

With this in mind, the ideal setup depends on a balance between infrastructure cost and the need for fast local read or write access:

• A central data repository caters best for central PIM deployments, where exports are syndicated to Content Delivery Networks or regionally hosted servers.
• A central data repository with local export staging areas can deliver content in real time locally, while synchronizing with the global data centre within seconds to minutes.
• In a Multi Data Centre Setup all system components are distributed on local data centres to provide full functionality for local users. In addition, staging areas for export and import can further boost the system performance.

8.3.1 Single Data Centre Setup with local consumers and data sources

The main idea behind this setup is to grant fast access to data for consumers in a confined geographical area.

The operating cost for a single data centre setup is lower, as one central data repository avoids any form of synchronization and the conflict resolution problems that arise from concurrent writes to the same dataset or attribute.

A single data centre setup might also be favourable when the vast majority of users work in a central location, while only a few users occasionally access the system remotely, e.g. for translating content.

8.3.2 Central Data Repository, local import / export staging

For a scenario where local consumers such as websites, online shops or data displays are served, a local staging service is established. The staging service provides real-time data to the consumers. Data updates are provided by the global data repository through a message bus system.
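
The pattern can be sketched as a simple publish/subscribe arrangement. The Python code below is an in-process illustration only and not the CS16EE message bus: the classes MessageBus and ExportStaging, the topic name "product-updates" and the message layout are hypothetical.

```python
from collections import defaultdict


class MessageBus:
    """Hypothetical in-process stand-in for the message bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler registered for the topic.
        for handler in self._subscribers[topic]:
            handler(message)


class ExportStaging:
    """Hypothetical local staging service serving consumers from its own copy."""

    def __init__(self, region, bus):
        self.region = region
        self.data = {}
        bus.subscribe("product-updates", self.apply_update)

    def apply_update(self, message):
        # Store the update pushed by the global data repository.
        self.data[message["id"]] = message["payload"]

    def read(self, item_id):
        # Consumers read locally; no call to the global repository is made.
        return self.data.get(item_id)
```

In this sketch, the global data repository would call publish once per update, and each regional staging service answers consumer reads from its local copy.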

Distant data sources make use of a local import staging area. In order to avoid read access to the core system while validating imported data against the data model or while resolving references to data in the core system, the data model and the data are replicated to each import staging area.
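
A minimal Python sketch of this idea follows. It assumes, purely for illustration, that the replicated data model is a mapping from attribute names to expected types and that the replicated data is reduced to a set of known record IDs; the class ImportStagingArea and the record layout are hypothetical.

```python
class ImportStagingArea:
    """Hypothetical import staging area holding local replicas."""

    def __init__(self, data_model: dict, known_ids: set):
        # Local copies of the data model and of the core data (IDs only here).
        self.data_model = dict(data_model)
        self.known_ids = set(known_ids)

    def validate(self, record: dict) -> list:
        """Validate a record against the local replicas only.

        Returns a list of error messages; an empty list means the record
        passed validation without any read access to the core system.
        """
        errors = []
        for attribute, value in record["attributes"].items():
            expected = self.data_model.get(attribute)
            if expected is None:
                errors.append(f"unknown attribute: {attribute}")
            elif not isinstance(value, expected):
                errors.append(f"wrong type for attribute: {attribute}")
        for ref in record.get("references", []):
            if ref not in self.known_ids:
                errors.append(f"unresolved reference: {ref}")
        return errors
```

Because both checks run against local replicas, validation speed in this sketch is independent of the latency to the core system; only accepted records would need to be forwarded.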

Local data centres can cater to imports, exports or both import and export in one data centre.

8.3.3 Multi-Data Centre setup

Where distant users interact directly with the CS16EE system, it makes sense to set up completely functional local instances of the system.

In this case, local users access CS16EE via a fully functional application instance. Data synchronization is performed by syndication through the message bus component.

In order to avoid performance bottlenecks, the global data repository typically does not provide services to users, but only acts as an aggregation platform for all local data centres.