Cofax Scalability Document Version 1.0
Scaling Cofax in General
The scalability of Cofax is directly related to the system software, hardware and network environment in which it is installed.
The analysis did not find any scalability bottlenecks in the Cofax design or code itself. It confirmed that Cofax was designed to scale.
This process did result in a significant reliability and performance improvement to our server hosting facility and database setup.
Cofax was developed to allow the rapid deployment of new hardware resources according to demand.
Cofax is not dependent on any one system installation architecture. It scales by reconfiguring the environment to meet current needs.
Scaling Cofax at PNI
Description of Installation v2.0.atPNI
Strengths of Installation v2.0.atPNI
Weaknesses of Installation v2.0.atPNI
Description of Installation v3.0.atPNI
Strengths of Installation v3.0.atPNI
Addressing concerns about Installation v3.0.atPNI
Installation v2.0.atPNI
[Diagram: one File Server, four HTTP Servers, six Cofax Servers, one Database Server, and four File Storage units.]
Strengths of Installation v2.0.atPNI
HTTP Servers are distributed.
Cofax Application Servers are distributed.
Load is balanced between multiple servers.
Load can be distributed as necessary.
Once the hardware and OS are in place, additional servers can be manually configured and running in minutes.
Very suitable for an ISP that can automatically add/remove servers on the fly.
Weaknesses of Installation v2.0.atPNI
Database server is a single point of failure.
– On the serving side, and
– On the updating side.
There is a limit to how much hardware we can add to the single database server.
– The incremental performance gain from adding more computing resources (CPUs, memory, disk space) to the single server starts to diminish at some point.
A single machine, no matter how powerful, can still fail from common types of problems (locked data in a table, runaway processes, memory leaks, etc.).
Weaknesses of Installation v2.0.atPNI (continued)
There is a practical limit to how much optimization we can do on a single machine.
– There is also a limit to how much optimization people are willing to do.
These optimizations change with time.
– When the server setup changes.
– When the usage patterns change.
Installation v3.0.atPNI
[Diagram: one File Server, four HTTP Servers, six Cofax Servers, four Serving Database Servers, two Editing Database Servers, and four File Storage units.]
Strengths of Installation v3.0.atPNI
The model is already tested.
Only reasonable optimizations are required.
Serving database is replicated across multiple physical servers.
There is no single point of failure on the serving side.
Data transformation is isolated from data retrieval.
Implementing the Distributed Database
Requires no design changes to the Cofax framework.
Requires no changes to the Java code or software application.
Requires configuration changes only.
Requires the addition of new hardware resources: database servers, Tomcat servers, web servers.
Upgrading from Database model to Distributed Database model
Separation of “editing” and “serving” databases.
Front-end database can be replicated across multiple physical servers.
Additional databases can be brought online as needed.
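The split between editing and serving databases can be pictured as a small routing rule: writes go to the editing database, reads rotate across the serving replicas. The sketch below is illustrative only. The class name `DatabaseRouter`, its methods, and the host names are hypothetical and are not part of Cofax, which according to the slides handles this through configuration rather than code.

```python
class DatabaseRouter:
    """Hypothetical router: writes go to the editing database,
    reads are spread round-robin over the serving replicas."""

    def __init__(self, editing_host, serving_hosts):
        if not serving_hosts:
            raise ValueError("at least one serving host is required")
        self.editing_host = editing_host
        self.serving_hosts = list(serving_hosts)
        self._next = 0  # counter used for round-robin selection

    def host_for(self, operation):
        """Return the host that should handle a 'read' or a 'write'."""
        if operation == "write":
            return self.editing_host
        if operation == "read":
            host = self.serving_hosts[self._next % len(self.serving_hosts)]
            self._next += 1
            return host
        raise ValueError(f"unknown operation: {operation!r}")

    def add_serving_host(self, host):
        """Bring an additional serving database online."""
        self.serving_hosts.append(host)


if __name__ == "__main__":
    router = DatabaseRouter("editing-1", ["serving-1", "serving-2"])
    print(router.host_for("write"))
    print([router.host_for("read") for _ in range(4)])
    router.add_serving_host("serving-3")
    print([router.host_for("read") for _ in range(3)])
```

Adding a replica only changes the list the router rotates over, which mirrors the point that new serving databases can be brought online without touching application logic.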
A proven model that is able to serve very high traffic
Additional database servers can be added to handle growing web site traffic.
– E.g. 100 million or more dynamic page views a day.
Can house large amounts of content.
– Disk storage continues to become cheaper.
– E.g. 10 years’ worth of content from 100 daily newspapers.
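To put the traffic figure in perspective, 100 million page views a day can be converted to an average request rate and divided across serving databases. This is back-of-the-envelope arithmetic, not a measurement: it assumes traffic is spread evenly over the day and over the servers, and the function names are hypothetical. Real traffic peaks well above the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(page_views_per_day):
    """Average request rate, assuming traffic is uniform over the day."""
    return page_views_per_day / SECONDS_PER_DAY


def per_server_rate(page_views_per_day, serving_servers):
    """Average rate each server sees if load is split evenly (an assumption)."""
    if serving_servers < 1:
        raise ValueError("serving_servers must be at least 1")
    return average_requests_per_second(page_views_per_day) / serving_servers


if __name__ == "__main__":
    views = 100_000_000  # the slide's example figure
    for servers in (1, 2, 4, 8):
        rate = per_server_rate(views, servers)
        print(f"{servers} serving server(s): {rate:.1f} page views/s each")
```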
Replication Issues Addressed
The database replication model is based on two observations:
The number of “reads” from the data store outweighs the number of “writes”.
– E.g. A data store that has 10 million records read from it in an hour is likely to have no more than 10 thousand records written to it.
The number of records being added or deleted outweighs the number of existing records being updated.
– E.g. A data store that has 10 thousand new records added to it in an hour is likely to have between one hundred and two thousand existing records updated in that time.
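The two examples above reduce to simple ratios. The short calculation below works them out using only the slide's illustrative figures. The function names are hypothetical, and the numbers are the deck's examples, not measured values.

```python
def reads_per_write(reads, writes):
    """How many records are read for each record written."""
    return reads / writes


def updates_per_new_record(new_records, updated_records):
    """Existing-record updates relative to newly added records."""
    return updated_records / new_records


if __name__ == "__main__":
    # 10 million reads versus at most 10 thousand writes in an hour.
    print(reads_per_write(10_000_000, 10_000))
    # 10 thousand new records versus 100 to 2,000 updates in an hour.
    print(updates_per_new_record(10_000, 100))
    print(updates_per_new_record(10_000, 2_000))
```

With reads outnumbering writes by roughly a thousand to one, replicating the read side is where additional servers pay off, and the small share of updates keeps the volume of changes to replicate low.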
Latency Issues Addressed
Updates from the Editing databases to the Serving databases are transactional. As changes to tables on the editing database occur, those transactions are replicated on the serving machines.
The transactional model means there is almost no latency between the editing and serving machines.
Data is de-normalized and optimized for fast serving on the Editing databases. These fast-access tables are sent to the Serving databases.
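The idea of replaying each editing transaction on the serving machines can be sketched with an in-memory model. This is a toy illustration under stated assumptions, not how Cofax or any particular database product implements replication. The names `EditingDatabase`, `ServingReplica`, and `apply_transaction`, and the key/value row format, are all hypothetical. In a real installation the database's own replication feature would do this work.

```python
def apply_transaction(rows, transaction):
    """Return a new row dict with every operation applied, or raise
    without changing anything (all-or-nothing)."""
    staged = dict(rows)  # work on a copy so a failure leaves rows intact
    for op, key, value in transaction:
        if op == "put":
            staged[key] = value
        elif op == "delete":
            staged.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return staged


class ServingReplica:
    """Read-only copy that applies whole transactions in commit order."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.applied = 0  # number of logged transactions applied so far

    def apply(self, transaction):
        self.rows = apply_transaction(self.rows, transaction)
        self.applied += 1


class EditingDatabase:
    """Accepts writes, logs each committed transaction, and replays
    the log on every serving replica."""

    def __init__(self, replicas):
        self.rows = {}
        self.log = []
        self.replicas = list(replicas)

    def commit(self, transaction):
        transaction = list(transaction)
        # Validate and apply locally first; an invalid transaction
        # raises here and is never logged or replicated.
        self.rows = apply_transaction(self.rows, transaction)
        self.log.append(transaction)
        self.replicate()

    def replicate(self):
        # Each replica catches up on the transactions it has not seen.
        for replica in self.replicas:
            for transaction in self.log[replica.applied:]:
                replica.apply(transaction)


if __name__ == "__main__":
    replicas = [ServingReplica("serving-1"), ServingReplica("serving-2")]
    editing = EditingDatabase(replicas)
    editing.commit([("put", "article:1", "Headline"),
                    ("put", "article:2", "Second story")])
    editing.commit([("delete", "article:2", None)])
    for replica in replicas:
        print(replica.name, replica.rows)
```

Because replicas apply complete transactions in order, a reader never sees a half-applied edit, and the delay between editing and serving is just the time needed to ship and apply one transaction.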
Conclusion
Because of its flexible framework, Cofax can scale to meet any demand.
Scaling requires only the addition of hardware resources and minor configuration changes.
The current installation changes took only a few days to implement and bring online.