Technological insights behind Clusterpoint database


Upload: clusterpoint

Post on 18-Jul-2015


Page 1: Technological insights behind Clusterpoint database

Your productivity tool that accelerates web / mobile application software development and the time-to-market of your products

© Clusterpoint Ltd.

The Swiss-army Knife of a Database Developer

Page 2: Technological insights behind Clusterpoint database

Clusterpoint is an operational database with high-speed ACID-compliant database transactions, built-in fast full-text search and endless scale-out ability.

The platform delivers reliable distributed transactions at a level of performance previously available only with SQL technology, together with ultra-fast web and mobile UI responsiveness and relevance ranking of results in Big Data content search.

Diagram labels: XML, JSON; OLTP document database; instant search; GB ► TB ► PB

Use it to safely manage industry-standard XML / JSON data at top speed and at scale.

Page 3: Technological insights behind Clusterpoint database

Distributed XML / JSON document store with linear scale-out ability and fault-tolerant replication

RELATIONAL SQL DATABASE: expensive legacy hardware running complex SQL application software code. Fragmented indexes, complex schemas, rigid data structure, scales up.

CLUSTERPOINT DATABASE: low-cost rack & stack commodity hardware running application software code that is scalable from day one. Full content index, no schemas, flexible data structure, scales out.

Clusterpoint simplifies your database model and application software code, generating significant TCO savings over the lifetime of your IT systems.

Page 4: Technological insights behind Clusterpoint database

What is inside our software product?

All-in-one database server platform with REST API & management GUI

Distributed high-speed OLTP document-oriented database with built-in full-text indexing and search to manage structured & unstructured Big Data

1. Distributed document store database (XML, JSON)
2. Scalable built-in search with Big Data indexing (ranking index)
3. Fault-tolerant cluster with replication

High-performance distributed transactions architecture, ACID compliant

Diagram labels: open API; C/C++

Page 5: Technological insights behind Clusterpoint database

Strong lead over competition in benchmarks (TPS)

Benchmark | Clusterpoint | MongoDB | MySQL
Free text data ingestion | 1800 | 27 | 21
Numeric data ingestion | 24k | 25k | 16k
Numeric data range search | 295 | 280 | 170
Free text search | 2300 | 817 | 140
Transactions (ACID, one server) | 3500 | N/A | 3655
Transactions (ACID, 30-node cluster) | 49 000 | N/A | N/A

Page 6: Technological insights behind Clusterpoint database


Our customers running 24/7 web and cloud services (from 2006)

Page 7: Technological insights behind Clusterpoint database

The Team

Founder: Gints Ernestsons
CTO: Jurgis Orups
Direct Sales Director: Martins Berzins
Infrastructure S/w Architect: Janis Sermulins
CEO: Zigmars Rasscevskis
Partner Sales Director: Peteris Janovskis

Team member backgrounds (the transcript does not preserve which profile belongs to which person):

• Developer, tech entrepreneur with 25 years of experience in IT product design and services; expert in SQL, NoSQL, BI and Web search
• 8 years at Google in the role of Technical Lead for the infrastructure core software development team
• 9 years of leadership of the Clusterpoint core software engineering team; expert in C/C++, web-scale search and databases
• 4 years at Hewlett-Packard; 5 years as Global Sales Director at SAF Tehnika
• MIT alumnus, internship at Intel Research (USA), 5 years at Google (Zurich, Switzerland) & 2 years at TietoEnator (Finland)
• 12 years at Oracle; Alliance & Channel Director, Central & East Europe; Regional Sales Director

Page 8: Technological insights behind Clusterpoint database

Clusterpoint blends together SQL, NoSQL and SEARCH benefits into a single database server software platform with a single API

Enjoy these features out-of-the-box:

• high-performance distributed ACID transactions, including essential SQL support
• simplicity of all-in-one: database, search, high-availability replication, sharding
• fast productivity with a format-independent, schema-less XML/JSON database model
• cost-efficient scale-out ability by clustering commodity rack & stack hardware
• great end-user experience from instant, relevant text search and real-time analytics

Page 9: Technological insights behind Clusterpoint database

Why is Clusterpoint the Swiss-army knife of a database developer?

• simplicity of use (all features inside one API)

• universal usability (handles mixed json/xml/object/text data)

• fast productivity at minimal cost (top speed on commodity h/w)

• endless scale out ability from multiplying tools (elastic clustering)

• great end user experience (fast relevant search at ms latency)

Page 10: Technological insights behind Clusterpoint database

Cloud-ready scalable database delivers cost-efficiency for Big Data

• FREE license, no data cap, s/w runs on Linux, Windows & MacOS/BSD
• 24/7 managed service for on-premise use & cloud DBAAS subscribers
• (new) simple one-click replication of a customer database to the Cloud
• easy capacity to process very large data sets on the Clusterpoint Cloud
• no need to buy extra hardware for fast prototyping / development

Diagram labels: XML, JSON, HTML, MIME; Clusterpoint Cloud Service

Page 11: Technological insights behind Clusterpoint database

Efficiently manage all your business data using Clusterpoint database

Business information mostly lives in documents: web pages, XML, MS Office docs, invoices, contracts, purchase orders, customer profiles, application source code, product descriptions, payment orders, sales proposals, user profiles, databases, email & messages, session tickets, log records

Clusterpoint Server: FAST, SCALABLE, INSTANTLY SEARCHABLE DOCUMENT-ORIENTED OPERATIONAL DATABASE (no fragmentation, no complex SQL tables, no joins)

RANKING INDEX: date, time, chars, full-text, numeric, tags, links, geospatial

Data in the database (XML / JSON): ORDERS, CONTRACTS, PAYMENTS, CUSTOMERS, INVOICES, MAIL & MSG., SALES DOCS, LOG & AUDIT, PRODUCTS, USERS & APPS SESSIONS

Client side (XML / JSON): the customer business application (web, mobile or middleware application server) that interacts with end users via an online GUI

No vendor lock-in: open data format & API. Secure ACID transactions. Full-text search. Essential SQL.

Page 12: Technological insights behind Clusterpoint database

All your data, indexes and replicas in one IT software system deliver solid security

• HIGH-PERFORMANCE OLTP DB, BIG DATA QUERIES & ANALYTICS, ESSENTIAL SQL SUPPORT
• ENTERPRISE SEARCH, INCL. FULL TEXT, FACETS, SNIPPETS, STEMMING, GEOSPATIAL, COLLATION ETC.
• DISTRIBUTED, FAULT-TOLERANT, SCALABLE XML/JSON DATA STORAGE

ALL YOUR DATA IN ONE SECURE, SCALABLE, INSTANTLY SEARCHABLE DATABASE SOFTWARE PLATFORM (XML, JSON, ONE API)

Page 13: Technological insights behind Clusterpoint database

Clusterpoint database is indexed for FREE TEXT SEARCH in all content

Relational database indexing model:
Multiple fragmented indexes with selected index keys, managed by complex relations

COMPLEX SQL QUERIES: hard to learn, unforgiving syntax and performance problems at scale
SQL query: tens of seconds

Clusterpoint database indexing model (RANKED INDEX):
One single structured and unstructured data index over all text, date, numeric and data markup content

SIMPLE AND USER-FRIENDLY QUERIES: use fast and relevant web-style free text search, essential SQL and analytics
Our query: milliseconds

Page 14: Technological insights behind Clusterpoint database

Think about our index as a “giant tree” where all database content is organized into small parts (“leaves”) and ranked by relative “weights”

Clusterpoint Index™: ranked by the customer's own relevance rules

Indexed items: words, strings, numbers, dates, names, tags, values, relations

The unique RANKING INDEX of the Clusterpoint database is an inverted graph in which every data item carries relevance rules defined by the customer

Diagram labels: Clusterpoint database (XML & JSON); distributed storage of all loaded documents; RAM
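To make the idea of one index over all content more concrete, here is a minimal sketch of an inverted index built over every field of a few JSON-like documents. It is a toy illustration only and not Clusterpoint's implementation: the function names `tokenize`, `build_index` and `search`, the tokenization rule and the sample documents are all assumptions introduced here.

```python
import re
from collections import defaultdict

def tokenize(value):
    """Split any field value into lowercase word/number tokens."""
    return re.findall(r"[a-z0-9]+", str(value).lower())

def build_index(docs):
    """Map each token to the set of (doc_id, field) pairs where it occurs."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field, value in doc.items():
            for token in tokenize(value):
                index[token].add((doc_id, field))
    return index

def search(index, query):
    """Return ids of documents that contain every query token in any field."""
    result = None
    for token in tokenize(query):
        ids = {doc_id for doc_id, _ in index.get(token, set())}
        result = ids if result is None else result & ids
    return sorted(result or [])

# Hypothetical sample documents: text and numeric fields share one index.
docs = {
    1: {"title": "Java developer", "area": "London", "salary": 4000},
    2: {"title": "Python developer", "area": "Riga", "salary": 3500},
}
index = build_index(docs)
print(search(index, "developer london"))  # [1]
print(search(index, "3500"))              # [2]
```

Because the index also records the field in which each token occurs, a ranking layer could weight hits by tag, which is the idea the following slides describe.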

Page 15: Technological insights behind Clusterpoint database

Programmable relative ranking that is meaningful to humans delivers superior relevance sorting and grouping for free text search results

Two problems solved
RANKING INDEX: REAL-TIME BIG DATA SEARCH in milliseconds

Ranking for your database structure: applied as relative weights in % for your XML / JSON tags (organizes relevance rules for search hits)
Example weights: Title 100%, Text 80%, Comments 10%

Ranking for your database documents: applied as document rating (your own unique formula, time-stamp, popularity, vote etc.); an integer 0 ... 2^32 (used when tag weightings are the same)

Ranking for query terms: to override policy-defined default ranking rules (term boosting)
Query: word1 ^100% word2 ^+30% word3 ^-20%

Page 16: Technological insights behind Clusterpoint database

Ranking rules are precisely customizable by your application needs

Your data items for search (XML / JSON tags in a database): Address, Company, Email, Category, Product

Each tag has its own weighting scale running from 0% (least relevant) through 50% to 100% (most relevant)

Documents with search term hits in tags with higher weightings are sorted up front

Page 17: Technological insights behind Clusterpoint database

Ranking enables search results to match the relevant human intent

Query: [ w1 w2 w3 ]

The paged result set is sorted, grouped and ordered by the customizable RANKING RULES for the search results that best match the database query context. Results are sorted by best relevance in three cascading steps: tag weighting, document rating, optional sorting.

Tag weighting:
• Top group of results (100%) has all w1, w2, w3 hits in the Title tag
• Next group of results (80%) has all w1, w2, w3 hits in the Text tag
• The least relevant group (10%) has all hits in the Comments tag

Document rating: if a document rating is used as an extra ranking dimension (for example, the time stamp of a news article could serve as the document rating), its value is used to sub-group the results within the same % tag weighting relevancy group, creating cascading sort orders for the entire result set.

Optional sorting: in the Clusterpoint architecture you can optionally define additional sort orders (e.g., by votes, by alphabetic value, by click-price etc.). Whenever tag weighting and document rating results fall into the same sub-group, they are sorted again by the next cascading sort rule into even more human-friendly sub-groups.
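The cascading order described on this slide (tag weighting first, then document rating, then an optional sort rule) can be sketched as a sort on a composite key. This is only an illustration under stated assumptions: the document layout, the `rating` field, the function names and the choice of `id` as the optional third sort rule are hypothetical; only the Title / Text / Comments weights of 100% / 80% / 10% come from the slides.

```python
# Tag weights in %, as shown on the slides.
TAG_WEIGHTS = {"title": 100, "text": 80, "comments": 10}

def best_tag_weight(doc, terms):
    """Highest weight among tags that contain all query terms, else 0."""
    weights = [
        weight
        for tag, weight in TAG_WEIGHTS.items()
        if all(term in doc.get(tag, "").lower().split() for term in terms)
    ]
    return max(weights, default=0)

def rank(docs, query):
    """Sort hits by tag weight, then document rating, then id (assumed rule)."""
    terms = query.lower().split()
    hits = [d for d in docs if best_tag_weight(d, terms) > 0]
    return sorted(
        hits,
        key=lambda d: (-best_tag_weight(d, terms), -d["rating"], d["id"]),
    )

# Hypothetical documents; "rating" stands in for the document rating.
docs = [
    {"id": "a", "title": "big data search", "text": "", "comments": "", "rating": 5},
    {"id": "b", "title": "news", "text": "big data search engine", "comments": "", "rating": 9},
    {"id": "c", "title": "fast big data search", "text": "", "comments": "", "rating": 7},
    {"id": "d", "title": "other", "text": "", "comments": "big data search", "rating": 99},
    {"id": "e", "title": "big data search tips", "text": "", "comments": "", "rating": 7},
]
print([d["id"] for d in rank(docs, "big data search")])  # ['c', 'e', 'a', 'b', 'd']
```

Document `d` has the highest rating but lands last because its hits are in the lowest-weighted tag; the rating only orders documents inside the same tag-weight group.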

Page 18: Technological insights behind Clusterpoint database

Ranking solves the Big Data information overload problem for our customers while reducing the complexity of application software development (coding effort)

Replace formalized data sorting statements in the SQL query, designed for expert users and for machine processing:

SQL SELECT .... WHERE .... LIKE ....
GROUP BY .... ORDER BY .... JOIN ....

with a web-style ad hoc search query that uses simple free-text terms in the Clusterpoint API SEARCH command:

any text or ”any text” or any tex*

You can search the Clusterpoint database with simple text search, the same way everyone is used to searching the Web.

FULL CONTENT RANKING INDEX: sorted and grouped results in output pages, matching the database content by relevance ranking

Instant and relevant results on the 1st page! Great customer satisfaction!

Page 19: Technological insights behind Clusterpoint database

Constant and predictable query latency enables real-time Big Data search and analytics

Database size MB ► GB ► TB ► PB: milliseconds for a CLUSTERPOINT query, minutes ... hours for an SQL query

The FULL CONTENT RANKING INDEX lets you scale out your XML / JSON database to billions of documents while keeping response times very low for web and mobile apps

Page 20: Technological insights behind Clusterpoint database

Query your database for instant and relevant answers without SQL complexity and enjoy the world’s most simply and efficiently searchable database

Awesome end-user experience: use web-style free-format text search terms to query the Clusterpoint database and instantly get the most relevant results

No need to learn a special query language syntax; the ranking index takes care of sorting data for result relevance and web paging

Developers can take advantage of combined full-text and SQL-like structured data queries using multiple SEARCH API options

Examples of free text, structured & combined queries delivering results in milliseconds:

Free text:
java developer London

Phrase:
“John Smith”

Wildcards:
Joh* Smi* or “John Smi*”

Pattern match in strings:
John Sm?th John Sm[iy]th

In XML database structure (SQL-like):
java developer <salary>3500..4500</salary> <area>London</area>
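The wildcard, pattern and range examples on this slide look like shell-style patterns, so their likely meaning can be explored with Python's standard `fnmatch` module. This is a sketch under the assumption that `*`, `?` and `[iy]` behave as in shell globbing and that the `3500..4500` range is inclusive; the slides do not confirm Clusterpoint's exact matching rules, and the helper names `matching` and `in_range` are made up for this illustration.

```python
from fnmatch import fnmatchcase

names = ["Smith", "Smyth", "Smath", "Smithson"]

def matching(pattern, words):
    """Return the words that match a shell-style pattern in full."""
    return [w for w in words if fnmatchcase(w, pattern)]

def in_range(spec, value):
    """Check a value against a 'low..high' range (bounds assumed inclusive)."""
    low, high = (float(part) for part in spec.split(".."))
    return low <= value <= high

print(matching("Sm?th", names))     # ['Smith', 'Smyth', 'Smath']
print(matching("Sm[iy]th", names))  # ['Smith', 'Smyth']
print(matching("Smi*", names))      # ['Smith', 'Smithson']
print(in_range("3500..4500", 4000)) # True
```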

Page 21: Technological insights behind Clusterpoint database

What customer benefits does Clusterpoint database content ranking deliver?

• Sorts RELEVANT data first, critical for ultra-fast, productive Big Data access
• Groups together valuable information for insight, navigation & analysis
• Reduces server-side computing costs by eliminating excessive data sorting
• Makes databases instantly responsive on web and mobile GUI screens
• Organizes information by natural language driven (human) context rules
• Enables high-performance, natively scalable application programming

Page 22: Technological insights behind Clusterpoint database

Develop your application software code to be scalable from day one, and the same code will run efficiently as your database volume and usage escalate

Your web or mobile application software code will scale for any usage (write once)

Chart: OPEX, TCO over the database life-cycle (TEST, YEAR 1, YEAR 2, YEAR 3, YEAR 4, YEAR N); save > 80%

Page 23: Technological insights behind Clusterpoint database

Why pay extra for scalability, high availability and fault tolerance? All included!

• OUT-OF-THE-BOX SCALABILITY AND LOAD-BALANCING
• OUT-OF-THE-BOX FAULT-TOLERANCE AND HIGH-AVAILABILITY (replica 1, replica 2, replica 3)

Page 24: Technological insights behind Clusterpoint database

Save your time and increase productivity with our elastic DBAAS cloud service, capable of processing very large data volumes through efficient workload distribution

Chart: your work time needed (1%, 10%, 50%, 100%) against the number of cluster nodes (10, 100), for terabytes of data and billions of transactions

Our developers can share Clusterpoint Cloud DBAAS infrastructure at low cost

Some metrics for Clusterpoint API users:

• Built-in cluster-wide Map-Reduce
• 49 000 TPS on a 30-server cluster
• 4Bn transactions per day per cluster
• Handles mixed web / mobile data & text
• Millisecond latency for free text queries, with native pagination for web / mobile UI

Join Clusterpoint DBAAS Cloud for on-demand Big Data computing at a fraction of the cost of owning and maintaining your own hardware
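The two throughput figures quoted on this slide are consistent with each other, as a quick calculation shows. The per-node figure below assumes the load is spread evenly over the 30 servers, which the slides do not state.

```python
tps = 49_000                   # transactions per second on the 30-server cluster
seconds_per_day = 24 * 60 * 60

per_day = tps * seconds_per_day
per_node = tps / 30            # assumes an even spread across nodes

print(per_day)          # 4233600000, i.e. roughly 4Bn transactions per day
print(round(per_node))  # 1633
```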

Page 25: Technological insights behind Clusterpoint database

Easy clustering, sharding and replication using centralized administration tool (Manager Web GUI)

DATABASE SHARDING: the database is split into parts (DB shard I, DB shard II, DB shard III), each placed on its own server with RAM, scaling GB ► TB ► PB

DATABASE REPLICATION: each part is mirrored on additional servers (Replica 1 ... Replica 5, N copies)

REPLICATION OF THE ENTIRE CLUSTER DATABASE ACROSS MULTIPLE DATACENTERS: replica 1, replica 2, replica 3, each holding parts I, II and III
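As a rough illustration of how sharding and replication combine, the sketch below assigns each document to one shard and then lists the replica copies of that shard. The slides do not say how Clusterpoint places documents, so the hash-based placement, the function names and the parameters here are assumptions made purely for illustration.

```python
import hashlib

def shard_for(doc_id, shard_count):
    """Pick a shard deterministically from a hash of the document id."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

def placement(doc_id, shard_count, replica_count):
    """Return the (shard, replica) pairs that would hold a copy of the document."""
    shard = shard_for(doc_id, shard_count)
    return [(shard, replica) for replica in range(1, replica_count + 1)]

# Three shards (I, II, III) and three replicas, as drawn on the slide.
copies = placement("invoice-1001", shard_count=3, replica_count=3)
print(copies)
```

Every copy of a document lives in the same shard number, so losing one replica server leaves the other copies of that shard available.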

Page 26: Technological insights behind Clusterpoint database

Clusterpoint API uses industry standard web services and REST architecture

Open engineering standards: open data format, open API, open web protocols

CLIENT SOFTWARE / API LIBRARIES: .NET, JAVA, PHP, PYTHON, C/C++ (application server, developer’s computer, administrator’s computer)

Transport: XML / JSON over HTTP / HTTPS, or the fast TCP/IP driver

CLUSTERPOINT SERVER SOFTWARE (ultra-fast C/C++ code), with database storage on rack & stack cluster hardware nodes

Transparent Map Reduce for all database operations

Clusterpoint API commands:

insert | add a document to the storage
update | update or add a document
replace | replace the existing document
delete | delete the existing document
retrieve | retrieve original document
search | perform a search query
similar | search for similar content
status | return storage status information
......... | more than 40 API commands
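The command descriptions on this slide can be read as a small contract for a document store. The in-memory class below is a toy model of that contract, not the Clusterpoint client library: the class name `ToyStore`, the method signatures, the error behaviour of `insert` and `replace`, and the field-merge behaviour of `update` are assumptions chosen to match the one-line descriptions.

```python
class ToyStore:
    """In-memory stand-in used only to illustrate the command semantics."""

    def __init__(self):
        self.docs = {}

    @staticmethod
    def _text(doc):
        """Join all field values into one lowercase string for searching."""
        return " ".join(str(v) for v in doc.values()).lower()

    def insert(self, doc_id, doc):
        # "add a document to the storage" (assumed to fail on an existing id)
        if doc_id in self.docs:
            raise KeyError(f"{doc_id} already exists")
        self.docs[doc_id] = dict(doc)

    def update(self, doc_id, fields):
        # "update or add a document": merge into an existing one, else create
        self.docs.setdefault(doc_id, {}).update(fields)

    def replace(self, doc_id, doc):
        # "replace the existing document" (assumed to require an existing id)
        if doc_id not in self.docs:
            raise KeyError(f"{doc_id} does not exist")
        self.docs[doc_id] = dict(doc)

    def delete(self, doc_id):
        del self.docs[doc_id]

    def retrieve(self, doc_id):
        return dict(self.docs[doc_id])

    def search(self, text):
        words = text.lower().split()
        return sorted(
            doc_id
            for doc_id, doc in self.docs.items()
            if all(w in self._text(doc) for w in words)
        )

    def status(self):
        return {"documents": len(self.docs)}

store = ToyStore()
store.insert("1", {"title": "Invoice", "customer": "John Smith"})
store.update("2", {"title": "Contract"})     # id "2" did not exist: added
store.replace("1", {"title": "Invoice 42"})  # old fields are dropped
print(store.search("invoice"))  # ['1']
print(store.status())           # {'documents': 2}
```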

Page 27: Technological insights behind Clusterpoint database

What about money? How can the Clusterpoint database help our customers' businesses save money?

Page 28: Technological insights behind Clusterpoint database

Scaling an SQL database escalates your costs, while scaling Clusterpoint doesn’t

Does Clusterpoint help to save money for me and my business?

Page 29: Technological insights behind Clusterpoint database

Minimal search latency to quickly find the most relevant results is a crucial database feature for web / mobile applications, saving end-user time and corporate money

Sample scenario for 100 employees: corporate work time savings for a database application where each employee runs ~ 100 database queries per day

Database query time reduced from 1.5 sec (SQL query) to 0.15 sec (CLUSTERPOINT query): 10x faster, saves 1.35 sec per query

100 employees x 100 queries x 220 days x 1.35 s = 825 hours saved / year

825 work hours x $29.63 * = $24 445 in annual work time savings from Clusterpoint’s 10x faster database search

Saving even more for 1000 employees ► $244 500 (nearly a quarter of a million)

* - Hourly cost of labor; source: US Department of Labor, Bureau of Labor Statistics, March 2014 (http://www.bls.gov/news.release/ecec.nr0.htm)
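The savings figure on this slide follows directly from the stated inputs, and the arithmetic can be reproduced in a few lines. All numbers come from the slide; the function and variable names are introduced here for illustration.

```python
QUERIES_PER_DAY = 100      # per employee
WORK_DAYS = 220            # per year
SECONDS_SAVED = 1.5 - 0.15 # per query: SQL time minus Clusterpoint time
HOURLY_COST = 29.63        # USD, hourly cost of labor quoted on the slide

def annual_savings(employees):
    """Return (hours saved per year, USD saved per year)."""
    seconds = employees * QUERIES_PER_DAY * WORK_DAYS * SECONDS_SAVED
    hours = seconds / 3600
    return hours, hours * HOURLY_COST

hours, dollars = annual_savings(100)
print(round(hours), round(dollars))  # 825 24445
```

For 1000 employees the same formula gives about $244 450, which the slide rounds to $244 500.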

Page 30: Technological insights behind Clusterpoint database

A single platform simplifies our customers' IT software stack, boosts performance and cost-efficiency, and decreases our customers' TCO

• OLTP database, ACID compliant (XML, JSON)
• Full-Text Search & Real Time Analytics
• Distributed cluster computing

Masterless, transactional, high-availability operational database platform with fault-tolerant replication and scale-out architecture using inexpensive commodity hardware

Page 31: Technological insights behind Clusterpoint database

Reduce # of software platforms (complexity) to reduce your costs

TCO estimates for a 3-year budget, calculating cost for a 100-user company, in $

Cost item | Commercial SQL * | Open source integration ** | Clusterpoint database ***
Database software license (enterprise edition) | 14 000 | 0 | 0
Database software maintenance (3 years) | 20% / 3 x 2800 | DIY / 0 | 3 x 7200
Database and ESS s/w integration through custom code: developer months ($60k salary / year) | 3m / 15 000 | 6m / 30 000 | 0
Enterprise search software (ESS) license | 10 000 | 0 | 0
ESS software maintenance fee (3 years) | 20% / 3 x 2 000 | DIY / 0 | 0
Implementing DB+ESS clustering and high-availability: developer months ($60k salary / year) | 2m / 10 000 | 4m / 12 000 | 0
Client software access licenses (if required for 100 users) | 10 000 | 0 | 0
Maintaining DB + ESS integration code and indexes over the system’s lifetime: DBA man-months ($60k salary / year) | 9m / 45 000 | 18m / 90 000 | 0
Total (Clusterpoint delivers > 80% savings in TCO) | 118 400 | 132 000 | 21 600

* - data varies among vendors, approximated for average cost
** - assuming that open source integration takes ~ 50% more time and effort
*** - Clusterpoint database 24/7 support price for 2 high-availability replica servers (2 x $ 3600 / year)
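The totals and the "> 80% savings" claim can be checked by summing the cost rows of the TCO table. The figures below are taken from the slide in row order (license, maintenance, integration, ESS license, ESS maintenance, clustering, client licenses, DBA effort); the variable names are introduced here for illustration.

```python
costs = {
    "Commercial SQL": [14_000, 3 * 2_800, 15_000, 10_000, 3 * 2_000, 10_000, 10_000, 45_000],
    "Open source integration": [0, 0, 30_000, 0, 0, 12_000, 0, 90_000],
    "Clusterpoint database": [0, 3 * 7_200, 0, 0, 0, 0, 0, 0],
}

totals = {name: sum(items) for name, items in costs.items()}
print(totals)
# {'Commercial SQL': 118400, 'Open source integration': 132000, 'Clusterpoint database': 21600}

clusterpoint = totals["Clusterpoint database"]
for name in ("Commercial SQL", "Open source integration"):
    saving = 1 - clusterpoint / totals[name]
    print(f"{name}: {saving:.1%} saved")
# Commercial SQL: 81.8% saved
# Open source integration: 83.6% saved
```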

Page 32: Technological insights behind Clusterpoint database

Productivity benefits for Clusterpoint database customers

1. Reduces customer IT complexity: simplifies customer database and application software with a schema-less XML/JSON document store model and scalable, write-once s/w code

2. Delivers high-performance computing: blends OLTP database, free text search, querying and analytics in the same software platform, without using search engines or BI tools

3. Provides cost-efficient Big Data scalability: distributed database architecture scales out on commodity rack & stack hardware, no complex software skills in MapReduce needed

TCO savings > 80%

Page 33: Technological insights behind Clusterpoint database

Portfolio of some of our Big Data applications

Vertical-market application products for Big Data real-time management, running on the Clusterpoint database:

• Web Content Crawler, Monitoring, Analytics. ECM market sector: $4.7 billion in 2012, CAGR 7.2%
• GOL (Machine-data Log & Event Analytics). SIEM market sector: $1.3 billion in 2013, CAGR 14%
• NTSS (All Network Traffic Storage & Search). Cyber security market worth $95 billion in 2014, CAGR 10.3%

All products scale out linearly (GB ► TB ► PB) by using an inexpensive rack & stack commodity hardware architecture

All products feature instantly responsive web-style keyword search across the entire database content, essential SQL, real-time Big Data reporting and analytics

Page 34: Technological insights behind Clusterpoint database

We are helping legacy SQL vendors where they are struggling

Next-gen Touch&Go online banking solution

MyInstaBank is the most recent proof-of-concept application product in the $60 billion financial services market, driven by the Clusterpoint database platform

The Clusterpoint database drives an innovative banking payments archive data management, reporting and analytics solution

The solution is gaining growing interest among financial data management, banking and ERP vendors, sectors still largely dominated by legacy SQL platforms

Page 35: Technological insights behind Clusterpoint database

To see the full potential of our database software solutions, you have to see them in action!

You are welcome to contact us to arrange a live demonstration!

[email protected]

USA: +1 (650) 681 9710

Europe: +371 (2) 9243460