Optigrise Technology Solutions LLC, New Jersey
DESCRIPTION
Optigrise Technology Solutions LLC, New Jersey - The World's Largest Professional Community to help Business People with Up-to-Date Digital Transformation Service. Optigrise Technology is specifically designed to help enterprises succeed in their digital transformation by re-imagining businesses to generate growth with cost efficiency and business agility.
TRANSCRIPT
Digital SBU
Consulting Services
Data, Analytics & Insights Services
AI & Cognitive Services
Digital Integration Services
• Cloud Consulting
• Cloud Architecture
• Cloud Migration
• Cloud Native Dev
• Cloud Testing & Ops
Data Strategy, Consulting & Architecture
Data Warehouse & Business Intelligence
Operational Databases, OLTP
Data Warehouse, OLAP & Data Mart
Business Intelligence
ETL
MDM/Master Data Management
Big Data & Analytics
• Modern Data Warehouse, DWaaS
• Big Data & Analytics, Big Data on cloud, Data Migration
• Data Visualization
• Data Ops, Data Integration & ELT
• AI Consulting
• Data Science & Machine Learning
• Conversational AI, NLP, Chatbot/Virtual Agents
• Voice, Speech & Video
• Digital Integration Architecture
• API Gateway
• Microservices
• EAI and SOA
• DevOps
Focus Areas
Cloud Services
Engineering Services
Professional Services
Data Warehouse & Business Intelligence
Operational Databases
Data Warehouse & Data Mart
Business Intelligence
ETL
MDM/Master Data Management
“Data could be your biggest asset, data could be your biggest challenge”
Data, Analytics & Insight is fueling digital transformation
“The goal is to turn data into information, and information into insight.” – Carly Fiorina, former executive, president, and chair of Hewlett-Packard Co.
The global data warehouse market is projected to reach $35 billion by 2025 (currently $20B). The analytics market is expected to reach $71.1 billion by 2022.
Typical Approach
• Siloed approach – Separate tools and processes for different teams.
• Separate pipeline – Separate pipelines/data flows between traditional data engineering, big data & ML teams.
• Focus on data science only – While AI and predictive analytics can solve many use cases, organizations still hold huge amounts of data in relational and structured form. They should continue to have a strong DW/BI strategy.
Our Approach
• Unified approach – Unified tools & processes.
• Unified pipeline – A unified pipeline from data ingestion and data preparation to visualization, for traditional DW/BI, Big Data, AI & other analytics.
• Balanced approach – A balanced approach between traditional DW/BI, Big Data & AI.
• Data Ops – Bringing DevOps & Agile principles to data projects.
• Cost Optimization – Cost savings on DW/BI, so that the savings can be spent on AI & Big Data.
[Figure: diagram with blocks labeled Data Warehouse, Business Intelligence, Big Data & Analytics, Data Science & Machine Learning, and Reporting & Visualization, plus a block labeled Strong Data Foundation]
Data Strategy
DW & BI
How are our data services & solutions different from others?
Business Analytics 10 years ago
• Simple 3-layer stack
Business Analytics Now
• Unified approach – Unified tools & processes.
• Unified pipeline – A unified pipeline from data ingestion and data preparation to visualization, for traditional DW/BI, Big Data, AI & other analytics.
• Balanced approach – A balanced approach between traditional DW/BI, Big Data & AI.
• Data Ops – Bringing DevOps & Agile principles to data projects.
• Cost Optimization – Cost savings on DW/BI, so that the savings can be spent on AI & Big Data.
DW & BI
How Business Analytics solutions have changed over time
ETL
DW (Data Warehouse)
BI (Business Intelligence)
ETL
DW (Data Warehouse)
Core BI
ELT, Data Ops
Data Lake
Modern Data Warehouse Hybrid Cloud
Big Data Analytics
Graph Processing
Machine Learning
Spatial Processing
Time Series Processing
BI (Business Intelligence) Visualization
Mobile BI
Self Service Analytics
Challenge: The number of tools has exploded in recent years. This poses a huge challenge for enterprises of all sizes.
Our Solution: We use a reference-architecture-based approach to map each client’s unique needs to one of our proven DW/BI reference architectures.
OLTP
• Oracle
• MS SQL Server
• IBM Db2
• MySQL
• PostgreSQL
• NoSQL databases
• NewSQL databases
Data Warehouse
• Teradata
• SQL Server DW, Azure SQL Data Warehouse
• Oracle
• SAP
• Snowflake
• AWS Redshift
• Google BigQuery
ETL
• SSIS / SQL Server Integration Services
• Talend
• Informatica PowerCenter
• IBM InfoSphere Information Server
• Oracle Data Integrator
• Ab Initio
• Apache NiFi
• SAS – Data Integration Studio
• SAP BusinessObjects Data Integrator
BI & Visualization
• Tableau
• QlikView
• Power BI
• SSRS
• FusionCharts
• D3.js
MDM
• Informatica
• IBM
DW/BI Vendors and Toolchain
DW & BI
Our Offerings
Our Tech Expertise
ETL / Data Movement:
• Talend, Stitch, SSIS, Informatica, AWS Glue, Azure Data Factory
Data Warehouse:
• Teradata Vantage, MS SQL Server Data Warehouse, Oracle DW, IBM Db2 DW
Business Intelligence, Visualization & Dashboards:
• Talend, Power BI, Qlik
Others:
• GDPR, HIPAA and data privacy consulting
• Data archival
DW/BI architecture
• Data strategy
• Data consulting
• DW/BI reference architecture
• Data Ops consulting
• Data governance & quality strategy
• Data security
• Data archival
ETL design/build/test
• ETL pipeline design
• ETL build and test
• Manage/monitor ETL batch jobs
• NoETL/Streaming ETL design & build on Kafka / other messaging platforms
• ELT design/build to data lake
• Cloud/SaaS ETL design/build
Business Intelligence, Visualization & Reporting
• BI & Analytics design
• Dimensional modeling, OLAP cube design
• Self-service analytics
• BI test
• Visualization & graph design/build
• Reporting design/build
• Dashboards design/build
• Testing
Data warehouse design/build
• Data warehouse schema & model design
• Data warehouse build/test
• Data warehouse performance optimization
• Cloud data warehouse and migration
• Modern data warehouse design with AI & Big Data analytics
• Spark Machine Learning
Data Ops
• Traditional ETL tools (Talend, SSIS)
• Big data/data lake related ELT tools
• Data integration tools (StreamSets, Alteryx) development
DW/BI architecture – very small organizations
No Staging Area
• Often, in very small organizations & POCs, the data warehouse does not have a separate ‘Staging Area’.
• Data from operational systems is moved directly to the data warehouse.
No Data Mart
• Analytics/visualization/reporting apps query the data warehouse directly.
Schema: Mostly Star
Tables: Fact, Dimension
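A star schema with one fact table and its dimension tables can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (dim_product, dim_date, fact_sales), columns and sample rows are hypothetical and do not come from any client system.

```python
import sqlite3

# In-memory database standing in for a very small data warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    -- The fact table holds measures plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        quantity   INTEGER,
        amount     REAL
    );
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Laptop", "Electronics"), (2, "Desk", "Furniture")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(20240101, 2024, 1), (20240201, 2024, 2)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 20240101, 2, 2000.0), (2, 20240101, 1, 300.0), (1, 20240201, 1, 1000.0)])


def sales_by_category(connection):
    """Join the fact table to one dimension and aggregate a measure."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


print(sales_by_category(conn))
```

Reporting tools in this setup would issue queries like sales_by_category directly against the warehouse, since there is no data mart layer in between.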
DW/BI architecture – small sized organizations
Schema: Star, Snowflake
Tables: Fact, Dimension
Staging Area
• Data from operational systems is moved to the staging area.
• Later it is moved to the data warehouse.
No Data Mart
• Analytics/visualization/reporting apps query the data warehouse directly.
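The two-step flow through a staging area can be sketched as follows, again with sqlite3. The table names (stg_orders, dw_orders), the cleaning rules and the sample rows are assumptions made for illustration, not a description of a specific implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Staging keeps the raw values as text, exactly as they were extracted.
    CREATE TABLE stg_orders (order_id TEXT, customer TEXT, amount TEXT);
    -- The warehouse table is typed and cleaned.
    CREATE TABLE dw_orders  (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
""")


def extract_to_staging(connection, raw_rows):
    """Step 1: land operational data in the staging area unchanged."""
    connection.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw_rows)


def load_warehouse(connection):
    """Step 2: clean and type the staged rows, load them, then clear staging."""
    staged = connection.execute(
        "SELECT order_id, customer, amount FROM stg_orders").fetchall()
    loaded = 0
    for order_id, customer, amount in staged:
        try:
            row = (int(order_id), customer.strip().title(), float(amount))
        except ValueError:
            continue  # skip rows that fail basic type checks
        connection.execute("INSERT OR REPLACE INTO dw_orders VALUES (?, ?, ?)", row)
        loaded += 1
    connection.execute("DELETE FROM stg_orders")
    return loaded


extract_to_staging(conn, [("1", " alice ", "10.50"), ("2", "BOB", "n/a"), ("3", "carol", "7")])
loaded_count = load_warehouse(conn)
print(loaded_count)
```

Keeping the raw landing step separate from the cleaning step means a failed transformation can be rerun from staging without going back to the operational system.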
DW/BI architecture – medium & large organizations
• Uses a Staging Area.
• Data Marts: Departmental data marts based on business / subject area. BI/visualization tools access data mart data, not raw data in the data warehouse.
Schema: Star, Snowflake, Fact Constellation, hybrid
Tables: Fact, Dimension
Model: Inmon Model, Kimball Model
DW/BI architecture – very large organizations
Often called “Three tier DW/BI architecture”
• Uses a Staging Area.
• Departmental Data Marts based on business / subject area.
• OLAP Servers: OLAP cubes used for dimensional modeling.
• BI/visualization tools access data mart data, not raw data in the data warehouse.
Schema: Star, Snowflake, Fact Constellation, hybrid
Tables: Fact, Dimension
Model: Inmon Model, Kimball Model
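What an OLAP cube pre-computes can be shown with a few lines of plain Python. This is a toy sketch, not how an OLAP server is implemented: the function name build_cube and the sample facts are hypothetical, and a real cube would be stored and indexed by the OLAP engine.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(facts, dimensions, measure):
    """Pre-aggregate a measure for every subset of the dimensions (a tiny cube)."""
    cube = defaultdict(float)
    for fact in facts:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                # The key records which dimension values this cell is sliced by.
                key = tuple((d, fact[d]) for d in dims)
                cube[key] += fact[measure]
    return dict(cube)


facts = [
    {"region": "East", "product": "Laptop", "amount": 100.0},
    {"region": "East", "product": "Desk", "amount": 50.0},
    {"region": "West", "product": "Laptop", "amount": 300.0},
]
cube = build_cube(facts, ("region", "product"), "amount")

print(cube[()])                                          # grand total
print(cube[(("region", "East"),)])                       # roll-up by region
print(cube[(("region", "East"), ("product", "Laptop"))])  # single cell
```

Because every roll-up is computed ahead of time, a BI tool can answer "sales by region" or "sales by product" with a lookup instead of scanning the fact table.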
Tier 1 Tier 2 Tier 3
Operational Databases / OLTP
Operational databases – Relational, NoSQL, Time series, Graph …
• Relational / RDBMS: Microsoft SQL Server, Oracle, IBM Db2, MySQL, MariaDB, Sybase
• Object Relational: PostgreSQL, db4o
• Document Database: MongoDB, AWS DynamoDB (Cloud), Couchbase / CouchDB, Azure CosmosDB (Cloud), GCP Datastore (Cloud), RavenDB, IBM Cloudant (Cloud)
• Key value Store/Cache: Redis, Memcached, Amazon DynamoDB (Cloud), Azure CosmosDB (Cloud), Aerospike, Riak, Oracle Berkeley DB
• Wide Column Store: Cassandra, HBase, Azure CosmosDB w/ Cassandra API (Cloud), Google Cloud Bigtable (Cloud)
• Time series Database: InfluxDB, Prometheus, Amazon Timestream (Cloud)
• Search: Elasticsearch, Solr, MarkLogic, Amazon CloudSearch (Cloud), Azure Search (Cloud)
• Graph/RDF Database: Neo4j, TinkerPop/Gremlin, AWS Neptune (Cloud), Azure CosmosDB w/ Gremlin API, JanusGraph, RDF stores
• Specialized Databases: AWS Quantum Ledger Database/QLDB (blockchain database), Spatial Database, GIS Database
Data Continuum
Polyglot persistence
Paradigm shift in applications & database technology …
• Swiss army knife / one-size-fits-all approach.
• Microservice-styled app: each microservice uses the database that fits its purpose (polyglot persistence).
DBaaS/Cloud databases - Relational, NoSQL, Graph …
Relational / OLTP
• AWS: Amazon Aurora, Amazon RDS for Oracle, Amazon RDS for SQL Server, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for MariaDB
• Azure: Azure SQL Database, Azure Database for MySQL, Azure Database for PostgreSQL, Azure Database for MariaDB
• Google Cloud: Cloud Spanner, Cloud SQL (MySQL), Cloud SQL (PostgreSQL), Cloud SQL (SQL Server)
• IBM: Db2 on Cloud, Compose for MySQL, Compose for PostgreSQL
NoSQL
Key Value Store
• AWS: Amazon DynamoDB
• Azure: Azure CosmosDB w/ etcd API
• IBM: Compose for etcd
Document Database
• AWS: Amazon DocumentDB (with MongoDB compatibility), Amazon DynamoDB
• Azure: Azure CosmosDB w/ SQL API, Azure CosmosDB w/ MongoDB API
• Google Cloud: Cloud Firestore
• IBM: Cloudant, Compose for MongoDB
Column Store Database
• Azure: Azure CosmosDB w/ Cassandra API
• Google Cloud: Cloud Bigtable
• IBM: Compose for ScyllaDB
Timeseries
• AWS: Amazon Timestream
Graph Database
• AWS: Amazon Neptune
• Azure: Azure CosmosDB w/ Gremlin API
• IBM: Compose for JanusGraph
Caching/In memory Store
• AWS: Amazon ElastiCache for Redis, Amazon ElastiCache for Memcached
• Azure: Azure Cache for Redis
• Google Cloud: Cloud Memorystore
• IBM: Compose for Redis
Search
• IBM: Compose for Elasticsearch
Specialized
• AWS: Amazon Quantum Ledger DB (QLDB)
Data Warehouse / DW (also called Enterprise Data Warehouse / EDW)
Our technology expertise & focus in DW/EDW technologies
• Microsoft – SQL Server DW (on premise), Azure SQL Data Warehouse (cloud)
• Teradata – Teradata Vantage
• Oracle -
• AWS – Redshift, Redshift Spectrum
• Snowflake – Cloud-hosted Data Warehouse as a Service (DWaaS)
• Google Cloud – Google Cloud BigQuery
• IBM – IBM Db2 data warehouse
• Neo4j – Neo4j Graph Database
Data Warehouse Categorization & Trends
Traditional Data Warehouse
Examples: Oracle DW, SQL Server DW, Teradata
• Columnar storage
• High performance, optimized query engine
• Secure, strong toolset
• SQL support, ACID compliant, enterprise grade
• Analytical functions
Data Lake
Examples: Hadoop HDFS, S3, Azure Blob Storage, Azure Data Lake, Databricks Lake
• Flexible schema
• Capability to store & analyze unstructured, semi-structured & structured data
• SQL on unstructured data
• Unlimited/elastic storage
• Cheap storage & compute
Modern Data Warehouse
Examples: Amazon Redshift, Snowflake, Azure SQL Data Warehouse, Google BigQuery
• Massively Parallel Processing (MPP)
• Separate storage & compute layers for scale & flexibility
• Ability to store/analyze semi-structured data
• Cheap storage
Next Gen Data Warehouse
Examples: Teradata Vantage, SQL Server 2019 Data Warehouse, IBM Db2 Data Warehouse / Db2 DW on Cloud, Oracle Autonomous DW
✓ MPP – Massively parallel processing
✓ Separate storage & compute layers
✓ SQL on unstructured data
✓ ML in the database
✓ Unified analytical platform w/ support for big data, ML, graph, time series, spatial
✓ Support for R/Python/Scala and Spark in the core engine
✓ Typically runs on hybrid cloud
All major DW vendors are coming up with services around Next Gen AI & Big Data enabled data warehouse
DW & BI
Next Gen Data Warehouse
Traditional Data Warehouse
• Columnar storage
• High performance, optimized query engine
• Secure, strong toolset
• SQL support, ACID compliant, enterprise grade
• Analytical functions
Data Lake
• Flexible schema
• Capability to store & analyze unstructured, semi-structured & structured data
• SQL on unstructured data
• Unlimited/elastic storage
• Cheap storage & compute
Modern Data Warehouse
• Massively Parallel Processing (MPP)
• Separate storage & compute layers for scale & flexibility
• Ability to store/analyze semi-structured data
• Cheap storage
AI/ML
• Machine Learning algorithms
• Support for R/Python/Scala in the core engine
• Graph processing
• Time series
• Spatial support
Next Gen Data Warehouse
• MPP – Massively parallel processing
• Separate storage & compute layers
• SQL on unstructured data
• Unified analytical platform with support for big data, ML/AI, graph, time series, spatial
• Support for R/Python/Scala and Spark in the core engine
Teradata Vantage – Bringing the power of AI and big data to traditional DW
• High performance SQL engine – A new-generation NewSQL engine that improves query performance at scale.
• Multi-genre analytics – Built-in big data, AI and graph analytics engines.
• Supports Machine Learning – Supports R and Python in addition to SQL.
• Hybrid cloud solution – Available on premises, on public cloud (AWS, Azure) and on Teradata cloud.
SQL Server Data Warehouse // Azure SQL Data Warehouse
• Data Virtualization: Using PolyBase technology, the SQL Server engine can access data stored in other relational DBs (MySQL, Db2, Teradata, Oracle), NoSQL databases (MongoDB or Azure CosmosDB), and big data platforms and data lakes (Hadoop HDFS, Cloudera and Spark).
• Integrated SQL and ML Analysis engine: Can analyze data using the SQL engine, Spark, Spark Machine Learning and SQL Server ML Services.
• Big Data Clusters: Provides a scalable compute and storage engine based on Spark, embedded within the core database.
• Graph processing: Provides powerful graph processing on linked data.
• BI Capabilities: BI capabilities with Power BI and Reporting Services.
• Analysis Engine: Dimensional modeling capabilities with support for OLAP cubes in addition to relational models.
Challenges with traditional data warehouse
Challenge: Traditional data warehouses cannot store unstructured and semi-structured data because they follow a strict schema. This restricts their use for storing data from NoSQL stores, logs, IoT devices, audio/video files etc., which currently constitutes more than 50% of enterprise data.
Solution: Use data lakes and cloud storage platforms, which can store unstructured, semi-structured and structured data.
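The difference between a strict schema and data lake style storage can be sketched in plain Python. The column set WAREHOUSE_COLUMNS, the function names and the sample records are hypothetical, chosen only to contrast schema-on-write with schema-on-read.

```python
import json

WAREHOUSE_COLUMNS = ("id", "name", "amount")  # hypothetical fixed warehouse schema


def load_strict(record):
    """Schema-on-write: reject any record that does not match the fixed columns."""
    if set(record) != set(WAREHOUSE_COLUMNS):
        raise ValueError(f"record does not match schema: {sorted(record)}")
    return tuple(record[column] for column in WAREHOUSE_COLUMNS)


def store_in_lake(lake, record):
    """Data lake style: keep the raw JSON as-is; nothing is enforced on write."""
    lake.append(json.dumps(record))


def read_field(lake, field, default=None):
    """Schema-on-read: interpret each raw document only when it is queried."""
    return [json.loads(document).get(field, default) for document in lake]


order = {"id": 1, "name": "Alice", "amount": 10.5}
sensor = {"device": "pump-7", "readings": [0.4, 0.9], "meta": {"site": "north"}}

lake = []
for record in (order, sensor):
    store_in_lake(lake, record)  # both shapes are accepted

print(load_strict(order))
print(read_field(lake, "device"))
```

Passing the sensor record to load_strict raises ValueError, which mirrors how a fixed relational schema turns away semi-structured data that a lake accepts.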
Challenge: Traditional data warehouses cannot analyze unstructured and semi-structured data. This restricts their use for analyzing data from NoSQL stores, logs, IoT devices, audio/video files etc., which currently constitutes more than 50% of enterprise data.
Solution: Use big data solutions like Hadoop and Spark, which can analyze unstructured/semi-structured data. Their ML and graph processing capabilities can also be used.
Challenge: Traditional data warehouses face scaling challenges, which cause query performance issues.
Solution: Modern data warehouses use massively parallel processing and a hybrid shared-disk/shared-nothing architecture for scaling. This keeps their query responses fast.
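The scatter/gather idea behind massively parallel processing can be sketched as below. This is a simplified illustration under stated assumptions: threads stand in for independent nodes, the distribution key customer_id is hypothetical, and a real MPP engine distributes storage and compute across machines.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_nodes):
    """Hash-distribute rows so that each node owns a disjoint shard."""
    shards = [[] for _ in range(n_nodes)]
    for row in rows:
        shards[hash(row["customer_id"]) % n_nodes].append(row)
    return shards


def local_sum(shard):
    """Each node aggregates only its own shard (shared-nothing)."""
    return sum(row["amount"] for row in shard)


def mpp_total(rows, n_nodes=4):
    """Scatter the work to the nodes, then gather and combine partial results."""
    shards = partition(rows, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    return sum(partials)  # the coordinator merges the partial aggregates


rows = [{"customer_id": i, "amount": float(i)} for i in range(1, 101)]
print(mpp_total(rows))
```

Only the small partial sums travel back to the coordinator, which is why this pattern scales better than pulling every row through a single engine.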
Challenge: Traditional data warehouses typically ingest data only through batch-based, traditional ETL (Extract, Transform, Load). This means data cannot be analyzed in real time.
Solution: Big data solutions use streaming to consume data from sources like clickstreams, event logs, IoT devices and real-time location data from mobile devices. They also perform stream analytics on incoming data so that they can provide real-time analytics.
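Stream analytics on incoming events can be illustrated with a tumbling-window count. This is a minimal sketch, assuming events arrive in timestamp order as (timestamp_seconds, event_type) pairs; the function name and sample events are hypothetical, and a production system would use a streaming platform rather than a Python generator.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, counts) each time a fixed-size window closes."""
    current_start, counts = None, Counter()
    for timestamp, event_type in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, dict(counts)
            current_start, counts = window_start, Counter()
        counts[event_type] += 1
    if current_start is not None:
        yield current_start, dict(counts)  # flush the last open window


events = [(0, "click"), (3, "click"), (7, "view"), (12, "click"), (25, "view")]
windows = list(tumbling_window_counts(events, window_seconds=10))
print(windows)
```

Results are available as soon as each window closes, in contrast to a nightly batch load where the same counts would only appear after the next ETL run.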
Db2 DW – Spark & R Analytics running within core database engine
ETL – Extract Transform Load
Our technology expertise & focus in ETL & Data Integration
• Informatica - PowerCenter, PowerExchange, Data Replication
• IBM - IBM InfoSphere Information Server, IBM InfoSphere Data Replication
• Microsoft – SQL Server Integration Services / SSIS (on premise), Azure Data Factory (cloud)
• Talend - Talend Open Studio, Talend Data Fabric, Talend Data Management Platform
• Oracle - Oracle Data Integration Platform Cloud, Oracle GoldenGate (OGG), Oracle GoldenGate Cloud, Oracle Data Integrator (ODI)
• Apache NiFi (open source)
Cloud Only
• AWS Glue
• Alooma - now part of Google Cloud
• Panoply – both data integration and a lightweight data warehouse; cloud SaaS solution
• Stitch – lightweight solution
• Azure Data Factory
Talend ETL and Data Integration Platform
• Connects to anything via 900+ connectors and components.
• Manages data across all environments (multi-cloud and on-premises).
• Supports batch, real-time, streaming, and big data use cases. Supports Spark.
• Offers built-in machine learning, data quality, and governance capabilities.
• Provides full API development lifecycle support.
• Supports on-prem and cloud-hosted integration platform as a service (iPaaS) solutions.
• Supports MDM via centralized data catalog support.
• Supports data quality - profile, clean, and mask data in any format or size to deliver data you can trust for the insights you need.
• Has data cleansing and preparation features.
Challenges in traditional ETL and solution
Challenge: Large enterprises often have thousands of point-to-point ETL pipelines, which integrate data from source systems, app databases and COTS/SaaS products into data warehouses and other systems. This causes what is called ETL hell or integration spaghetti, which is difficult to manage and operate and becomes a huge bottleneck for “digital transformation”. Traditional ETL is also not real time and cannot scale to cope with growing data volumes.
Solution: Streaming and messaging systems like Kafka or Kinesis, or a message-bus-based architecture, can solve these problems. A pub/sub architecture removes the point-to-point integration spaghetti. Modern platforms like Kafka also scale extremely well and can handle real-time streaming data from various sources.
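Why pub/sub removes the spaghetti can be shown with a toy in-memory broker and a count of the links each style needs. The Broker class below is a hypothetical stand-in for a message bus and does not use the Kafka or Kinesis APIs; topic and system names are illustrative.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory publish/subscribe broker (a stand-in for a real message bus)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher does not know who consumes the message.
        for handler in self._subscribers[topic]:
            handler(message)


def point_to_point_links(sources, targets):
    """Every source needs its own pipeline to every target."""
    return sources * targets


def pub_sub_links(sources, targets):
    """Every system connects once, to the broker."""
    return sources + targets


broker = Broker()
warehouse, search_index = [], []
broker.subscribe("orders", warehouse.append)
broker.subscribe("orders", search_index.append)
broker.publish("orders", {"order_id": 1, "amount": 10.5})

print(point_to_point_links(20, 10), pub_sub_links(20, 10))
```

With 20 sources and 10 targets the point-to-point style needs 200 pipelines, while the broker style needs 30 connections, and adding a new consumer does not touch any producer.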
[Figure: ETL hell or integration spaghetti vs. clean streaming/messaging-based integration]
Business Intelligence & Visualization
Our technology expertise & focus in Business Intelligence & Analytics
• Tableau – Tableau on-prem and cloud products
• Microsoft – Power BI, SQL Server Reporting Services (SSRS)
• Qlik - QlikView
• SAS – SAS platform
• Looker - now part of Google Cloud
• MicroStrategy
• IBM – Cognos
• TIBCO - Spotfire
Power BI
• Business analytics service that delivers insights to enable fast, informed decisions
• Can connect to all industry-standard data warehouses.
• Transform data into stunning visuals and share them with colleagues on any device.
• Visually explore and analyze data—on-premises and in the cloud—all in one view.
• Collaborate on and share customized dashboards and interactive reports.
• Scale across your organization with built-in governance and security.
• Supports cloud and desktop versions.
Master Data Management (MDM)
Our technology expertise & focus in MDM
• Informatica: Informatica MDM, Informatica MDM Cloud
• IBM: IBM InfoSphere Master Data Management, IBM Master Data Management on Cloud
Data Security, Privacy & Compliance
GDPR
• Consent management
• Right to be secured
• Data minimization
• Right to portability
• Right to be informed
• Right to be forgotten
HIPAA
Patient health information (examples below) needs to be “protected” -
• Names or part of names
• Any other unique identifying characteristic
• Geographical identifiers
• Dates directly related to an individual
• Phone numbers
• Fax numbers
• Email addresses
• Social Security numbers
• Medical record numbers
• Health insurance beneficiary numbers
• Account numbers
• Certificate or license numbers
• Vehicle license plate numbers
• Device identifiers and serial numbers
• Web URLs
• IP addresses
• Fingerprints, retinal and voice prints
• Full face or any comparable photographic images
PCI DSS
Information security standard for organizations that handle branded credit cards from the major card schemes.
• Build and Maintain a Secure Network and Systems
• Protect Cardholder Data
• Maintain a Vulnerability Management Program
• Implement Strong Access Control Measures
• Regularly Monitor and Test Networks
• Maintain an Information Security Policy
SOX (Sarbanes-Oxley Act)
• Corporate Responsibility for Financial Reports (Section 302) - CEOs and CFOs must review all financial reports and certify that the reports are "fairly presented" and do not contain misrepresentations.
• Management Assessment of Internal Controls (Section 404) - requires companies to publish details about their internal accounting controls and their procedures for financial reporting as part of their annual financial reports.
GDPR Solution
Consent Management
Change in Customer journey/UX and database for -
• Requests for consent must be simple to understand and clearly presented, and consent must be as easy to withdraw as to give.
• Opt-in marketing will replace opt-out marketing in the post GDPR era.
Right to be secured
All PII data must be secured by pseudonymization or encryption, whether at rest or in transit.
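One common way to pseudonymize a PII field is a keyed hash, sketched here with Python's standard hmac and hashlib modules. This is an illustration under assumptions: the function name pseudonymize and the key value are hypothetical, and whether a given technique satisfies the regulation in a particular context is a compliance decision, not something this sketch establishes.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace a PII value with a keyed hash; same input and key give the same token."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for the example; in practice it would live in a secrets manager,
# stored separately from the pseudonymized data.
key = b"example-key-not-for-production"

token_a = pseudonymize("alice@example.com", key)
token_b = pseudonymize("alice@example.com", key)
token_c = pseudonymize("bob@example.com", key)

print(token_a == token_b, token_a == token_c)
```

Because the token is stable for a given key, records can still be joined on it for analytics without exposing the original value.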
Data minimization
Change in Customer journey/UX and database for -
• Personal data collected must be adequate, relevant, and kept no longer than necessary for the purposes for which it is processed.
• Outdated and irrelevant data must be eliminated.
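A retention rule of this kind can be sketched as a simple purge step. The field name collected_at and the 365-day retention period are assumptions for illustration; actual retention periods depend on the purpose of processing.

```python
from datetime import datetime, timedelta


def purge_expired(records, now, retention_days):
    """Keep only records collected within the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [record for record in records if record["collected_at"] >= cutoff]


now = datetime(2024, 6, 1)
records = [
    {"customer_id": 1, "collected_at": datetime(2024, 5, 1)},
    {"customer_id": 2, "collected_at": datetime(2022, 1, 15)},  # outdated
    {"customer_id": 3, "collected_at": datetime(2023, 12, 31)},
]
kept = purge_expired(records, now, retention_days=365)
print([record["customer_id"] for record in kept])
```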
Right to portability
Customers have the right to export their PII data in an encrypted format, such that it can easily be imported into a different IT environment. This could have huge implications in big data ecosystems. For example, a customer could request to have their telematics data transferred from one insurance carrier to another.
Right to be informed
In the post-GDPR world, customers will have the right to request and be shown how and why they were targeted for a specific marketing campaign.
Right to be forgotten
Three fundamental aspects comprise the right to be forgotten.
• First, the customer has the right to “Opt Out” from receiving marketing communications.
• Second, customers have the right to have their PII marketing data anonymized.
• Last, in most instances, customers can refuse to be analyzed. That means, even if you lawfully collect the data, customers can still say no to profiling; e.g., having their data analyzed for preferences and buying behavior.
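The three aspects above can be sketched as two small helpers. The field names (name, email, phone, marketing_opt_in, allow_profiling) are hypothetical; a real implementation would also need to cover backups, logs and downstream copies.

```python
PII_FIELDS = ("name", "email", "phone")  # hypothetical PII columns


def anonymize(customer):
    """Return a copy with PII fields blanked and marketing switched off."""
    cleaned = dict(customer)
    for field in PII_FIELDS:
        if field in cleaned:
            cleaned[field] = None
    cleaned["marketing_opt_in"] = False
    return cleaned


def profiling_candidates(customers):
    """Only customers who have explicitly allowed profiling may be analyzed."""
    return [c for c in customers if c.get("allow_profiling", False)]


customers = [
    {"id": 1, "name": "Alice", "email": "alice@example.com",
     "marketing_opt_in": True, "allow_profiling": True},
    {"id": 2, "name": "Bob", "email": "bob@example.com",
     "marketing_opt_in": True, "allow_profiling": False},
]
forgotten = anonymize(customers[1])
print(forgotten)
print([c["id"] for c in profiling_candidates(customers)])
```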
DW & BI
Future of DW/BI
• Modern next gen data warehouses
• Cloud data warehouses
• Data Warehouse + Data Lake based solutions
• AI & Big data enabled Data warehouse platforms
• No ETL Movement
• Messaging & Streaming platforms for data integration
• Data virtualization
• SaaS/Cloud ETL platforms
• ELT (Extract, Load, Transform) for big data workloads
• End to end DataOps
• Mobile BI
• Cloud based BI solutions
• Self Service BI & Analytics
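The ELT item in the list above differs from ETL in where the transformation runs: raw data is loaded first and then transformed inside the target engine. A minimal sketch with sqlite3 follows; the table names (raw_events, purchases_by_user) and rows are hypothetical, and sqlite3 merely stands in for a warehouse or data lake engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Extract + Load: raw rows are landed untouched, with every column as text.
conn.execute("CREATE TABLE raw_events (user_id TEXT, event TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", [
    ("u1", "purchase", "10.0"),
    ("u1", "view", "0"),
    ("u2", "purchase", "5.5"),
    ("u1", "purchase", "2.5"),
])

# Transform: typing, filtering and aggregation happen inside the engine, in SQL.
conn.execute("""
    CREATE TABLE purchases_by_user AS
    SELECT user_id, SUM(CAST(amount AS REAL)) AS total
    FROM raw_events
    WHERE event = 'purchase'
    GROUP BY user_id
""")

totals = dict(conn.execute(
    "SELECT user_id, total FROM purchases_by_user ORDER BY user_id").fetchall())
print(totals)
```

Since the raw table is kept, new transformations can be added later without re-extracting from the source systems.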
Sample Profiles
<First Name><Last Name> Data Engineer/Developer Total Exp – 3 years Bangalore, IN
<First Name><Last Name> DBA Total Exp – 5 years Delhi, IN
<First Name><Last Name> Data & Insights Lead Total Exp – 7.5 years New York, USA
About
Educational Qualification B.E from PQR B.S, M.A
Professional Career - 1.5 years with XYZ Ltd - 1 year with PQR Corp - 3 years with ASD LLC
Experience in Cloud 1.5 years 4 years 5.5 years
Cloud Tech Knowledge SQL Server, Azure SQL DB, Azure SQL Data Warehouse
AWS Redshift, Snowflake, Oracle, Mongo
Hadoop, Spark, Data warehouse, AWS Redshift, ETL
Other Tech Stack Scripting, .NET basics AWS AWS, Mongo, Java
Certification - AWS Certified (Associate) Hortonworks Hadoop Certified
Project Experience
Domain Knowledge Retail, CPG Manufacturing, Telecom BFSI, Retail