big data & advanced analytics roadshow...hadoop and spark on- premises. provisioning hdinsight...
TRANSCRIPT
![Page 1: BIG DATA & Advanced Analytics Roadshow...Hadoop and SPARK on- premises. Provisioning HDInsight clusters, Azure SQL DW databases, Machine Learning, Stream Analytics & Power BI. Enabling](https://reader034.vdocuments.net/reader034/viewer/2022050112/5f496fba6ec4b764e3744639/html5/thumbnails/1.jpg)
BIG DATA & Advanced Analytics Roadshow
Big Data-as-a-Service Demos
DEMO OVERVIEW
1. Hadoop and Spark on-premises
2. Provisioning HDInsight clusters, Azure SQL DW databases, Machine Learning, Stream Analytics & Power BI
3. Enabling independent scaling of compute & storage
4. Pricing it up: Deriving insights from terabytes of data for under $10/day
DEPLOYMENT MODELS
On Premise Deployment
• Microsoft Analytics Platform System (APS)
• Oracle Big Data Appliance
• Hortonworks Data Platform (HDP)
• Cloudera (CDH)
• Pivotal Data Computing Appliance (DCA)
Big Data-as-a-Service
• Azure HDInsight
• Azure SQL Data Warehouse
• Amazon Elastic MapReduce
• Amazon Redshift
ON PREMISE DEMO
HADOOP / SPARK
Objectives:
• Import Data from local to HDFS
• Create Hive External Tables
• Run Sample Covariance script using HiveQL
• Run the same Covariance script using Spark SQL
hadoop fs -put <localsrc> ... <HDFS_dest_Path>
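The first two objectives can be sketched as a small script. This is a minimal sketch, not the demo's actual script: the local file name `nyse.csv`, the HDFS directory `/user/demo/nyse`, the column types, and the comma delimiter are all assumptions. Only the table name NYSE and the three column names come from the covariance query in this deck.

```python
import shlex

# Hypothetical paths: the slides do not give the real file or HDFS locations.
LOCAL_SRC = "nyse.csv"
HDFS_DEST = "/user/demo/nyse"


def put_command(local_src, hdfs_dest):
    """Build the argument list for `hadoop fs -put <localsrc> <dest>`."""
    return ["hadoop", "fs", "-put", local_src, hdfs_dest]


def external_table_ddl(table, location):
    """Return HiveQL for an external table over files already in HDFS.

    Column types and the comma delimiter are assumptions.
    """
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  STOCK_SYMBOL STRING,\n"
        "  STOCK_DATE DATE,\n"
        "  STOCK_PRICE_HIGH DOUBLE\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Print rather than execute, so the sketch runs without a cluster.
    print(shlex.join(put_command(LOCAL_SRC, HDFS_DEST)))
    print(external_table_ddl("NYSE", HDFS_DEST) + ";")
```

Because the table is external, dropping it in Hive removes only the metadata and leaves the files under the HDFS location in place.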
Hadoop Component
Hive
What is Hive
• Hive is a SQL-like data warehousing layer that sits on top of MapReduce.
• Hive Query Language (HQL) is translated into MapReduce jobs, yet the language is familiar to SQL professionals.
• Used for batch & interactive processing
• Supports ACID operations, UDFs, UDTFs, UDAFs, and window functions
• Supports cubes, dimensions, and star schemas
• Supports Storage Based Authorization and SQL Standard Based Authorization and Authentication
YARN Application
Spark
What is Spark
The Spark core is complemented by a set of powerful, higher-level libraries which can be seamlessly used in the same application.
Spark Core API and Execution Model
• RDDs & DAG
• Scala
• Python
• Java
• R
WHAT IT MEANS
COVARIANCE
Covariance (noun)
In finance, covariance is a statistical measure of how the returns of two investments move in relation to each other, that is, the degree to which two stocks move together or apart. Investors can use it to choose combinations of investments that suit their risk profile.
A positive covariance means that asset returns moved together. If investment instruments or stocks tend to be up or down during the same time periods, they have positive covariance.
A negative covariance means returns move inversely. If one investment instrument tends to be up while the other is down, they have negative covariance.
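As a worked example of the definition, the sketch below computes covariance as the mean of the products minus the product of the means, the same form the demo's HiveQL query uses. The three price series are invented for illustration and are not data from the demo.

```python
def covariance(xs, ys):
    """Population covariance: mean(x*y) - mean(x)*mean(y)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two equal-length, non-empty series")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - mean_x * mean_y


# Invented daily highs for three hypothetical stocks.
a = [10.0, 11.0, 12.0, 13.0]
b = [20.0, 22.0, 24.0, 26.0]  # rises as a rises
c = [30.0, 29.0, 28.0, 27.0]  # falls as a rises

print(covariance(a, b))  # positive: the series move together
print(covariance(a, c))  # negative: the series move inversely
```

The sign carries the interpretation described above. The magnitude depends on the price scale, so it is not directly comparable across pairs of stocks.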
CODE
HIVEQL
-- Monthly covariance of daily highs for every pair of stocks:
-- AVG(x*y) - AVG(x)*AVG(y), with each pair matched on trading date
SELECT a.STOCK_SYMBOL, b.STOCK_SYMBOL, month(a.STOCK_DATE),
(AVG(a.STOCK_PRICE_HIGH*b.STOCK_PRICE_HIGH) - (AVG(a.STOCK_PRICE_HIGH)*AVG(b.STOCK_PRICE_HIGH)))
FROM NYSE a JOIN NYSE b ON
a.STOCK_DATE=b.STOCK_DATE WHERE a.STOCK_SYMBOL<b.STOCK_SYMBOL
GROUP BY a.STOCK_SYMBOL, b.STOCK_SYMBOL, month(a.STOCK_DATE);
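To check the query's logic without a cluster, the sketch below runs the same self-join against an in-memory SQLite table. SQLite is only a stand-in here and is not part of the demo: the rows are invented, the column types are assumptions, and `strftime('%m', ...)` replaces Hive's `month()`, which SQLite lacks.

```python
import sqlite3

# Invented rows: (symbol, date, daily high). Not real NYSE data.
ROWS = [
    ("AAA", "2010-01-04", 10.0), ("BBB", "2010-01-04", 20.0),
    ("AAA", "2010-01-05", 11.0), ("BBB", "2010-01-05", 22.0),
    ("AAA", "2010-01-06", 12.0), ("BBB", "2010-01-06", 24.0),
]

QUERY = """
SELECT a.STOCK_SYMBOL, b.STOCK_SYMBOL,
       CAST(strftime('%m', a.STOCK_DATE) AS INTEGER) AS stock_month,
       AVG(a.STOCK_PRICE_HIGH * b.STOCK_PRICE_HIGH)
         - AVG(a.STOCK_PRICE_HIGH) * AVG(b.STOCK_PRICE_HIGH) AS cov
FROM NYSE a JOIN NYSE b ON a.STOCK_DATE = b.STOCK_DATE
WHERE a.STOCK_SYMBOL < b.STOCK_SYMBOL
GROUP BY a.STOCK_SYMBOL, b.STOCK_SYMBOL, strftime('%m', a.STOCK_DATE)
"""


def monthly_covariance(rows):
    """Load rows into an in-memory table and run the covariance query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE NYSE ("
        "STOCK_SYMBOL TEXT, STOCK_DATE TEXT, STOCK_PRICE_HIGH REAL)"
    )
    conn.executemany("INSERT INTO NYSE VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in monthly_covariance(ROWS):
    print(row)
```

The `a.STOCK_SYMBOL < b.STOCK_SYMBOL` filter keeps each pair once and drops a stock paired with itself, so the result has one row per pair per month.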
THE
RESULT
STOCKS QRR AND QTM
This pair shows positive covariance in more months than negative, so the two stocks are more likely to move in the same direction.
STOCKS QRR AND QXM
This pair shows mostly negative covariance, so the two stock prices are more likely to move in opposite directions.
STOCKS QTM AND QXM
This pair shows positive covariance in most months, so the two stocks tend to move in the same direction most of the time.
DEMO
HDINSIGHT & AZURE SQL DW
• PROVISIONING
• SCALING
• DATA INGESTION
• QUERYING
AZURE
PRICING IT UP
https://azure.microsoft.com/en-us/pricing/calculator/