Hadoop and HANA Integration


Upload: manikandanu1

Post on 20-Oct-2015


  • 3/2/2014 Hadoop and HANA Integration | SCN

    http://scn.sap.com/community/hana-in-memory/blog/2013/10/15/b 1/6


    Hi Everyone,

    In my earlier blogs, Big Data Facts and Its Importance and Hadoop, Its Importance and Use Cases, I shared what I learnt while exploring Big Data and Hadoop.

    In this blog, I would like to share how Hadoop and HANA can be integrated with each other.

    Let's start with the advantages of using Hadoop:

    It can easily handle very large data volumes

    It is very good for storing unstructured data

    It is reliable, scalable and fault tolerant

    It is open source, so it costs less

    It provides batch processing

    Now let's look at some of the limitations of Hadoop:

    It is not efficient for small amounts of data

    It is a less mature technology

    It is difficult to find qualified talent

    It is not suited to real-time scenarios

    HANA and Hadoop:

    As you already know by now, Hadoop can store very large amounts of data. It is well suited to storing unstructured data, is good at manipulating very large files and is tolerant of hardware and software failures.

    But the main challenge with Hadoop is getting information out of this huge volume of data in real time.

    Now we also have HANA, and as you all know, HANA is well suited to processing data in real time.

    So to get real-time information from massive storage such as Hadoop, we can use HANA, which can be integrated directly with Hadoop.

    By combining Hadoop and HANA, we can get real-time information from huge volumes of data.

    Read Solving Big Data with SAP HANA and Hadoop:

    http://www.saphana.com/community/blogs/blog/2012/08/27/solving-big-data-with-sap-hana-and-hadoop?q4654483=1

    Watch the replay of SAP Big Data Chat on HANA and Hadoop:

    http://timoelliott.com/blog/2013/08/sap-big-data-chat-hana-hadoop.html

    Read Demystifying Big Data with SAP HANA and Hadoop:

    http://events.sap.com/sapphirenow/en/session/2457

    Read Hadoop + SAP HANA: Turning Infinite Storage into Instant Insights:

    http://www.saphana.com/community/blogs/blog/2013/09/20/hadoop-sap-hana-turning-infinite-storage-into-instant-insights

    SAP, Hadoop and HANA:

    As explained in the SAP CIO Guide on Using Hadoop, Hadoop can be used in the various ways described below.

    I have added Smart Data Access myself: it was not available when the guide was written, but we can now use it to connect HANA with Hadoop.

    Let's see how Hadoop can be used in the SAP world:

    Hadoop and HANA Integration

    Posted by Vivek Singh Bhoj in SAP HANA and In-Memory Business Data Management on Oct 15, 2013 7:32:34 PM


    Hadoop as a flexible data store:

    Because Hadoop is less costly, we can use it as a flexible data store for data from various SAP and non-SAP sources, such as social data, streaming data and transaction data. With all the data kept in Hadoop, we can extract whatever information we need and run any type of analysis.

    Hadoop as a simple database:

    We can also use Hadoop as a simple database for storing and retrieving data in very large data sets. We can retrieve data from Hadoop using Hive or HBase.

    Hadoop as a processing engine:

    We can apply the MapReduce programming model to many kinds of work: for example, Pig can be used for data analysis and Mahout for data mining. We can write MapReduce application code in a language of our choice and then deploy and run it on Hadoop.
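To make the MapReduce model a little more concrete, here is a minimal sketch in plain Python that counts words. It only imitates the map, shuffle and reduce phases on a single machine and does not use Hadoop itself; the function names and the sample lines are made up for illustration.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: list[int]) -> tuple[str, int]:
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def run_job(lines: Iterable[str], mapper: Callable, reducer: Callable) -> dict[str, int]:
    """Run mapper and reducer over the input, grouping by key in between."""
    # Shuffle: collect every mapped value under its key, as the framework would.
    grouped: dict[str, list[int]] = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    # Reduce: one call per distinct key.
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical sample input, standing in for files stored in Hadoop.
sample_lines = ["Hadoop stores data", "HANA processes data"]
word_counts = run_job(sample_lines, map_words, reduce_counts)
```

On a real cluster the framework distributes the map and reduce calls across many nodes; the grouping in `run_job` stands in for that shuffle step.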

    Hadoop for data analytics:

    We can use Hadoop to mine the data it holds for business intelligence and analytics.

    Hadoop holds a huge amount of data, but not all of it is useful, since much of it is low-value data. So we will load only the useful data into HANA.
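As a rough illustration of keeping only the useful data, the sketch below filters a batch of raw records down to the ones worth loading. The record fields, the choice of "purchase" as the useful action and the function name are all hypothetical, and the code performs no actual load into HANA.

```python
from typing import Iterable, Iterator

# Hypothetical raw records as they might sit in Hadoop; the fields are made up.
raw_events = [
    {"customer": "C1", "action": "page_view", "amount": 0.0},
    {"customer": "C1", "action": "purchase", "amount": 120.0},
    {"customer": "C2", "action": "page_view", "amount": 0.0},
    {"customer": "C2", "action": "purchase", "amount": 35.5},
]


def select_useful(events: Iterable[dict], wanted_action: str = "purchase") -> Iterator[dict]:
    """Yield only the records considered worth loading into the in-memory store."""
    for event in events:
        if event["action"] == wanted_action:
            yield event


# Only these rows would be passed on to the loading tool.
rows_to_load = list(select_useful(raw_events))
```

In practice the filter condition would come from the business question being asked, and it could just as well be an aggregation that reduces many raw records to a few summary rows.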

    To load data from Hadoop into HANA, we will use SAP Data Services.

    You can check the YouTube video below on how to load data from Hadoop to HANA:

    For more detail about the above scenarios, please refer to the SAP CIO Guide on Using Hadoop.

    Accessing Hadoop using Smart Data Access:

    Smart Data Access is a new feature introduced with SAP HANA SPS06. It enables access to remote data as if it were stored in local tables, without copying the data into HANA.

    One of the main benefits of Smart Data Access is that we don't need any special syntax to access heterogeneous data sources.

    Let's say we have structured data stored in HANA and unstructured data stored in Hadoop.

    We can now access the Hadoop data remotely using Smart Data Access and combine structured and unstructured data to create new models, gain real insight into our business and make better decisions.
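The idea of querying remote data as if it were local, with no special syntax, can be illustrated by analogy using Python's built-in `sqlite3` module. This is not Smart Data Access and involves neither HANA nor Hadoop: an attached SQLite database simply stands in for the remote source, and all table names and rows are invented for the example.

```python
import sqlite3

# The main database stands in for HANA, the attached one for a remote source.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS remote")

# Structured data held "locally".
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob")],
)

# Data held in the "remote" store; it is never copied into the main database.
conn.execute("CREATE TABLE remote.web_clicks (customer_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO remote.web_clicks VALUES (?, ?)",
    [(1, "/home"), (1, "/pricing"), (2, "/home")],
)

# One ordinary SQL statement joins both sides, with no special syntax.
clicks_per_customer = conn.execute(
    """
    SELECT c.name, COUNT(*) AS clicks
    FROM customers AS c
    JOIN remote.web_clicks AS w ON w.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()
```

The point of the analogy is that the join reads like any other query: the reporting side does not need to know which table lives in which store.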

    How this works:

    Let's say we have created a combined model using both structured and unstructured data, as described above, and this model is available for reporting.

    When we send a request through our reporting tool, HANA determines the best way to extract the data (it also decides where and how the data will be processed, based on optimal use of application and system resources) and sends a request to Hadoop.

    Check this awesome blog on Smart Data Access with Hadoop Hive & Impala by Aron MacDonald.

    To know more about Smart Data Access, check the below blogs:

    http://scn.sap.com/community/hana-in-memory/blog/2013/08/22/smart-data-access-a-new-feature-by-hana

    http://www.saphana.com/community/learn/startups/blog/2013/08/21/hana-curious--smart-data-access

    http://www.saphana.com/community/blogs/blog/2013/07/22/smart-data-access-data-virtualization-with-sap-hana

    You can also check the videos on how to use Smart Data Access at HANA Academy (there, HANA is connected to Sybase IQ using Smart Data Access):

    http://www.saphana.com/community/hana-academy#sps6

    Also check the blog below by Aron MacDonald on Streaming Real-time Data to HADOOP and HANA:

    http://scn.sap.com/community/developer-center/hana/blog/2013/08/07/streaming-real-time-data-to-hadoop-and-hana

    Check out this video on how Hadoop and HANA can work together by Intel:

    Hadoop and HANA Use Cases:

    1.) Genome Analysis:

    MKI is using HANA with Hadoop to improve patient care in the realm of cancer research.

    Genome analysis is the technique used to determine and compare genetic sequences (e.g. the DNA in chromosomes).

    Learn why HANA was selected for real-time Big Data analysis to deliver advanced medical treatment.

    Check the video below:

    http://events.sap.com/sapphirenow/en/session/2388

    Also check out the YouTube video below:


    2.) Real Time Retail Point of Sales:

    3.) Using Big Data In the Stadium to improve fan service:

    Check out more HANA Customer Stories:

    http://www.sapbigdata.com/stories/bigpoint-solves-big-data-challenges-with-sap-hana/

    Check the blog below to learn more about Hadoop use cases:

    http://www.saphana.com/community/blogs/blog/2013/10/01/the-big-data-frenzy-and-how-humanity-benefits

    SAP's Hadoop Strategy:

    To get the latest news regarding SAP and Hadoop, follow SAP's Big data site: http://www.sapbigdata.com/

    Check this blog to know about SAP's Hadoop Strategy:

    http://www.saphana.com/community/learn/bigdata/blog/2013/05/09/saps-hadoop-strategy

    SAP recently signed agreements to redistribute and support the Intel Distribution for Apache Hadoop and the Hortonworks Data Platform for its customers.

    http://www.news-sap.com/sap-helps-customers-achieve-real-time-big-data-results/

    Hortonworks is a company that develops, distributes and supports Hadoop.

    Also read the below article by Information Week:

    http://www.informationweek.com/software/information-management/sap-expands-big-data-push/240161134

    If you are interested, you can also join tomorrow's SAP Big Data Chat with Hortonworks:

    http://www.saphana.com/community/blogs/blog/2013/10/02/join-the-sap-big-data-chat-with-hortonworks

    Learn more about Hadoop and HANA Integration:

    Follow the channel SAP Database and Technology at https://www.brighttalk.com/channel/9727 and watch all webinars for free.

    Check the document below for links to all Big Data webinars:

    http://scn.sap.com/docs/DOC-44661


    Read about SAP Hortonworks Reference Architecture:

    http://hortonworks.com/wp-content/uploads/2013/09/Reference.Architecture.SAP_Hortonworks.v1.1.pdf

    Read about Combining SAP Real-Time Data Platform with Hortonworks Data Platform

    http://hortonworks.com/wp-content/uploads/2013/09/SAP_HortonWorks_GB_24469_en.pdf

    Thank you for reading my blog.


    7 Comments


    Naveen Kumar Oct 15, 2013 8:09 PM

    Thanks for sharing the videos, links and useful article.

    Aron MacDonald Oct 15, 2013 8:09 PM

    Thanks for the mention. Very nice overview and collection of links.

    In addition to what you've mentioned, if you don't have BODS and are prepared to invest in a small amount of custom Java development, then you can also use HADOOP OOZIE to schedule data loads between HANA and HADOOP if required.

    HADOOP Flume can also be used to stream data (such as Twitter) in real time to HANA, if you don't have Sybase ESP. E.g. http://scn.sap.com/community/developer-center/hana/blog/2013/08/07/streaming-real-time-data-to-hadoop-and-hana

    There are lots of interesting integration options to leverage the benefits of the 2 platforms. :-)

    Vivek Singh Bhoj Oct 16, 2013 5:33 AM (in response to Aron MacDonald)

    Thanks a lot for this information. Regards, Vivek

    Praveen Kumar Oct 18, 2013 1:05 PM

    Good, as it's quite helpful for HADOOP and HANA integration. Regards, Praveen Kumar

    Prashanth kumar Nov 12, 2013 11:11 AM

    Nice overview. Thanks for sharing. Regards, Prashanth

    Saravana Siva Dec 30, 2013 4:10 PM

    Nice compilation! Thanks for your efforts. Regards, Siva

    Abhinav verma Jan 25, 2014 8:12 AM

    Thanks a lot for sharing this information. Really good!!!