Big Data Opportunities and Challenges


Upload: ahmed-banafa

Post on 26-Jan-2015


"Big Data" originally emerged as a term to describe datasets whose size is beyond the ability of traditional databases to capture, store, manage and analyze. However, the scope of the term has significantly expanded over the years. Big Data not only refers to the data itself but also a set of technologies that capture, store, manage and analyze large and variable collections of data to solve complex problems.

Amid the proliferation of real-time data from sources such as mobile devices, the web, social media, sensors, log files and transactional applications, Big Data has found a host of vertical market applications, ranging from fraud detection to R&D.

The Opportunities

The "Big Data Market: 2014 – 2020 – Opportunities, Challenges, Strategies, Industry Verticals & Forecasts" report presented an in-depth assessment of the Big Data market, with the following key findings:

In 2014 Big Data vendors will pocket nearly $30 Billion from hardware, software and professional services revenues.

Big Data investments are further expected to grow at a CAGR of nearly 17% over the next 6 years, eventually accounting for $76 Billion by the end of 2020.

The market is ripe for acquisitions of pure-play Big Data startups, as competition heats up between IT incumbents.

Nearly every large scale IT vendor maintains a Big Data portfolio.

At present, hardware sales and professional services account for more than 70% of all Big Data investments.

Going forward, software vendors, particularly those in the Big Data analytics segment, are expected to significantly increase their stake in the Big Data market as it matures.
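The growth figures in these findings can be sanity-checked with the standard compound annual growth rate (CAGR) formula. The sketch below is only an illustration: the function names are made up for this example, and the inputs are the rounded figures quoted above ($30 Billion in 2014, $76 Billion in 2020, 6 years), so the results are approximate.

```python
def project(value, rate, years):
    """Compound `value` forward at a constant annual growth `rate`."""
    return value * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Constant annual growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Rounded figures quoted from the report (in billions of dollars).
revenue_2014 = 30.0
revenue_2020 = 76.0
years = 6

# Growing $30B at 17% a year for 6 years lands slightly above $76B,
# which fits the report's wording of "nearly" 17%.
print(f"Projected 2020 market at 17%: ${project(revenue_2014, 0.17, years):.1f}B")

# Working backwards from the two endpoints gives a rate a little under 17%.
print(f"Implied CAGR: {implied_cagr(revenue_2014, revenue_2020, years):.1%}")
```

The two quoted numbers are therefore consistent with each other once rounding is taken into account.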

Technical Challenges

Dr. Kirk Borne, Professor of Astrophysics and Computational Science at George Mason University, lists ten V’s representing big data’s biggest challenges (including Doug Laney’s initial three V’s: Volume, Variety, Velocity). These V-based characterizations represent ten different challenges associated with the main tasks involving big data (capture, cleaning, curation, integration, storage, processing, indexing, search, sharing, transfer, mining, analysis, and visualization):

1. Volume: unstructured data streaming in from social media, along with increasing amounts of sensor and machine-to-machine data being collected.

2. Velocity: data is streaming in at unprecedented speed and must be dealt with in a timely manner.

3. Variety: data today comes in all types of formats, from structured, numeric data in traditional databases to information created by line-of-business applications.

4. Veracity: necessary and sufficient data to test many different hypotheses, vast training samples for rich micro-scale model-building and model validation, micro-grained “truth” about every object in your data collection, thereby empowering “whole-population analytics”.

5. Validity: data quality, governance, master data management (MDM) on massive, diverse, distributed, heterogeneous, “unclean” data collections.

6. Value: the all-important V, characterizing the business value, ROI, and potential of big data to transform your organization from top to bottom (including the bottom line).

7. Variability: dynamic, evolving, spatiotemporal data, time series, seasonal, and any other type of non-static behavior in your data sources, customers, objects of study, etc.

8. Venue: distributed, heterogeneous data from multiple platforms, from different owners’ systems, with different access and formatting requirements, private vs. public cloud.

9. Vocabulary: schema, data models, semantics, ontologies, taxonomies, and other content- and context-based metadata that describe the data’s structure, syntax, content, and provenance.

10. Vagueness: confusion over the meaning of big data (Is it Hadoop? Is it something that we’ve always had? What’s new about it? What are the tools? Which tools should we use? etc.).

Business Challenges

Companies have difficulty identifying the right data and determining how to best use it. Building data-related business cases often means thinking outside of the box and looking for revenue models that are very different from the traditional business.

Companies are struggling to find the right talent capable of both working with new technologies and of interpreting the data to find meaningful business insights.

Data access and connectivity can be an obstacle. A majority of data points are not yet connected today, and companies often do not have the right platforms to aggregate and manage the data across the enterprise.

The technology landscape in the data world is evolving extremely fast. Leveraging data means working with a strong and innovative technology partner that can help create an IT architecture able to adapt efficiently to changes in the landscape.

Leveraging big data often means working across functions like IT, engineering, finance and procurement, and the ownership of data is fragmented across the organization. Addressing these organizational challenges means finding new ways of collaborating across functions and businesses.

Security concerns about data protection are a major obstacle preventing companies from taking full advantage of their data.

Securing Big Data 

As the amount of data being collected continues to grow, more and more companies are building big data repositories to store, aggregate and extract meaning from their data. Big data provides an enormous competitive advantage for corporations, helping businesses tailor their products to consumer needs, identify and minimize corporate inefficiencies, and share data with user groups across the enterprise. With a growth rate of 58 percent in 2013 alone, these technologies and their benefits are here to stay.

Unfortunately, legitimate organizations aren’t the only groups that are going big. Large sets of consolidated data are a tempting target for cyber attackers. Breaching an organization’s big data repository can provide criminal groups with bigger payoffs and more recognition from a single attack. And when attackers set their sights on big data repositories, the effects can be devastating for the affected organizations. Terabytes of data in these repositories may include a company’s crown jewels: customer data, employee data, and trade secrets. The recent data breach at Target is estimated to cost the company upwards of $1.1 billion. Many experts expect the Home Depot breach to cost even more. A breach in a big data repository could be even more damaging at a financial institution or healthcare provider, where the value of the data is extremely high and government regulations come into play.

Securing big data comes with its own unique challenges beyond being a high-value target. It’s not that big data security is fundamentally different from traditional data security. Big data security challenges arise because of incremental differences, not fundamental ones. The differences between big data environments and traditional data environments include:

The data collected, aggregated, and analyzed for big data analysis

The infrastructure used to store and house big data

The technologies applied to analyze structured and unstructured big data

So what can be done to help bring the security of traditional database management to big data? Several organizations describe and define different security controls. The SANS Institute provides a list that contains several controls to address the security challenges presented by big data:

Application Software Security. Use secure versions of open-source software. Big data technologies weren’t originally designed with security in mind. 

Maintenance, Monitoring, and Analysis of Audit Logs. Implement audit logging technologies to understand and monitor big data clusters. Keep in mind that security engineers in the organization need to be tasked with examining and monitoring these files. It’s important to ensure that auditing, maintaining, and analyzing logs are done consistently across the enterprise.

Secure Configurations for Hardware and Software. Build servers based on secure images for all systems in your organization’s big data architecture. Ensure patching is up to date on these machines and that administrative privileges are limited to a small number of users. (A good example is the recent Bash bug, known as Shellshock.)

Account Monitoring and Control. Manage accounts for big data users. Require strong passwords, deactivate inactive accounts, and impose a maximum permitted number of failed log-in attempts to help stop attackers from gaining access to a cluster. It’s important to note that the enemy isn’t always outside of the organization. Monitoring account access can help reduce the probability of a successful compromise from the inside.
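The account-control rule above (a maximum permitted number of failed log-in attempts) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any real product: the class name LoginGuard, the default threshold of 5, and the in-memory storage are all hypothetical. A real cluster would normally rely on its authentication system for this, and would also record each attempt in the audit logs.

```python
from dataclasses import dataclass, field


@dataclass
class LoginGuard:
    """Hypothetical in-memory lockout policy, for illustration only."""
    max_failed_attempts: int = 5                  # assumed threshold
    failures: dict = field(default_factory=dict)  # user -> consecutive failures
    locked: set = field(default_factory=set)      # users currently locked out

    def record_attempt(self, user: str, success: bool) -> bool:
        """Record one log-in attempt; return True only if access is granted."""
        if user in self.locked:
            return False  # locked accounts are rejected even with valid credentials
        if success:
            self.failures.pop(user, None)  # a good log-in resets the counter
            return True
        count = self.failures.get(user, 0) + 1
        self.failures[user] = count
        if count >= self.max_failed_attempts:
            self.locked.add(user)  # threshold reached: lock the account
        return False


guard = LoginGuard(max_failed_attempts=3)
for _ in range(3):
    guard.record_attempt("analyst", success=False)

# The account is now locked, so even a correct password is refused.
print(guard.record_attempt("analyst", success=True))
```

Because the check applies to every account, it limits guessing attacks from inside the organization as well as from outside.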

References

http://data-informed.com/manage-big-datas-big-security-challenges/#sthash.7UcukVRZ.dpuf

http://data-informed.com/wp-content/uploads/2013/06/Cloud-Security-Alliance-10-challenges-chart-650x400.jpg

http://www.marketwatch.com/story/the-big-data-market-2014-2020-opportunities-challenges-strategies-industry-verticals-and-forecasts-2014-07-17

https://www.mapr.com/blog/top-10-big-data-challenges-%E2%80%93-serious-look-10-big-data-v%E2%80%99s#.VCefaRZ0ZBY

http://blogs.wsj.com/experts/2014/03/26/six-challenges-of-big-data/

https://www.reportbuyer.com/product/2164339/The-Big-Data-Market-2014-–-2020---Opportunities-Challenges-Strategies-Industry-Verticals-and-Forecasts.html

http://www.computerweekly.com/feature/Big-data-and-analytics-a-large-challenge-offering-great-opportunities

http://data-informed.com/cloud-computing-experts-detail-big-data-security-and-privacy-risks/#sthash.egT33x4K.dpuf

http://dataddict.files.wordpress.com/2014/01/network-security-tips.jpg?w=585

http://www.baselinemag.com/imagesvr_ce/5049/2013_bsl_DataKnowGaps_09.jpg