Emerging Database Technologies and Applications

NURUL ATIQAH (IS089083), AINUL ATIQAH (IS088950), NUR ELLINA (IS088954)

Post on 27-Dec-2015

INTRODUCTION

• Databases are growing in many new directions, presenting new and exciting challenges that promise change across the whole of society, because database systems have had an impact on almost every part of modern life. This ranges from the way organizations operate and make their business decisions to the use of portable devices that rely on databases.

Database Technologies and Applications Definitions

Active databases

• The next generation of DBMSs.

• Applications such as process control, power distribution/generation, workflow control, program trading, battle management, and patient monitoring are not well served by passive DBMSs.

• It combines logic programming technology with database technology. This allows the database itself to react to external events and to dynamically maintain its integrity with respect to the real world.

• Conditions defined on states of the database must be monitored, and actions taken when those conditions hold.

• Active databases support condition monitoring.
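The condition-monitoring idea above can be sketched with an ordinary SQL trigger. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for an active DBMS, and the table names (patient_vitals, alerts), the heart-rate threshold of 120, and the function name are all hypothetical.

```python
import sqlite3


def run_patient_monitor_demo():
    """Show an event-condition-action rule using a SQLite trigger.

    Assumption: SQLite stands in for an active DBMS; the schema and the
    heart-rate threshold are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient_vitals (patient_id INTEGER, heart_rate INTEGER);
        CREATE TABLE alerts (patient_id INTEGER, message TEXT);

        -- Event: a row is inserted into patient_vitals.
        -- Condition: the new heart rate exceeds 120.
        -- Action: the database itself records an alert.
        CREATE TRIGGER high_heart_rate
        AFTER INSERT ON patient_vitals
        WHEN NEW.heart_rate > 120
        BEGIN
            INSERT INTO alerts VALUES (NEW.patient_id, 'heart rate too high');
        END;
        """
    )
    # The application only inserts readings; it never checks the condition.
    conn.execute("INSERT INTO patient_vitals VALUES (1, 80)")
    conn.execute("INSERT INTO patient_vitals VALUES (2, 135)")
    alerts = conn.execute("SELECT patient_id, message FROM alerts").fetchall()
    conn.close()
    return alerts


if __name__ == "__main__":
    print(run_patient_monitor_demo())
```

In a passive DBMS the application would have to poll the table and test the condition itself; here the rule lives in the database and fires on the insert.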

The State of the Art in Active Databases

• HiPAC (High Performance ACtive database system) research project at Xerox

• PROBE for battle management application (Computer Corporation of America)

• Event/Trigger Mechanism (Univ. of Karlsruhe)

• POSTGRES (Stonebraker, UC Berkeley)

• Starburst project at IBM

• Sybase supports simple triggers

• InterBase does not impose most of the restrictions seen in Sybase

• ORACLE v. 7, INGRES, INFORMIX, etc. provide some degree of rule and trigger support

Apache Derby

• Apache Derby is a relational database management system (RDBMS) implemented entirely in Java and based on the Java, JDBC, and SQL standards.

• Features:

- Completely implemented in Java, providing excellent portability.

- Small footprint: about 2.6 megabytes for the base engine and embedded JDBC driver.

- Embedded JDBC driver: embed Derby in any Java-based solution.

- Client-server mode support.

- Easy to install, deploy, and use.

Hadoop and MapReduce

• Apache Hadoop is an open-source software framework for distributed data storage and computing. In other words, it’s excellent for storing large sets of semi-structured data. The data can be stored redundantly, so the failure of one disk doesn’t result in data loss. Hadoop is also very good at distributed computing: processing large sets of data rapidly across multiple machines.

• MapReduce is a programming model for processing large sets of semi-structured data. To see what that means, compare two familiar models. In a relational database, we perform queries using a set-based language, i.e., SQL: we tell the language the result we want and leave it to the system to work out how to produce it. With a more traditional language (C++, Java), we tend to spell out, step by step, how to solve the problem. Those are two different programming models. MapReduce is yet another: the programmer writes a map function that emits key/value pairs and a reduce function that combines all the values that share a key.

• MapReduce and Hadoop are independent of each other but, in practice, work well together – hence we often find them mentioned in the same breath.
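The MapReduce model described above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases using the classic word-count example; it does not use Hadoop or any Hadoop API, and the function names are chosen here for illustration only.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in one process over a list of strings."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data everywhere"]))
```

In a real cluster, the map calls would run in parallel on the machines that hold the data and the shuffle would move pairs across the network; the programmer still writes only the map and reduce functions.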

Advantages and Disadvantages

Advantages:

Improved ease and flexibility of application development:

• Developing and maintaining applications at geographically distributed sites of an organization is facilitated owing to transparency of data distribution and control.

Increased reliability and availability:

• Reliability refers to system live time (uptime), that is, the system is running efficiently most of the time. Availability is the probability that the system is continuously available (usable or accessible) during a time interval.

Improved performance:

• Fragmenting the database keeps data closer to where it is needed most. This reduces data management (access and modification) time significantly.

Easier expansion (scalability):

• New nodes (computers) can be added at any time without changing the entire configuration.
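The fragmentation idea listed under "Improved performance" can be illustrated with a small sketch. It assumes horizontal fragmentation (whole rows assigned to the site that uses them most); the site names, the row layout, and both function names are hypothetical and not taken from any particular distributed DBMS.

```python
from collections import defaultdict


def fragment_by_site(rows, site_key):
    """Split rows into one fragment per site (horizontal fragmentation)."""
    fragments = defaultdict(list)
    for row in rows:
        fragments[row[site_key]].append(row)
    return dict(fragments)


def local_query(fragments, site, predicate):
    """Answer a query at one site by scanning only that site's fragment."""
    return [row for row in fragments.get(site, []) if predicate(row)]


if __name__ == "__main__":
    # Hypothetical customer rows, each tagged with the site that owns it.
    customers = [
        {"id": 1, "site": "north", "balance": 250},
        {"id": 2, "site": "south", "balance": 90},
        {"id": 3, "site": "north", "balance": 40},
    ]
    fragments = fragment_by_site(customers, "site")
    print(local_query(fragments, "north", lambda row: row["balance"] > 100))
```

A query issued at one site touches only the rows stored there, which is the source of the reduced access time; adding a new site amounts to adding a new fragment.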

Disadvantages:

Overcomplicated for storage of simple and/or small data.

Increased security risks:

• The data are stored at third-party data centers, which can be located in foreign jurisdictions subject to different privacy and data integrity laws.

Cost:

• Databases require significant upfront and ongoing financial resources. Developing or customizing database management systems may involve frequent changes to system requirements, which lead to schedule slippages and cost overruns.

Benefit to Society/Industry

• Cloud database technologies have revolutionized the IT landscape for midsize companies by slashing management costs.

• They allow IT teams to do much more work with significantly less effort.

• They benefit midsize companies by allowing rapid provisioning, and they let customers pay for only the resources they use.

• Virtual machine images

- IT teams in midsize companies can either upload their own ready-to-use environment with the database of their choice or use optimized database instances created by the host.

• Database as a service (DaaS)

- DaaS technology allows IT teams to use proven database technologies without having to worry about provisioning hardware or tuning performance.

• IT teams in midsize companies benefit from improved flexibility in their daily tasks.

• IT teams also have the ability to scale resources on the fly, ensuring that project costs don't exceed budgets while also minimizing downtime.

• Evolving the Infrastructure Stack

- The infrastructure stack is evolving largely through the introduction of new technologies that augment traditional database technology, improving response time and adding built-in redundancy.

• Tourism - hotel systems, and the information and booking facilities for local tourist attractions, rely on database systems; the major package tour operators have extensive databases for holiday planning and booking, together with financial systems for payment and invoicing.

• Education - courses, materials, and assessment all rely heavily on database technology in all sectors of education. Increasingly, the linking of database technology with hypermedia delivery systems allows courseware to be kept up to date and delivered to the consumers.

• Government administration - the collection of taxes and the payment of social security benefits depend entirely on database technology.

• Retail - the major retail stores utilise database technology in stock control and PoS (Point of Sale) systems. Modern retailers use advanced data mining techniques to determine trends in sales and consumer preference to optimise stock control, retail performance, customer convenience and profit.

SUCCESS STORY

eBay:

• As a massive online presence, eBay has over 97 million active buyers and sellers and over 200 million items for sale in over 50,000 categories. This translates into 10 TB or more of incoming data per day. When tasked with finding a way to solve real-time problems by crunching predictive models, they turned to Hadoop and built a 500-node Hadoop cluster using Sun servers running Linux. As time went by, a need arose to create a better real-time search engine for the auction site. This is now being built using Hadoop and HBase.

Amazon Web Services Inc:

• Amazon Elastic MapReduce provides a managed, easy-to-use analytics platform built around the powerful Hadoop framework. Focus on your map/reduce queries and take advantage of the broad ecosystem of Hadoop tools, while deploying to a high-scale, secure infrastructure platform.

IBM Corp:

• IBM InfoSphere BigInsights makes it simpler for people to use Hadoop and build big data applications. It enhances this open source technology to withstand the demands of your enterprise, adding administrative, discovery, development, provisioning, and security features, along with best-in-class analytical capabilities from IBM Research. The result is that you get a more developer- and user-friendly solution for complex, large-scale analytics.

JPMC:

• As a financial services company with over 150 PB of online data, 30,000 databases, and 3.5 billion logins to user accounts, JPMC was faced with the task of reducing fraud, managing IT risk, and mining data for customer insights. To do this they turned to Hadoop, which gave them a single platform to store all the data, making it easier to query for insights.

Orbitz:

• Orbitz, the online travel vendor, needed to determine metrics like “how long does it take for a user to download a page?”. When Orbitz developers needed to understand why production systems had issues, they needed a way to mine huge volumes of production log data. The solution they implemented uses a combination of Hadoop and Hive to process weblogs, which are then further processed using scripts written in R (an open source statistical package that supports visualization) to derive useful metrics related to hotel bookings and user ratings. Hadoop ended up complementing their existing data warehouse systems instead of replacing them.

Sears:

• When faced with a need to evaluate the results of various marketing campaigns, among other needs, Sears appears to have entered the Hadoop world in a big way with a 300-node Hadoop cluster storing and processing 2 PB of data. More recently they have begun using Hadoop to set pricing based on variables like the availability of a product in a store, what a competitor would charge for a similar product, and what economic conditions exist in that area. In addition, their big data system allows them to send customized coupons to consumers by location: for instance, if you are in New York and hit by Hurricane Sandy, it is useful to receive Sears coupons for generators, bleach, and other survival tools. In an industry dominated by e-tailers like Amazon, the ability to set and change prices dynamically and to court loyalty-program consumers with customized offers are some of the many ways that brick-and-mortar firms like Sears are trying to stay relevant.

Dell Inc:

• Being able to analyze your growing mountain of data can give you a distinct competitive advantage, but big data can be more than traditional tools can handle. Dell Apache™ Hadoop™ Solutions can help by providing superfast analysis, data mining, and processing.

• In conclusion, the takeaways are that Hadoop complements your existing data warehouse and provides an open source, scalable way to store petabytes of log data, but you still need to build a solution tailored to your specific needs, whether they be sentiment analysis, customer satisfaction metrics, or judging the effectiveness of your marketing campaigns. Hadoop is a tool, but like any good tool it is not a panacea, nor can it be used in isolation.

10 ways companies use Hadoop to do more than serve ads

• Online travel. Cloudera’s Hadoop distribution currently powers about 80 percent of all online travel booked worldwide, including bookings through Orbitz Worldwide.

• Mobile data. Cloudera Hadoop powers “70 percent of all smartphones in the U.S.”

• E-commerce. Hadoop powers more than 10 million online merchants in the United States. One large retailer (eBay) added 3 percent to its net profits after using Hadoop for just 90 days.

• Energy discovery. During a panel at Cloudera’s event, a Chevron representative explained just one of many ways his company uses Hadoop: to sort and process data from ships that troll the ocean collecting seismic data that might signify the presence of oil reserves.

• Energy savings. Opower uses Hadoop to power its service, which suggests ways for consumers to save money on energy bills. Hadoop also helps with certain capabilities, such as accurate, long-term bill forecasting.

• Infrastructure management. More companies (including Etsy) are gathering and analyzing data from their servers, switches and other IT gear.

• Image processing. A startup called Skybox Imaging is using Hadoop to store and process the high-definition images its satellites will regularly capture as they attempt to detect patterns of geographic change. Skybox recently raised $70 million for its efforts.

• Fraud detection. This is used by both financial services organizations and intelligence agencies. One of those users, Zions Bancorporation, explained recently how a move to Hadoop lets it store all the data it can on customer transactions and spot anomalies that might suggest fraudulent behavior.

• IT security. As with infrastructure management, companies also use Hadoop to process machine-generated data that can identify malware and cyber attack patterns.

• Health care. Apixio uses Hadoop to power its service that leverages semantic analysis to provide doctors, nurses and others more-relevant answers to their questions about patients’ health.
