CS457/557 Introduction – Chapters 1-2
Relevance of Databases
• DBs are a part of most decisions in an enterprise
  – Traditional DBs: operational
  – Data warehouses: decision support
  – NoSQL DBs: information
Databases
• Databases play a critical role in?
  – Business, medicine, industry, etc. Everything?
• Databases can be?
  – Traditional, XML, object-relational, multimedia, real-time, Web, VERY large
• What databases have you used recently?
Data vs. Databases
• Data
  – Recorded known facts, implicit meaning
• Database (DB)
  – Collection of related data
  – Logically coherent
  – Represents mini-world
  – Designed, built for specific purpose
  – Intended user group
  – Preconceived applications
DBMS
• Database Management System (DBMS)
  – Software
  – Create and maintain a DB
  – Define types of data
  – Store on disk controlled by DBMS
  – Manipulate data
DBMS cont’d
• Why a DBMS?
  – Program-data independence
  – Data abstraction
  – Conceptual representation
  – Meta data
  – Share data
  – Multiple views
  – Transaction processing
  – Higher overhead (Fig. 2.3) and increased complexity
• So why use a DBMS?
  – OPTIMIZATION
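Transaction processing, listed above, can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in DBMS; the `account` table, its rows, and the `transfer` helper are hypothetical and not from the slides. The point is that a group of updates either all take effect or none do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # fits within the balance
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer, the credit to account 2 has already been applied when the debit violates the constraint, and the rollback undoes it.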
Definitions
• Database System (DBS)
  – Data + DBMS
• DBS
  – Schema (meta-data): DB description, schema diagram (Fig. 2.1)
  – Instance (actual data, Fig. 1.2): initially empty
• 3-schema architecture (Fig. 2.2)
  – External: view
  – Conceptual: structure of DB, hides physical
  – Internal: physical storage and access paths
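As a rough sketch of the schema/instance distinction and the three levels, the following uses Python's built-in `sqlite3` module. The mapping is an approximation chosen for illustration: a table stands in for the conceptual level, an index for the internal level, and a view for the external level. The table, index, and view names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure of the data.
conn.execute(
    "CREATE TABLE student (cwid INTEGER PRIMARY KEY, name TEXT, gpa REAL, major TEXT)"
)
# Internal level: an access path; queries never have to mention it.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
# External level: a view exposing only what one user group should see.
conn.execute("CREATE VIEW student_directory AS SELECT name, major FROM student")

# The schema exists before any data does: the instance starts out empty.
schema_objects = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master ORDER BY name")
]
empty_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Adding a row changes the instance, not the schema.
conn.execute("INSERT INTO student VALUES (1001, 'Ada', 3.9, 'CS')")
directory = conn.execute("SELECT * FROM student_directory").fetchall()
print(schema_objects, empty_count, directory)
```

The view hides `cwid` and `gpa` from its users even though both are stored in the underlying table.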
Data Model
• Describes the structure: records, types, relationships, constraints, basic operations
• DBMS based on data model
• Types:
  – High-level (conceptual): ER, UML, OO
  – Low-level (physical): XML
  – Implementation (representational), combines conceptual and physical: relational
  – NoSQL data models: column, key-value, document stores
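To make the differences between these models concrete, here is a small sketch that lays out the same hypothetical student records in relational, column, key-value, and document form using plain Python structures. It only illustrates the shape of the data under each model, not how any particular product stores it.

```python
import json

# Relational: rows that conform to a fixed schema (the column names).
student_columns = ("cwid", "name", "major")
student_rows = [(1001, "Ada", "CS"), (1002, "Bob", "Math")]

# Column store: the values of each column are kept together.
column_store = {
    col: [row[i] for row in student_rows]
    for i, col in enumerate(student_columns)
}

# Key-value store: an opaque value looked up by a single key.
kv_store = {
    f"student:{row[0]}": json.dumps(dict(zip(student_columns, row)))
    for row in student_rows
}

# Document store: a self-describing record that may nest other records.
document = {
    "cwid": 1001,
    "name": "Ada",
    "major": "CS",
    "courses": [{"id": "CS457", "term": "Fall"}],
}

print(column_store["name"])
print(json.loads(kv_store["student:1001"]))
print(document["courses"][0]["id"])
```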
DBMS Languages
• DDL - data definition language
• DML - data manipulation language
  – High-level, nonprocedural
  – Set at a time
  – Interactive or embedded (host language)
• SQL is the most common/popular DB language
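The DDL/DML split and the "set at a time" idea can be sketched with SQL embedded in a host language. Here Python is the host and the built-in `sqlite3` module is the DBMS, both chosen only for illustration; the `student` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines structure.
conn.execute(
    "CREATE TABLE student (cwid INTEGER PRIMARY KEY, name TEXT, gpa REAL, major TEXT)"
)

# DML: manipulates data. The SQL text is embedded in the host program.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1001, "Ada", 3.9, "CS"), (1002, "Bob", 3.1, "CS"), (1003, "Cy", 2.8, "Math")],
)

# Nonprocedural: the query states what is wanted, not how to find it.
honors = conn.execute(
    "SELECT name FROM student WHERE major = 'CS' AND gpa >= 3.5"
).fetchall()

# Set at a time: one statement changes every matching row, with no explicit loop.
updated = conn.execute(
    "UPDATE student SET major = 'Computer Science' WHERE major = 'CS'"
).rowcount
print(honors, updated)
```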
DBMS
• Software to create, query, manipulate data in the database
• Based on a particular data model
• Allows for program-data independence
• Provides language to define, manipulate data
• Contains meta data
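Program-data independence can be sketched as follows, again with Python's `sqlite3` as an assumed stand-in DBMS and a hypothetical `student` table. The application function names only the column it needs, so it keeps working after the table gains a column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (cwid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1001, 'Ada')")


def student_names(conn):
    """An application program that names only the columns it needs."""
    return [row[0] for row in conn.execute("SELECT name FROM student ORDER BY cwid")]


before = student_names(conn)

# The structure changes; the DBMS records the new definition in its catalog.
conn.execute("ALTER TABLE student ADD COLUMN major TEXT")
conn.execute("INSERT INTO student VALUES (1002, 'Bob', 'CS')")

after = student_names(conn)  # the program itself is unchanged
print(before, after)
```

A program that parsed a fixed-width file directly would need to be rewritten when the record layout changed; here the stored meta data absorbs the change.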
Meta Data
• Data about the data
• “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource.” NISO
Meta Data
• Three categories of meta data (books as example):
  – Structural metadata: A way to define how objects are put together, for example, how pages are ordered to form chapters.
  – Administrative metadata: Information to help manage a resource, such as when and how it was created, types, and who has access.
  – Descriptive metadata: A resource for discovery and identification, including elements such as title, abstract, author, and keywords.
Meta Data
• Structural
  – Student (Name, CWID, address, GPA, major)
• Administrative
  – Owner of data?
    • Account#, when created, modified
• Descriptive:
  – Everything but the content: constraints, max/min values?
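A DBMS keeps this kind of meta data in its catalog, where it can be queried like any other data. The sketch below uses Python's `sqlite3` and SQLite's `PRAGMA table_info` and `sqlite_master` catalog table. The column types and the GPA range constraint are assumptions added for illustration; the slide gives only the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "cwid INTEGER PRIMARY KEY, name TEXT NOT NULL, address TEXT, "
    "gpa REAL CHECK (gpa BETWEEN 0.0 AND 4.0), major TEXT)"
)

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

# The stored definition also records constraints such as the GPA range.
ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'student'"
).fetchone()[0]

print(columns)
print(ddl)
```

Note that the table holds no rows at all here: everything printed is data about the data.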
Meta Data
• Metadata associated with mobile phones:
  – Phone number of every caller
  – Time of call
  – Duration of call
  – Serial numbers of phones involved
  – Location of each participant
  – Telephone calling card numbers
Meta Data – According to the Guardian
• Metadata associated with emails:
  – Sender's name, email, and IP address
  – Recipient's name and email address
  – Date, time, and time zone
  – Mail client header formats
  – Unique identifier of email and related emails
  – Mail client login records with IP address
  – Subject of email
Meta Data
• Metadata associated with Facebook:
  – Username and unique identifier
  – User subscriptions
  – User device
  – Activity date, time, and time zone
  – User location
  – Username and profile bio information including:
    • Birthday
    • Hometown
    • Work history
    • Interests
Meta Data
• Metadata associated with web browsers:
  – Activity including pages the user visits and when visited
  – User data and possibly user login details with auto-fill features
  – User IP address, internet service provider, device hardware details, operating system, and browser version
  – Cookies and cached data from websites
Meta Data
• What about medical records?
Additional Characteristics
• Interfaces
• Actors
  – DBA
  – Designers
  – Users
    • Naïve or parametric - same info each time
    • Casual - different info each time
    • Sophisticated - implement own applications using databases
    • Standalone - personal DBs using ready-made packages that store personal data
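The difference between parametric and casual users can be sketched in code. A parametric user runs the same canned query repeatedly with different parameter values, while a casual user writes a different ad hoc query each time. The `account` table and the `check_balance` function below are hypothetical, and `sqlite3` is again an assumed stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)", [(1, "Ada", 100), (2, "Bob", 50)]
)


def check_balance(conn, account_id):
    """Canned transaction: the query is fixed, only the parameter changes."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]


# Parametric use, for example a bank teller pressing the same button all day.
first = check_balance(conn, 1)
missing = check_balance(conn, 99)

# Casual use: an ad hoc question typed into a query interface.
rich = conn.execute("SELECT owner FROM account WHERE balance > 75").fetchall()
print(first, missing, rich)
```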
DB classifications
• Single-user vs. multi-user
• Centralized vs. distributed
• Homogeneous vs. heterogeneous
• Federated DBMS, multidatabase system
DBS Utilities
• Loading – into DB, conversion tools
• Backup – copy on durable mass storage
• DB storage reorganization – of files to improve performance
• Performance monitoring – to reorganize, etc.
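Two of these utilities, loading and backup, can be sketched with Python's standard library. The CSV text and table are hypothetical, and both databases are in-memory only to keep the example self-contained; a real backup would target durable storage. `Connection.backup` requires Python 3.7 or later. Reorganization and monitoring are not shown.

```python
import csv
import io
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE student (cwid INTEGER PRIMARY KEY, name TEXT)")

# Loading: convert an external format (CSV text here) into typed rows.
csv_text = "cwid,name\n1001,Ada\n1002,Bob\n"
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(int(record["cwid"]), record["name"]) for record in reader]
source.executemany("INSERT INTO student VALUES (?, ?)", rows)
source.commit()

# Backup: copy the whole database into another connection.
backup = sqlite3.connect(":memory:")
source.backup(backup)

restored = backup.execute("SELECT cwid, name FROM student ORDER BY cwid").fetchall()
print(restored)
```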
Extending traditional (relational) databases
• Need for more complex databases
• Images, videos, scientific
• Object-oriented databases
• Data mining (decision support systems), spatial
• Data on the web for e-commerce
  – XML
• Unstructured or semi-structured data
• Databases for cloud computing
Application packages
• Software packages work with database backends (>1 database)
• Web enabled
• Examples
  – Enterprise Resource Planning (ERP)
    • Integrate data and processes of organization
    • Production, sales, distribution, marketing, finance, human resources, etc.
  – Customer Relationship Management (CRM)
    • Integrate customer information
    • Marketing and customer support
Information Retrieval IR
• Databases traditionally used for
  – Banking, insurance, retail, finance, manufacturing, payroll
• Information retrieval used for
  – Books, manuscripts, library
• Searching based on keywords
• Document processing
  – Keywords, categorization, ranking documents
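Keyword search with ranking can be sketched with a tiny inverted index. The three documents are made up, and the scoring (a simple count of query-keyword occurrences) is a deliberately naive assumption; real IR systems use more refined weighting.

```python
from collections import Counter, defaultdict

documents = {
    "d1": "database systems store related data",
    "d2": "information retrieval ranks documents by keywords",
    "d3": "a database can index documents and keywords for retrieval",
}

# Document processing: build an inverted index, keyword -> {doc id: count}.
index = defaultdict(Counter)
for doc_id, text in documents.items():
    for word in text.lower().split():
        index[word][doc_id] += 1


def search(query):
    """Rank documents by how many query-keyword occurrences they contain."""
    scores = Counter()
    for word in query.lower().split():
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


print(search("database keywords"))
print(search("xml"))
```

Unlike a database query, which returns exactly the rows that satisfy a condition, the search returns every partially matching document in order of estimated relevance.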
Information Retrieval IR
• Advent of web, IR is exciting again!
  – Web pages have active objects, change dynamically
  – New strategies needed
• Big Data
• NoSQL
DB Management Issues
• This course 457/557
  – Design/Model DBs
    • Weird course: theory + applications
  – Relational: Query DBs, Algebra, Normalization
    • We will use Oracle, MySQL
  – Intro to: Security, performance, transactions, NoSQL
• Grad course 609
  – Redundancy
  – Integrity constraints and concurrency control (transactions)
  – Backup and recovery
  – In depth: performance, NoSQL